RE: facet sort by ranking
Hi,

We have 100 categories, and each category has its own internal ranking. Suppose I search for a product that falls under 30 categories, and we show the top 10 categories in the filter so that users can narrow their results.

Consider a hypothetical example (we don't have real data yet and are still testing Solr features).

Category values and internal ranking:

Cat1 - 1
Cat2 - 2
Cat3 - 3
Cat4 - 4
Cat5 - 5
Cat6 - 6
Cat7 - 7
Cat8 - 8
Cat9 - 9
Cat10 - 10
Cat11 - 11
Cat12 - 12
Cat13 - 13
Cat14 - 14
Cat15 - 15

If I search for a product, it returns these category counts (sorted by count):

Cat2 - 20
Cat3 - 17
Cat4 - 15
Cat1 - 14
Cat7 - 13
Cat8 - 12
Cat9 - 10
Cat15 - 9
Cat13 - 8
Cat10 - 7
Cat11 - 6
Cat12 - 5

Now we want to show only the top 10 values, so we will miss Cat11 and Cat12, because the facet is sorted by count and not by ranking.

We would like the result below:

Cat15
Cat13
Cat12
Cat11
Cat10
Cat9
Cat8
Cat7
Cat4
Cat3
Cat2
Cat1

I hope this conveys what we want. Have a great day :)

Thanks and Regards,
Amit

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: 22 November 2008 22:51
To: solr-user@lucene.apache.org
Subject: Re: facet sort by ranking

On Sat, Nov 22, 2008 at 12:05 PM, Amit <[EMAIL PROTECTED]> wrote:
> Actually we have some ranking associated to field on which we are faceting
> and we want to show only top 10 facet value now which is sort by count but
> we want to sort by it ranking.

I think you're going to have to give some concrete examples of what your documents look like, and what results you want back.

-Yonik
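One possible way to get this ordering is to do it on the client: request all facet values for the field (for example with facet.limit=-1) and re-sort them by the internal ranking before truncating to 10. The following is only a sketch under assumptions: the function name and the `ranking` dictionary are hypothetical, no Solr call is made, and "highest ranking number first" is inferred from the desired list in the message above.

```python
def top_facets_by_ranking(facet_counts, ranking, limit=10):
    """Return up to `limit` (category, count) pairs, highest ranking number first."""
    # Keep only categories that actually matched the search.
    present = [(cat, count) for cat, count in facet_counts.items() if count > 0]
    # Re-sort by the externally maintained ranking instead of by count.
    present.sort(key=lambda pair: ranking[pair[0]], reverse=True)
    return present[:limit]


# Hypothetical ranking table (Cat1 -> 1 ... Cat15 -> 15), as in the example above.
ranking = {f"Cat{i}": i for i in range(1, 16)}

# Facet counts as they would come back from Solr, sorted by count.
facet_counts = {
    "Cat2": 20, "Cat3": 17, "Cat4": 15, "Cat1": 14, "Cat7": 13, "Cat8": 12,
    "Cat9": 10, "Cat15": 9, "Cat13": 8, "Cat10": 7, "Cat11": 6, "Cat12": 5,
}

top = top_facets_by_ranking(facet_counts, ranking)
print([cat for cat, _ in top])
```

With this approach Cat11 and Cat12 are no longer lost, because truncation happens after the re-sort rather than before it.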
Query for Distributed search -
Hi,

I'm looking for some insight on distributed search. Say I have an index distributed across 3 boxes, and the index contains time and text data (a typical log file). Each box holds the index for a different time range: Box 1 for January to April, Box 2 for May to August, and Box 3 for September to December.

Now, if I search for a text string, will the search happen in parallel on all 3 boxes, or sequentially?

Regards,
Sourav
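For context, a distributed request in Solr 1.3 is an ordinary select request that carries a `shards` parameter listing the indexes to query. The sketch below only builds such a URL; the host names box1, box2 and box3, the port, and the function name are assumptions for illustration, and the code says nothing about how Solr schedules the per-shard requests.

```python
from urllib.parse import urlencode


def distributed_select_url(coordinator, shards, query, rows=10):
    """Build a select URL that asks `coordinator` to search every shard listed."""
    params = {
        "q": query,
        "rows": rows,
        # Comma-separated host:port/path entries, without the http:// prefix.
        "shards": ",".join(shards),
    }
    return f"http://{coordinator}/select?{urlencode(params)}"


# Hypothetical hosts, one per time range (Jan-Apr, May-Aug, Sep-Dec).
shards = ["box1:8983/solr", "box2:8983/solr", "box3:8983/solr"]
url = distributed_select_url("box1:8983/solr", shards, "error timeout")
print(url)
```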
Re: Please Help !! Question about Query Phrase Slop (qs) in dismax
If you boost the phrase queries by enough, you could tell when you hit the less relevant documents by the score.

-Yonik

On Mon, Nov 24, 2008 at 12:07 AM, anuvenk <[EMAIL PROTECTED]> wrote:
> Thanks for the response. Well my current ps setting works great for most
> search terms. But say this typical example, north dakota 1031 exchange
> lawyers - we don't have any relevant docs in the index. Solr is returning
> the irrelevant doc, just because it found 'lawyer', exchange, north & dakota
> somewhere. I thought if there is a way to just not return any results if
> they are not within close proximity, it would be great.
Re: Please Help !! Question about Query Phrase Slop (qs) in dismax
Thanks for the response. Well, my current ps setting works great for most search terms. But take this typical example: north dakota 1031 exchange lawyers. We don't have any relevant docs in the index. Solr is returning an irrelevant doc just because it found 'lawyer', 'exchange', 'north' & 'dakota' somewhere. I thought that if there were a way to just not return any results when the terms are not within close proximity, it would be great.

Yonik Seeley wrote:
> Not with dismax. I'm not sure why it's a problem, given that with
> enough boost you should be able to ensure that all of the results with
> a slop less than 5 appear before other results.
> Anyway, if you want to restrict results to those with a slop of 5, use
> the standard query parser with an explicit sloppy phrase query:
>
> "north dakota 1031 exchange lawyers"~5
>
> -Yonik

--
View this message in context: http://www.nabble.com/Please-Help-%21%21-Question-about-Query-Phrase-Slop-%28qs%29-in-dismax-tp20643003p20655014.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Please Help !! Question about Query Phrase Slop (qs) in dismax
On Sun, Nov 23, 2008 at 11:51 PM, anuvenk <[EMAIL PROTECTED]> wrote:
> Please help someone...i've been waiting for an answer for the last couple of
> days & no one seems to be helping out here. I did search the wiki & this
> forum for an answer. But couldn't find an answer. I know if ps is set to 5
> words within 5 words of one another receive a boost in score. But is there a
> way to not return results that have the words in search terms more than 5
> words apart. ?

Not with dismax. I'm not sure why it's a problem, given that with enough boost you should be able to ensure that all of the results with a slop less than 5 appear before other results.

Anyway, if you want to restrict results to those with a slop of 5, use the standard query parser with an explicit sloppy phrase query:

"north dakota 1031 exchange lawyers"~5

-Yonik

> Typical example: north dakota 1031 exchange lawyers
> My first result is absolutely ir-relevant. It returned a north dakota doc
> though but had an occurrence of attorney somewhere & an occurrence of
> exchange (not related to 1031 exchange though). They were not within 5 words
> of one another. My guys have been hammering me reg this relevancy issue.
> Please help someone.
>
> anuvenk wrote:
>>
>> From the solr wiki, it sounded like if qs is set to 5 for example, & if
>> the search term is 'child custody', only docs with 'child' & 'custody'
>> within 5 words of one another would be returned in results. Is this
>> correct? If so, it doesn't seem to be working for me. I see docs with
>> 'child' & 'custody' more than 5 words of one another (excluding stop
>> words) which is resulting in bad user experience as those docs are not so
>> relevant. What more could i do to improve quality in the results?
>>
>
> --
> View this message in context:
> http://www.nabble.com/Please-Help-%21%21-Question-about-Query-Phrase-Slop-%28qs%29-in-dismax-tp20643003p20654906.html
> Sent from the Solr - User mailing list archive at Nabble.com.
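The sloppy phrase query suggested above can be assembled from user input along these lines. This is a minimal sketch: the function name is hypothetical, and the escaping of embedded quotes and backslashes is an added precaution that is not part of the original advice.

```python
def sloppy_phrase(terms, slop):
    """Build a standard-query-parser sloppy phrase such as "a b c"~5."""
    if slop < 0:
        raise ValueError("slop must be non-negative")
    # Escape backslashes first, then quotes, so the phrase stays well-formed.
    phrase = " ".join(terms).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{phrase}"~{slop}'


query = sloppy_phrase("north dakota 1031 exchange lawyers".split(), 5)
print(query)
```

The resulting string would be sent as the q parameter to a handler that uses the standard query parser rather than dismax.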
Re: Please Help !! Question about Query Phrase Slop (qs) in dismax
Please help, someone... I've been waiting for an answer for the last couple of days and no one seems to be helping out here. I did search the wiki and this forum but couldn't find an answer. I know that if ps is set to 5, words within 5 words of one another receive a boost in score. But is there a way to not return results whose search terms are more than 5 words apart?

Typical example: north dakota 1031 exchange lawyers

My first result is absolutely irrelevant. It returned a north dakota doc, but it only had an occurrence of attorney somewhere and an occurrence of exchange (not related to 1031 exchange though). They were not within 5 words of one another. My guys have been hammering me regarding this relevancy issue. Please help, someone.

anuvenk wrote:
>
> From the solr wiki, it sounded like if qs is set to 5 for example, & if
> the search term is 'child custody', only docs with 'child' & 'custody'
> within 5 words of one another would be returned in results. Is this
> correct? If so, it doesn't seem to be working for me. I see docs with
> 'child' & 'custody' more than 5 words of one another (excluding stop
> words) which is resulting in bad user experience as those docs are not so
> relevant. What more could i do to improve quality in the results?
>

--
View this message in context: http://www.nabble.com/Please-Help-%21%21-Question-about-Query-Phrase-Slop-%28qs%29-in-dismax-tp20643003p20654906.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Newbie Question - getting search results from dataimport request handler
On Mon, Nov 24, 2008 at 7:25 AM, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
> : > Logging an error and returning successfully (without adding any docs) is
> : > still inconsistent with the way all other RequestHandlers work: fail the
> : > request.
> : >
> : > I know DIH isn't a typical RequestHandler, but some things (like failing
> : > on failure) seem like they should be a given.
> : SOLR-842 .
> : DIH is an ETL tool pretending to be a RequestHandler. Originally it
> : was built to run outside of Solr using SolrJ. For better integration
> : and ease of use we changed it later.
> :
> : SOLR-853 aims to achieve the original goal
> :
> : The goal of DIH is to become a full featured ETL tool.
>
> Understood ... but shouldn't ETL Tools "fail on failure" ?
>
> I mean forget Solr for a minute: If i've got a standalone ETL Tool that
> runs as a daemon, and on startup it logs some error messages because i've
> got bad configs (and it can tell the fields i've listed for my
> 'target' system don't exist there) should it report "success" everytime i
> push data to it?
>
> Based on this thread, that's what it sounds like DIH is doing right now in
> situations like this.
>
> If nothing else, we could give DIH a way to check the global
> value from solrconfig.xml and make its
> decision that way
>
> -Hoss

We considered these. The severity of errors is very much specific to the source of data. It is very unlikely that a DB source throws up errors. With XML data sources, say one or two out of x URLs are wrong: would the user wish to ignore them, or abort the entire import? So we decided to give more options, and the implementation is left to the EntityProcessor. Moreover, the default is set to onError=abort.

--
--Noble Paul
Re: Using Solr for indexing emails
On Sun, 23 Nov 2008 16:02:16 +0200
Timo Sirainen <[EMAIL PROTECTED]> wrote:

> Hi,

Hi Timo,

> [...]
> The main problem is that before doing the search, I first have to check
> if there are any unindexed messages and then add them to Solr. This is
> done using a query like:
> - fl=uid
> - rows=1
> - sort=uid desc
> - q=uidv: box: user:

So, if I understand correctly, the process is:

1. user sends search query Q to search interface
2. interface checks highest indexed uidv in SOLR
3. checks in IMAP store for mailbox if there are any objects ('emails') newer than uidv from 2.
4. anything found in 3. is processed, submitted to SOLR, committed.
5. interface submits search query Q to index, gets results
6. results are presented / returned to user

It strikes me that this may work OK in some situations but may not scale. I would decouple the { find new documents / submit / commit } process from the { search / presentation } layer, ESPECIALLY if you plan to have several mailboxes in play now.

> So it returns the highest IMAP UID field (which is an always-ascending
> integer) for the given mailbox (you can ignore the uidvalidity). I can
> then add all messages with higher UIDs to Solr before doing the actual
> search.
>
> When searching multiple mailboxes the above query would have to be sent
> to every mailbox separately.

Hmm... I'm not sure what you mean by "query would have to be sent to every MAILBOX"...

> That really doesn't seem like the best
> solution, especially when there are a lot of mailboxes. But I don't
> think Solr has a way to return "highest uid field for each
> box:"?

Hmm... maybe you can use facets on 'box'? Though you'd still have to query for each box, I think...

> Is that above query even efficient for a single mailbox?

I don't think so.

> I did consider
> using separate documents for storing the highest UID for each mailbox,
> but that causes annoying desynchronization possibilities. Especially
> because currently I can just keep sending documents to Solr without
> locking and let it drop duplicates automatically (should be rare). With
> per-mailbox highest-uid documents I can't really see a way to do this
> without locking or allowing duplicate fields to be added and later some
> garbage collection deleting all but the one highest value (annoyingly
> complex).

I have a feeling the issues arise from serialising the whole process (as I described above). It makes more sense (to me) to implement something similar to DIH, where you load data as needed (even a 'delta query', which would only return new data). I am not sure whether you could use DIH (RSS feed from IMAP store?).

> I could of course also keep track of what's indexed on Dovecot's side,
> but that could also lead to desynchronization issues and I'd like to
> avoid them.
>
> I guess the ideal solution would be if it was somehow possible to create
> a SQL-like trigger that updates the per-mailbox highest-uid document
> whenever adding a new document with a higher UID value.

I am not sure how much effort you want to put into this, but I would think that writing a lean app that periodically (for a period that makes sense for your hardware and users' expectations: 5 minutes? 10? 1?) crawls the IMAP stores for UIDs, processes them and submits to SOLR, and keeps its own state (dbm or sqlite) may be a more flexible approach. Or, if Dovecot supports this, a 'plugin / hook' that sends a message to your indexing app every time a new document is created.

I am interested to hear what you decide to go with, and why.

cheers,
B
_
{Beto|Norberto|Numard} Meijome

"All parts should go together without forcing. You must remember that the parts you are reassembling were disassembled by you. Therefore, if you can't get them together again, there must be a reason. By all means, do not use hammer."
   IBM maintenance manual, 1975

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
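The "lean app that keeps its own state" idea could look roughly like the sketch below, which tracks the highest indexed UID per mailbox in sqlite. Everything here is an assumption for illustration: the table and function names are hypothetical, and fetching from the IMAP store and posting to Solr are deliberately left out.

```python
import sqlite3


def open_state(path=":memory:"):
    """Open (or create) the indexer's private state store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS indexed ("
        "box TEXT PRIMARY KEY, uid INTEGER NOT NULL)"
    )
    return conn


def highest_uid(conn, box):
    """Highest UID already indexed for `box`, or 0 if nothing was indexed yet."""
    row = conn.execute("SELECT uid FROM indexed WHERE box = ?", (box,)).fetchone()
    return row[0] if row else 0


def record_indexed(conn, box, uid):
    """Remember `uid` for `box`; the stored value never moves backwards."""
    with conn:  # one transaction: both statements commit together
        conn.execute(
            "INSERT OR IGNORE INTO indexed (box, uid) VALUES (?, ?)", (box, uid)
        )
        conn.execute(
            "UPDATE indexed SET uid = ? WHERE box = ? AND uid < ?", (uid, box, uid)
        )


def unindexed(conn, box, mailbox_uids):
    """UIDs present in the mailbox that are newer than the recorded high-water mark."""
    seen = highest_uid(conn, box)
    return sorted(u for u in mailbox_uids if u > seen)


conn = open_state()
record_indexed(conn, "INBOX", 41)
record_indexed(conn, "INBOX", 17)  # an out-of-order report does not lower the mark
pending = unindexed(conn, "INBOX", [40, 41, 42, 43])
print(pending)
```

Keeping the high-water mark outside Solr avoids the per-mailbox "highest uid" query at search time, at the cost of the desynchronization risk Timo mentions, so a periodic reconciliation against the index may still be worth considering.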
Re: Wait Flush, Wait Searcher and commit Scenarios
On Tue, Nov 18, 2008 at 10:55 PM, Mark Miller <[EMAIL PROTECTED]> wrote:
> Does waitFlush do anything now? I only see it being set if eclipse is not
> missing a reference...

Not currently. The idea was that if waitFlush==false, the call would be totally asynchronous and return immediately. If waitFlush==true, the call would return only after everything was flushed to stable storage (which is always the case now).

-Yonik

p.s. late replies since I'm getting back from a week of travel.
RE: [VOTE] Community Logo Preferences
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png
https://issues.apache.org/jira/secure/attachment/12394353/solr.s5.jpg
https://issues.apache.org/jira/secure/attachment/12394265/apache_solr_b_blue.jpg
https://issues.apache.org/jira/secure/attachment/12394167/solrlogo.jpg
https://issues.apache.org/jira/secure/attachment/12394376/solr_sp.png

- Vinu

-Original Message-
From: Ryan McKinley [mailto:[EMAIL PROTECTED]
Sent: Sunday, November 23, 2008 10:30 PM
To: solr-user@lucene.apache.org
Subject: [VOTE] Community Logo Preferences

Please submit your preferences for the solr logo.

For full voting details, see:
http://wiki.apache.org/solr/LogoContest#Voting

The eligible logos are:
http://people.apache.org/~ryan/solr-logo-options.html

Any and all members of the Solr community are encouraged to reply to this thread and list (up to) 5 ranked choices by listing the Jira attachment URLs. Votes will be assigned a point value based on rank. For each vote, 1st choice has a point value of 5, 5th place has a point value of 1, and all others follow a similar pattern.

https://issues.apache.org/jira/secure/attachment/12345/yourfrstchoice.jpg
https://issues.apache.org/jira/secure/attachment/34567/yoursecondchoice.jpg
...

This poll will be open until Wednesday November 26th, 2008 @ 11:59PM GMT

When the poll is complete, the solr committers will tally the community preferences and take a final vote on the logo.

A big thanks to everyone who submitted possible logos -- it's great to see so many good options.
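The point scheme described in the call for votes (5 points for a 1st choice down to 1 point for a 5th) can be sketched as follows. This is an illustration only, not the committers' actual tally: the function name is hypothetical, ballots are shortened to file names, and only two of the ballots from this thread are included.

```python
from collections import Counter


def tally(ballots, max_choices=5):
    """Sum rank-based points: 1st choice = max_choices points, last = 1 point."""
    scores = Counter()
    for ballot in ballots:
        # Choices beyond max_choices are ignored.
        for rank, logo in enumerate(ballot[:max_choices]):
            scores[logo] += max_choices - rank
    return scores


# Two ballots from this thread, in the order they were listed.
ballots = [
    ["solr2_maho_impression.png", "solr.s5.jpg", "apache_solr_b_blue.jpg",
     "solrlogo.jpg", "solr_sp.png"],
    ["solr2_maho_impression.png", "apache_solr_b_red.jpg"],
]

scores = tally(ballots)
for logo, points in scores.most_common():
    print(points, logo)
```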
Re: [VOTE] Community Logo Preferences
https://issues.apache.org/jira/secure/attachment/12394366/solr3_maho.png
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png
https://issues.apache.org/jira/secure/attachment/12392306/apache_solr_sun.png
https://issues.apache.org/jira/secure/attachment/12394267/apache_solr_c_blue.jpg

Good work to all the people who contributed.

-Nick

On Mon, Nov 24, 2008 at 3:06 PM, Norberto Meijome <[EMAIL PROTECTED]> wrote:
> On Sun, 23 Nov 2008 11:59:50 -0500
> Ryan McKinley <[EMAIL PROTECTED]> wrote:
>
>> Please submit your preferences for the solr logo.
>
> https://issues.apache.org/jira/secure/attachment/12394267/apache_solr_c_blue.jpg
> https://issues.apache.org/jira/secure/attachment/12394263/apache_solr_a_blue.jpg
> https://issues.apache.org/jira/secure/attachment/12394070/sslogo-solr-finder2.0.png
> https://issues.apache.org/jira/secure/attachment/12394376/solr_sp.png
> https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_a_red.jpg
>
> thanks!!
> B
Re: How can i protect the SOLR Cores?
: 1) modify web.xml (part of the sources of solr.war, which you'll have to
: rebuild) to define the authentication constraints you want.

For many servlet containers, this isn't necessary. Jetty, for example, also lets you define security realms in jetty.xml (there's an example of this commented out in the example jetty.xml).

-Hoss
Re: [VOTE] Community Logo Preferences
On Sun, 23 Nov 2008 11:59:50 -0500
Ryan McKinley <[EMAIL PROTECTED]> wrote:

> Please submit your preferences for the solr logo.

https://issues.apache.org/jira/secure/attachment/12394267/apache_solr_c_blue.jpg
https://issues.apache.org/jira/secure/attachment/12394263/apache_solr_a_blue.jpg
https://issues.apache.org/jira/secure/attachment/12394070/sslogo-solr-finder2.0.png
https://issues.apache.org/jira/secure/attachment/12394376/solr_sp.png
https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_a_red.jpg

thanks!!
B
_
{Beto|Norberto|Numard} Meijome

"Tell a person you're the Metatron and they stare at you blankly. Mention something out of a Charleton Heston movie and suddenly everyone's a Theology scholar!"
   Dogma

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
Re: WordDelimeterFilter and its Factory: access to charTypeTable
: I was wondering if it is possible to access and modify the charTypeTable
: of the WordDelimiterFilter.

FWIW: WordDelimiterFilter has a static, package-protected defaultWordDelimTable, but there is no need to modify it. You can pass your own charTypeTable directly to the WordDelimiterFilter constructor. This might mean writing your own Factory, but you don't need to muck with the guts of WDF itself.

-Hoss
Re: not string or text fields and shards
On Thu, Nov 20, 2008 at 7:41 AM, Marc Sturlese <[EMAIL PROTECTED]> wrote:
> I have started working with an index divided in 3 shards. When I did a
> distributed search I got an error with the fields that were not string or
> text. I read that the error was due to BinaryResponseWriter and not
> string/text empty fields.

I think it's more the case that if you have an invalid field value, it could blow up at different points in different code paths. The root cause is still an invalid value in the field.

-Yonik
Re: Newbie Question - getting search results from dataimport request handler
: > Logging an error and returning successfully (without adding any docs) is
: > still inconsistent with the way all other RequestHandlers work: fail the
: > request.
: >
: > I know DIH isn't a typical RequestHandler, but some things (like failing
: > on failure) seem like they should be a given.
: SOLR-842 .
: DIH is an ETL tool pretending to be a RequestHandler. Originally it
: was built to run outside of Solr using SolrJ. For better integration
: and ease of use we changed it later.
:
: SOLR-853 aims to achieve the original goal
:
: The goal of DIH is to become a full featured ETL tool.

Understood ... but shouldn't ETL tools "fail on failure"?

I mean, forget Solr for a minute: if I've got a standalone ETL tool that runs as a daemon, and on startup it logs some error messages because I've got bad configs (and it can tell the fields I've listed for my 'target' system don't exist there), should it report "success" every time I push data to it?

Based on this thread, that's what it sounds like DIH is doing right now in situations like this.

If nothing else, we could give DIH a way to check the global value from solrconfig.xml and make its decision that way.

-Hoss
RE: Updating schema.xml without deleting index?
: of myfield as the same result. I wish there was an option to just
: completely reindex all data..i suppose optimize may do that a little
: bit?

"optimize" is just a low-level Lucene call to purge all deleted docs and merge all index segments into a single segment.

And there is an option to reindex all data: take whatever you used to index the data the first time, and do it again. :)

Seriously though, if you use something like DataImportHandler this is fairly easy. If you don't use something like DIH, it's a matter of designing whatever system you do use so that it's easy to reindex later as needed (unless you're certain that your schema is perfect and never needs to change).

The way you solved your use case (exclude things that don't have a value) is exactly how I go about dealing with situations like this routinely.

-Hoss
Re: [VOTE] Community Logo Preferences
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png
https://issues.apache.org/jira/secure/attachment/12394475/solr2_maho-vote.png
https://issues.apache.org/jira/secure/attachment/12394268/apache_solr_c_red.jpg

On Sun, Nov 23, 2008 at 10:59 AM, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> Please submit your preferences for the solr logo.
>
> For full voting details, see:
> http://wiki.apache.org/solr/LogoContest#Voting
>
> The eligible logos are:
> http://people.apache.org/~ryan/solr-logo-options.html
>
> Any and all members of the Solr community are encouraged to reply to this
> thread and list (up to) 5 ranked choices by listing the Jira attachment
> URLs. Votes will be assigned a point value based on rank. For each vote, 1st
> choice has a point value of 5, 5th place has a point value of 1, and all
> others follow a similar pattern.
>
> https://issues.apache.org/jira/secure/attachment/12345/yourfrstchoice.jpg
> https://issues.apache.org/jira/secure/attachment/34567/yoursecondchoice.jpg
> ...
>
> This poll will be open until Wednesday November 26th, 2008 @ 11:59PM GMT
>
> When the poll is complete, the solr committers will tally the community
> preferences and take a final vote on the logo.
>
> A big thanks to everyone who submitted possible logos -- it's great to see
> so many good options.

--
http://fak3r.com dim high beams for oncoming traffic
http://lefttochance.com know your rights, don't lose them
Re: filtering on blank OR specific range
: I'm having difficulty filtering my documents when a field is either
: blank or set to a specific value. I would have thought this would work
:
: fq=-Type:[* TO *] OR Type:blue

Rule #1: don't try to mix AND/OR syntax with +/- syntax ... it never works the way you want. "a OR b" is just syntactic sugar for "a b" ... "-a OR b" is equivalent to "-a b" ... if you use debugQuery=true and look at the parsed_filter_queries you'll see that your fq is being parsed as...

  -Type:[* TO *] Type:blue

...looking at it that way, does it make sense why it doesn't match any documents? There is only one "positive" clause, which is that Type == blue. But then you are excluding any docs where Type has a value, so you get the empty set.

You could have a special "Type_empty" boolean field and use...

  fq = Type_empty:true Type:blue

...or you can play tricks with the syntax, and do something like this...

  fq = (*:* -Type:[* TO *]) Type:blue

-Hoss
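The second workaround above can be generated for any field along these lines. It is a minimal sketch: the function name is hypothetical, the value is not escaped, and like the original it relies on the two top-level clauses being optional (the default OR operator).

```python
from urllib.parse import urlencode


def blank_or_value_fq(field, value):
    """Filter matching docs where `field` is empty OR equals `value`."""
    # (*:* -field:[* TO *]) = "all docs, minus those that have any value in field"
    return f"(*:* -{field}:[* TO *]) {field}:{value}"


fq = blank_or_value_fq("Type", "blue")
params = urlencode({"q": "*:*", "fq": fq})
print(fq)
print(params)
```

The purely negative clause is wrapped together with *:* so that it has a positive set of documents to subtract from, which is exactly what the bare -Type:[* TO *] clause was missing.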
Re: [VOTE] Community Logo Preferences
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png
https://issues.apache.org/jira/secure/attachment/12394266/apache_solr_b_red.jpg
Re: [VOTE] Community Logo Preferences
https://issues.apache.org/jira/secure/attachment/12394267/apache_solr_c_blue.jpg
https://issues.apache.org/jira/secure/attachment/12394268/apache_solr_c_red.jpg
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png
https://issues.apache.org/jira/secure/attachment/12394366/solr3_maho.png
https://issues.apache.org/jira/secure/attachment/12393936/logo_remake.jpg
Re: [VOTE] Community Logo Preferences
https://issues.apache.org/jira/secure/attachment/12394218/solr-solid.png
https://issues.apache.org/jira/secure/attachment/12394376/solr_sp.png
https://issues.apache.org/jira/secure/attachment/12393951/sslogo-solr-classic.png
https://issues.apache.org/jira/secure/attachment/12391946/apache_solr_burning.png
https://issues.apache.org/jira/secure/attachment/12392306/apache_solr_sun.png

- Mark
Re: Pagination with Solr
OK! Thanks, ryguasu, for your reply. Now that I think of it, there is indeed a setStart and a setRows. I'll try those and hopefully finish my project. Thanks a million =)

--
View this message in context: http://www.nabble.com/Pagination-with-Solr-tp13847908p20650529.html
Sent from the Solr - User mailing list archive at Nabble.com.
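For readers following along, the arithmetic behind setStart and setRows is just an offset and a page size. A small sketch, with a hypothetical helper name and 1-based page numbers as an assumption:

```python
def page_params(page, rows=10):
    """Translate a 1-based page number into Solr's start/rows parameters."""
    if page < 1 or rows < 1:
        raise ValueError("page and rows must both be at least 1")
    # start is the zero-based offset of the first document on the page.
    return {"start": (page - 1) * rows, "rows": rows}


for page in (1, 2, 3):
    print(page, page_params(page))
```

In SolrJ the same two values would be passed to setStart and setRows on the query object.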
Re: QueryElevationComponent
On Nov 23, 2008, at 3:06 PM, Paolo Ruscitti wrote:
> Thanks Ryan for your answer.
>
>> The only thing that may be weird is that if your id field is named "myid",
>> your elevate.xml file still refers to "id" as the unique key. Is that what
>> you are referring to?
>
> yes, my id field is named "myid", but elevate.xml expects its name is "id".
> Please find below more info:
> I'm using the very last revision (720030)
> I also tried both

As Ryan said, that is incorrect - it must be id="..." regardless of what your uniqueKey field is. Remove "myid:" from that value and you should be in good shape.

Granted, it is confusing. But what's the alternative? Maybe calling every attribute that needs to refer to a uniqueKey literally "uniqueKey"? I don't think we want to have attributes changing their name based on the uniqueKey field name.

Erik
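To make the point concrete, here is a small sketch that generates an elevate.xml body in which every doc element uses the literal attribute name id, whatever the schema's uniqueKey field is called. The function name and the document ids doc-1 and doc-2 are made up for illustration, and the query text "cars" is taken from Paolo's test URL.

```python
import xml.etree.ElementTree as ET


def build_elevate_xml(query_text, doc_ids):
    """Build an elevate.xml body; the attribute is always named 'id'."""
    root = ET.Element("elevate")
    query = ET.SubElement(root, "query", {"text": query_text})
    for doc_id in doc_ids:
        # The value is the document's uniqueKey value, with no field-name prefix.
        ET.SubElement(query, "doc", {"id": doc_id})
    return ET.tostring(root, encoding="unicode")


xml_text = build_elevate_xml("cars", ["doc-1", "doc-2"])
print(xml_text)
```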
Re: QueryElevationComponent
Thanks Ryan for your answer.

> The only thing that may be weird is that if you id field is named "myid",
> your elevate.xml file still refers to "id" as the unique key. Is that what
> you are refering to?

yes, my id field is named "myid", but elevate.xml expects its name is "id".

Please find below more info:

I'm using the very last revision (720030)

I also tried both and

In the former case I've got a tomcat error:

HTTP Status 500 - Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change: false in solr.xml
-
org.apache.solr.common.SolrException: Error initializing QueryElevationComponent.
  at org.apache.solr.handler.component.QueryElevationComponent.inform(QueryElevationComponent.java:200)
  at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:319)
  at org.apache.solr.core.SolrCore.(SolrCore.java:563)
  at ...

In the latter case solr works but the QueryElevation does not.

The query I'm using is:
http://localhost:8080/solr/post1/select/?q=cars&version=2.2&start=0&rows=10&indent=on&enableElevation=true

thanks
Paolo

On Sun, Nov 23, 2008 at 12:29 AM, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> hymm -- that *should* not be the case. The id field in
> QueryElevationComponent uses the globally defined field:
>
>    SchemaField sf = core.getSchema().getUniqueKeyField();
>    ...
>    idField = sf.getName().intern();
>
> The only thing that may be weird is that if you id field is named "myid",
> your elevate.xml file still refers to "id" as the unique key. Is that what
> you are refering to?
>
> I have not tested this, so it may very well be broken.
>
> ryan
>
> On Nov 22, 2008, at 5:31 PM, Paolo Ruscitti wrote:
>
>> I have a question about QueryElevationComponent.
>>
>> I'm trying to use it but it seems it works properly if, and only if, the
>> id field name in definition is '*id*'.
>>
>> so if I have *myid*, it does not work.
>>
>> Could you please tell me what I'm doing wrong?
>> thaks a lot
>>
>> Paolo
>>
>> - this is my elevate.xml
>>
>> - I added at the tail of solrconfig.xml file
>> ...
>> string
>> elevate.xml
>> startup="lazy">
>> explicit
>> elevator
>>
>> - in my schema I have
>> required="true" />
>> ...
>> myid
Re: [VOTE] Community Logo Preferences
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png
https://issues.apache.org/jira/secure/attachment/12394366/solr3_maho.png
https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_a_red.jpg
https://issues.apache.org/jira/secure/attachment/12394266/apache_solr_b_red.jpg
https://issues.apache.org/jira/secure/attachment/12394218/solr-solid.png
Compiling Solr 1.3.0 + KStem
I was hoping to try using KStem with Solr 1.3.0, but am having trouble getting it to compile. With a fresh Solr 1.3.0 that will build successfully, I unzipped the KStemSolr.zip within the apache-solr-1.3.0 directory, but when I then try to build (using Ant 1.7.1 and Sun HotSpot JDK 1.6.0 update 10), I get:

[EMAIL PROTECTED]:/usr/local/build/apache-solr-1.3.0$ ant compile
Buildfile: build.xml

init-forrest-entities:
    [mkdir] Created dir: /usr/local/build/apache-solr-1.3.0/build
    [mkdir] Created dir: /usr/local/build/apache-solr-1.3.0/build/web

compile-common:
    [mkdir] Created dir: /usr/local/build/apache-solr-1.3.0/build/common
    [javac] Compiling 36 source files to /usr/local/build/apache-solr-1.3.0/build/common
    [javac] Note: /usr/local/build/apache-solr-1.3.0/src/java/org/apache/solr/common/util/FastInputStream.java uses or overrides a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.

compile:
    [mkdir] Created dir: /usr/local/build/apache-solr-1.3.0/build/core
    [javac] Compiling 350 source files to /usr/local/build/apache-solr-1.3.0/build/core
    [javac] /usr/local/build/apache-solr-1.3.0/src/java/org/apache/solr/analysis/KStemFilterFactory.java:63: cannot find symbol
    [javac] symbol  : method init(org.apache.solr.core.SolrConfig,java.util.Map)
    [javac] location: class org.apache.solr.analysis.BaseTokenFilterFactory
    [javac]     super.init(solrConfig, args);
    [javac]          ^
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.
    [javac] 1 error

BUILD FAILED
/usr/local/build/apache-solr-1.3.0/build.xml:125: The following error occurred while executing this line:
/usr/local/build/apache-solr-1.3.0/common-build.xml:149: Compile failed; see the compiler error output for details.

I've also tried to build the KStem filter factory using the KStem.jar via the instructions on the Wiki, but I am not sure I'm doing the right things in steps 3 and 5:

3. Modify the package name on the source files to match your install

Does that mean to change package org.apache.lucene.analysis; to org.apache.solr.analysis?

5. Build the jar file and drop that into your Solr /lib directory.

Nothing I've tried here gives me any .class files, just more "cannot find symbol" errors.

Any suggestions would be much appreciated. I am definitely a novice in building Java apps, so I could be missing something very simple here.

Thanks,
-Chris
Re: [VOTE] Community Logo Preferences
https://issues.apache.org/jira/secure/attachment/12394267/apache_solr_c_blue.jpg
https://issues.apache.org/jira/secure/attachment/12394265/apache_solr_b_blue.jpg
https://issues.apache.org/jira/secure/attachment/12394263/apache_solr_a_blue.jpg

By the way, 2 logos are missing:
https://issues.apache.org/jira/secure/attachment/12394270/apache_solr_d_blue.jpg
and
https://issues.apache.org/jira/secure/attachment/12394271/apache_solr_d_red.jpg

Ryan McKinley wrote on 11/23/2008 05:59 PM:
> Please submit your preferences for the solr logo.
>
> For full voting details, see:
> http://wiki.apache.org/solr/LogoContest#Voting
>
> The eligible logos are:
> http://people.apache.org/~ryan/solr-logo-options.html
>
> Any and all members of the Solr community are encouraged to reply to this thread and list (up to) 5 ranked choices by listing the Jira attachment URLs. Votes will be assigned a point value based on rank. For each vote, 1st choice has a point value of 5, 5th place has a point value of 1, and all others follow a similar pattern.
>
> https://issues.apache.org/jira/secure/attachment/12345/yourfrstchoice.jpg
> https://issues.apache.org/jira/secure/attachment/34567/yoursecondchoice.jpg
> ...
>
> This poll will be open until Wednesday November 26th, 2008 @ 11:59PM GMT
>
> When the poll is complete, the solr committers will tally the community preferences and take a final vote on the logo.
>
> A big thanks to everyone who submitted possible logos -- it's great to see so many good options.
[VOTE] Community Logo Preferences
Please submit your preferences for the solr logo.

For full voting details, see:
http://wiki.apache.org/solr/LogoContest#Voting

The eligible logos are:
http://people.apache.org/~ryan/solr-logo-options.html

Any and all members of the Solr community are encouraged to reply to this thread and list (up to) 5 ranked choices by listing the Jira attachment URLs. Votes will be assigned a point value based on rank. For each vote, 1st choice has a point value of 5, 5th place has a point value of 1, and all others follow a similar pattern.

https://issues.apache.org/jira/secure/attachment/12345/yourfrstchoice.jpg
https://issues.apache.org/jira/secure/attachment/34567/yoursecondchoice.jpg
...

This poll will be open until Wednesday November 26th, 2008 @ 11:59PM GMT

When the poll is complete, the solr committers will tally the community preferences and take a final vote on the logo.

A big thanks to everyone who submitted possible logos -- it's great to see so many good options.
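
For anyone wanting to check the arithmetic, the point scheme described above could be tallied roughly as follows. This is only a sketch: it assumes "a similar pattern" means 2nd = 4, 3rd = 3 and 4th = 2 points, and the function name `tally` and the ballot labels are hypothetical placeholders, not real votes or an official counting script.

```python
from collections import Counter

def tally(ballots):
    """Sum points per logo: rank 1 earns 5 points, down to rank 5 earning 1."""
    scores = Counter()
    for ballot in ballots:
        # Only the first five ranked choices on a ballot are counted.
        for rank, logo in enumerate(ballot[:5], start=1):
            scores[logo] += 6 - rank  # assumed linear scale: 5, 4, 3, 2, 1
    return scores

# Hypothetical ballots; each list is one voter's choices in rank order.
ballots = [
    ["logo_a", "logo_b", "logo_c"],
    ["logo_b", "logo_a"],
]

for logo, points in tally(ballots).most_common():
    print(logo, points)
```

The sketch does not guard against the same logo appearing twice on one ballot; a real tally would probably want to reject or de-duplicate such ballots.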
Re: Question about Query Phrase Slop (qs) in dismax
Could somebody please help clear up this doubt? What more could I do with the dismax handler so that documents in which 'word1', 'word2', 'word3', etc. from a search phrase are not within 5 words of one another do not come up in the results?

anuvenk wrote:
> From the Solr wiki, it sounded like if qs is set to 5, for example, and
> the search term is 'child custody', only docs with 'child' and 'custody'
> within 5 words of one another would be returned in the results. Is this
> correct? If so, it doesn't seem to be working for me. I see docs with
> 'child' and 'custody' more than 5 words apart (excluding stop words),
> which results in a bad user experience as those docs are not so
> relevant. What more could I do to improve the quality of the results?

--
View this message in context: http://www.nabble.com/Question-about-Query-Phrase-Slop-%28qs%29-in-dismax-tp20643003p20648109.html
Sent from the Solr - User mailing list archive at Nabble.com.
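
One thing that may be worth trying: as far as the dismax documentation describes it, qs is the slop applied to phrases that are explicitly quoted in the user's query, so unquoted terms are not constrained by it. A rough sketch of building such a request is below. The base URL and the helper name `dismax_phrase_url` are hypothetical, and selecting the parser with `defType=dismax` is an assumption about the setup (a handler configured for dismax would work as well).

```python
from urllib.parse import urlencode

def dismax_phrase_url(base, terms, slop):
    """Build a select URL whose q is an explicitly quoted phrase, plus qs."""
    params = {
        "defType": "dismax",    # assumption: dismax parser selected per request
        "q": '"%s"' % terms,    # quote the terms so they are treated as a phrase
        "qs": slop,             # slop for phrases quoted in the user's query
    }
    return base + "?" + urlencode(params)

# Hypothetical host and path; adjust to the actual Solr instance.
print(dismax_phrase_url("http://localhost:8983/solr/select", "child custody", 5))
```

Whether this gives the desired filtering depends on the handler's other settings (qf, mm, and so on), so it should be read as a starting point for experiments rather than a confirmed fix.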
Using Solr for indexing emails
Hi,

A while ago I implemented searching emails with Solr for my IMAP server (www.dovecot.org). It seems to work OK, but now I'm having a bit of trouble trying to figure out how to implement searching from multiple mailboxes efficiently. It would be great if someone had suggestions on how to do things better.

The main problem is that before doing the search, I first have to check if there are any unindexed messages and then add them to Solr. This is done using a query like:

- fl=uid
- rows=1
- sort=uid desc
- q=uidv: box: user:

So it returns the highest IMAP UID field (which is an always-ascending integer) for the given mailbox (you can ignore the uidvalidity). I can then add all messages with higher UIDs to Solr before doing the actual search.

When searching multiple mailboxes, the above query would have to be sent for every mailbox separately. That really doesn't seem like the best solution, especially when there are a lot of mailboxes. But I don't think Solr has a way to return "highest uid field for each box:"? Is the above query even efficient for a single mailbox?

I did consider using separate documents for storing the highest UID for each mailbox, but that causes annoying desynchronization possibilities, especially because currently I can just keep sending documents to Solr without locking and let it drop duplicates automatically (should be rare). With per-mailbox highest-uid documents I can't really see a way to do this without locking, or without allowing duplicate fields to be added and later having some garbage collection delete all but the one highest value (annoyingly complex).

I could of course also keep track of what's indexed on Dovecot's side, but that could also lead to desynchronization issues and I'd like to avoid them.

I guess the ideal solution would be if it was somehow possible to create a SQL-like trigger that updates the per-mailbox highest-uid document whenever adding a new document with a higher UID value.
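
To make the per-mailbox lookup concrete, the requests described above could be built roughly like this. It is only an illustration of the current one-request-per-mailbox approach, not of Dovecot's actual code: the base URL, the function names and the sample user, mailbox and uidvalidity values are all hypothetical, and no escaping of special query characters is attempted.

```python
from urllib.parse import urlencode

def highest_uid_params(user, box, uidv):
    """Parameters asking Solr for the single highest uid in one mailbox."""
    return {
        "fl": "uid",          # only the uid field is needed
        "rows": 1,            # one row: the top hit after sorting
        "sort": "uid desc",   # highest uid first
        "q": "uidv:%s box:%s user:%s" % (uidv, box, user),
    }

def highest_uid_urls(base, user, mailboxes):
    """One URL per (box, uidv) pair, i.e. one request for every mailbox."""
    return [
        base + "?" + urlencode(highest_uid_params(user, box, uidv))
        for box, uidv in mailboxes
    ]

# Hypothetical values for illustration only.
for url in highest_uid_urls("http://localhost:8983/solr/select", "alice",
                            [("INBOX", 1), ("Sent", 2)]):
    print(url)
```

The number of requests grows linearly with the number of mailboxes, which is the cost the message above is trying to avoid.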