Re: Evangelism
DollarDays.com is currently using it, and we display the "powered by" logo as at least a gesture of giving back to the community.

Ryan T. Grange, IT Manager
DollarDays International, Inc.
rgra...@dollardays.com
(480) 922-8155 x106

On 4/29/2010 11:10 AM, Daniel Baughman wrote:
Hi, I'm new to the list here. I'd like to steer someone in the direction of Solr, and I see the list of companies using Solr, but none have a "powered by Solr" logo or anything. Does anyone have any great links with evidence of majorly successful Solr projects?
Thanks in advance,
Dan B.
Re: QueryElevationComponent blues
I'd read that too, but in the debug data queryBoosting is showing matches on our int-typed identifiers (though it does show it as 123456). Is the problem that it can match against an integer but can't reorder them in the results? This seems unlikely, as using a standard query and elevation does cause otherwise lower results to jump to the top of the results. I've looked at the source and noticed the check for a string type in there. I'm not sure why my Solr instance seems okay with an int for a unique identifier. Tried forceElevation=true with qt=dismax and still no effect on placement. We don't want to give up field, phrase, and formula boosting when using the standard request handler just to have elevation work.

Ryan T. Grange, IT Manager
DollarDays International, Inc.
rgra...@dollardays.com
(480) 922-8155 x106

On 3/8/2010 11:13 PM, Jon Baer wrote:
Maybe some things to try:
* make sure your uniqueKey is a string field type (i.e., if using int it will not work)
* forceElevation set to true (if sorting)
- Jon

On Mar 9, 2010, at 12:34 AM, Ryan Grange wrote:
Using Solr 1.4. Was using the standard query handler, but needed the boost-by-field functionality of qf from dismax, so we altered the query to boost certain phrases against a given field. We were using QueryElevationComponent ("elevator" from solrconfig.xml) for one particular entry we wanted at the top, but because we aren't using a pure q value, elevator never finds a match to boost. We didn't realize it at the time because the record we were elevating eventually became the top response anyway. Recently we added a _val_:formula to the q value to juice records based on a value in the record. Now we need to push a few other records to the top, but we've lost the ability to use elevate.xml to do it. Tried switching to dismax using qf, pf, qs, ps, and bf with a "pure" q value, and debug showed queryBoost with a match and records, but they weren't moved to the top of the result set.
What would really help is if there were something for elevator akin to spellcheck.q, say an elevation.q, so I could pass in the actual user phrase while still performing all the other field score boosts in the q parameter. Alternatively, if anyone can explain why I'm running into problems getting QueryElevationComponent to move the results in a dismax query, I'd be very thankful.

-- Ryan T. Grange
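For readers landing on this thread, the elevate.xml format under discussion looks roughly like this (the query text and doc ID below are placeholders, not values from this thread):

```xml
<!-- elevate.xml: pins specific documents to the top of matching queries -->
<elevate>
  <query text="blue widgets">
    <doc id="123456" />
  </query>
</elevate>
```

The component matches entries against the user's query text, so once extra boost clauses like _val_ formulas are folded into q, the string no longer matches the elevate.xml entry and it is skipped — which is consistent with the behavior described above.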
QueryElevationComponent blues
Using Solr 1.4. Was using the standard query handler, but needed the boost-by-field functionality of qf from dismax, so we altered the query to boost certain phrases against a given field. We were using QueryElevationComponent ("elevator" from solrconfig.xml) for one particular entry we wanted at the top, but because we aren't using a pure q value, elevator never finds a match to boost. We didn't realize it at the time because the record we were elevating eventually became the top response anyway.

Recently we added a _val_:formula to the q value to juice records based on a value in the record. Now we need to push a few other records to the top, but we've lost the ability to use elevate.xml to do it. Tried switching to dismax using qf, pf, qs, ps, and bf with a "pure" q value, and debug showed queryBoost with a match and records, but they weren't moved to the top of the result set.

What would really help is if there were something for elevator akin to spellcheck.q, say an elevation.q, so I could pass in the actual user phrase while still performing all the other field score boosts in the q parameter. Alternatively, if anyone can explain why I'm running into problems getting QueryElevationComponent to move the results in a dismax query, I'd be very thankful.

-- Ryan T. Grange
Thank you all for Solr 1.4
Not posting a problem or a solution. Just wanted to get word back to the Solr developers, bug testers, and mailing list gurus how much I love Solr 1.4. Our site search is more accurate, the search box offers better suggestions much faster than before, and the elevate functionality has appeased the product promotion department to no end. I'd offer you a thousand thanks, but the spam filters would hate it, so the safest way to thank you...

Me me = new Me();
for (int c = 0; c < 1000; c++) {
    me.thankYou();
}

Ryan T. Grange, IT Manager
DollarDays International, Inc.
http://www.dollardays.com/
rgra...@dollardays.com
Re: Upgrading 1.2.0 to 1.3.0 solr
Actually, it was a very straightforward installation. I just tweaked the configurations afterward to better support the new 1.3.0 features I wanted to use (spelling suggestions and faceting).

Ryan T. Grange, IT Manager
DollarDays International, Inc.
rgra...@dollardays.com
(480) 922-8155 x106

Francis Yakin wrote:
Do you have experience upgrading from 1.2.0 to 1.3.0? In other words, do you have any suggestions, or better yet, any docs or instructions for doing this? I'd appreciate it if you can help me.
Thanks, Francis

-----Original Message-----
From: Ryan Grange [mailto:rgra...@dollardays.com]
Sent: Thursday, June 11, 2009 8:39 AM
To: solr-user@lucene.apache.org
Subject: Re: Upgrading 1.2.0 to 1.3.0 solr

I disagree with waiting that month. At this point, most of the kinks in the upgrade from 1.2 to 1.3 have been worked out. Waiting for 1.4 to come out risks you becoming a guinea pig for the upgrade procedure. Plus, if any show-stoppers come along delaying 1.4, you delay implementation of your auto-complete function. When 1.4 comes out, if it has any features you feel compel an upgrade, you can begin another round of testing and migration, but don't upgrade a production system just for the sake of being bleeding edge.

Ryan T. Grange, IT Manager
DollarDays International, Inc.
rgra...@dollardays.com
(480) 922-8155 x106

Otis Gospodnetic wrote:
Francis,
If you can wait another month or so, you could skip 1.3.0 and jump to 1.4, which will be released soon.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

From: Francis Yakin
To: "solr-user@lucene.apache.org"
Sent: Wednesday, June 10, 2009 1:17:25 AM
Subject: Upgrading 1.2.0 to 1.3.0 solr

I am in the process of upgrading our Solr 1.2.0 to Solr 1.3.0. Our Solr 1.2.0 is working fine now; we just want to upgrade it because we have an application that requires some functionality from 1.3.0 (we call it autocomplete).
Currently our config files on 1.2.0 are as follows:
Solrconfig.xml
Schema.xml (we wrote this in house)
Index_synonyms.txt (we also modified and wrote this in house)
Scripts.conf
Protwords.txt
Stopwords.txt
Synonyms.txt

I understand that 1.3.0 has a new solrconfig.xml. My questions are:
1) What config files can I reuse from 1.2.0 for 1.3.0? Can I use the same schema.xml?
2) For solrconfig.xml, can I use the 1.2.0 version, or do I have to stick with the 1.3.0 one? If I need to stick with 1.3.0, what do I need to change?

As of right now I am testing it in my sandbox, so it doesn't work. Please advise; if you have any docs for upgrading 1.2.0 to 1.3.0, let me know.

Thanks in advance,
Francis

Note: I attached my solrconfig and schema.xml in this email.
-Inline Attachment Follows- {edited out by Ryan for brevity}
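On the "what do I need to change" question, a low-risk approach is to start from the 1.3 example solrconfig.xml and diff your hand-edited 1.2 files against it, porting customizations over one by one rather than reusing the old file wholesale. A small sketch of the diff step using Python's difflib (the file contents below are made-up stand-ins for the real configs):

```python
import difflib

def config_diff(old_text, new_text):
    """Return unified-diff lines between two config file contents."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="solrconfig-1.2.xml",
        tofile="solrconfig-1.3.xml",
        lineterm="",
    ))

# Illustrative stand-ins; in practice read the real files from disk.
old = "<config>\n  <adminPath>/admin</adminPath>\n</config>"
new = ("<config>\n  <adminPath>/admin</adminPath>\n"
       "  <requestHandler name=\"/spellcheck\"/>\n</config>")

for line in config_diff(old, new):
    print(line)
```

Lines prefixed with "+" are additions in the 1.3 example that you may need to adopt; "-" lines are your local customizations to carry forward.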
Re: Upgrading 1.2.0 to 1.3.0 solr
I disagree with waiting that month. At this point, most of the kinks in the upgrade from 1.2 to 1.3 have been worked out. Waiting for 1.4 to come out risks you becoming a guinea pig for the upgrade procedure. Plus, if any show-stoppers come along delaying 1.4, you delay implementation of your auto-complete function. When 1.4 comes out, if it has any features you feel compel an upgrade, you can begin another round of testing and migration, but don't upgrade a production system just for the sake of being bleeding edge.

Ryan T. Grange, IT Manager
DollarDays International, Inc.
rgra...@dollardays.com
(480) 922-8155 x106

Otis Gospodnetic wrote:
Francis,
If you can wait another month or so, you could skip 1.3.0 and jump to 1.4, which will be released soon.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

From: Francis Yakin
To: "solr-user@lucene.apache.org"
Sent: Wednesday, June 10, 2009 1:17:25 AM
Subject: Upgrading 1.2.0 to 1.3.0 solr

I am in the process of upgrading our Solr 1.2.0 to Solr 1.3.0. Our Solr 1.2.0 is working fine now; we just want to upgrade it because we have an application that requires some functionality from 1.3.0 (we call it autocomplete).

Currently our config files on 1.2.0 are as follows:
Solrconfig.xml
Schema.xml (we wrote this in house)
Index_synonyms.txt (we also modified and wrote this in house)
Scripts.conf
Protwords.txt
Stopwords.txt
Synonyms.txt

I understand that 1.3.0 has a new solrconfig.xml. My questions are:
1) What config files can I reuse from 1.2.0 for 1.3.0? Can I use the same schema.xml?
2) For solrconfig.xml, can I use the 1.2.0 version, or do I have to stick with the 1.3.0 one? If I need to stick with 1.3.0, what do I need to change?

As of right now I am testing it in my sandbox, so it doesn't work. Please advise; if you have any docs for upgrading 1.2.0 to 1.3.0, let me know.

Thanks in advance,
Francis

Note: I attached my solrconfig and schema.xml in this email.
-Inline Attachment Follows- {edited out by Ryan for brevity}
Re: How to get the score in the result
It would help to see your query, but you basically add ",score" to whatever you're sending over in the "fl" parameter. If you aren't passing "fl", you may want to use "fl=*,score".

Ryan T. Grange, IT Manager
DollarDays International, Inc.
rgra...@dollardays.com
(480) 922-8155 x106

ayyanar wrote:
final QueryResponse queryResponse = server.query(query);
final List results = queryResponse.getBeans(DocumentWrapper.class);
This is the way I do the query in Solr. DocumentWrapper is my class which maps to the document fields. Can anyone let me know how the DocumentWrapper can return the score of the document? How do I get the Solr score of each document?
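In raw-HTTP terms the change is just making sure "score" appears in the fl parameter; the SolrJ query object ultimately sends the same parameter. A hedged, API-agnostic sketch of building such a request URL (the host, core path, and field names are placeholders):

```python
from urllib.parse import urlencode

def solr_query_url(base, q, fl="*"):
    """Build a Solr select URL, ensuring 'score' is in the field list (fl)."""
    fields = [f.strip() for f in fl.split(",")]
    if "score" not in fields:
        fields.append("score")
    return base + "?" + urlencode({"q": q, "fl": ",".join(fields)})

# Example: request all stored fields plus the relevance score.
url = solr_query_url("http://localhost:8983/solr/select", "state:NJ")
```

With score requested, each returned document carries a score field alongside the stored fields, which a bean/wrapper class can then expose like any other value.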
Re: Deletion of indexes.
I got around this problem by using a trigger on the table I index that records the IDs of deleted items in a queue table, so when my next Solr update rolls around, it sends a remove request for each queued record's ID. Once the Solr deletion is done, I remove that ID from the queue table. Of course, you have to be on MySQL 5.0 or above to have triggers available to you. Otherwise, you'll have to manually add something to your deletion queries to record all the IDs you're about to delete to a queue table.

Ryan T. Grange, IT Manager
DollarDays International, Inc.

Tushar_Gandhi wrote:
Hi,
I am using Solr 1.3. I am facing a problem deleting from the index. I have a MySQL database. Some of the data from the database has been deleted, but the index entries for those records are still present. Because of that, I am getting those records in search results. I don't want this type of behavior. I want to delete those index entries which are no longer present in the database. Also, I don't know which records were deleted from the database yet are still present in the index. Is there any way to solve this problem? I also think that re-indexing will not solve my problem, because it will only re-index the records which are present in the database and won't touch the index entries which have no reference in the database. Does anyone have a solution for this?
Thanks, Tushar
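The trigger-plus-queue pattern above can be sketched end to end. SQLite is used here only so the example is self-contained and runnable (MySQL 5.0 trigger syntax is similar but not identical), and all table and column names are made up:

```python
import sqlite3

# In-memory database standing in for the real MySQL instance.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE solr_delete_queue (id INTEGER PRIMARY KEY);

-- Whenever a row is deleted from the indexed table, remember its ID.
CREATE TRIGGER queue_deleted_ids AFTER DELETE ON products
BEGIN
    INSERT INTO solr_delete_queue (id) VALUES (OLD.id);
END;
""")

conn.execute("INSERT INTO products VALUES (1, 'widget'), (2, 'gadget')")
conn.execute("DELETE FROM products WHERE id = 2")

# The next Solr update pass would read the queue, send a delete-by-id
# request for each row, then clear the queue rows it processed.
queued = [row[0] for row in conn.execute("SELECT id FROM solr_delete_queue")]
```

The point of the queue table is durability: if the Solr update job dies mid-run, the pending deletions are still sitting in the queue for the next pass.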
Re: Release date of SOLR 1.3
It would be nice to see some kind of update to the Solr website regarding what's holding up a 1.3 release. I look at that a lot more often than I look at this mailing list to see whether or not there's a new version I should be looking to test out.

Ryan Grange, IT Manager
DollarDays International, LLC
[EMAIL PROTECTED]
480-922-8155 x106

Noble Paul wrote:
If a feature that is really big (say distributed search) is half-baked and not ready for primetime, we must hold the release till it is completely fixed. That is not to say that every possible enhancement to that feature must be incorporated before we can do a release. If the new changes are not going to break the existing system, we can go ahead. A faster release cycle can drive the adoption of a lot of new features, because users are not very confident of nightly builds and tend to stick with the latest release available. SolrJ is a very good example: so many users still have their own sweet client libraries in production because they think SolrJ is still in development and there is no release.
--Noble

On Wed, May 21, 2008 at 11:46 PM, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: One year between releases is a very long time for such a useful and
: dynamic system. Are project leaders willing to (re)consider the
: development process to prioritize improvements/features scope into
: chunks that can be accomplished in shorter time frames - say 90 days?
: In my experience, short dev iteration cycles that fix time and vary
: scope produce better results from all perspectives.

I'm all in favor of shorter release cycles ... but not everything can be broken down into chunks that can be implemented in a small time frame, and even if they can, you don't always know that the solution to "chunk1" is leading down the right path. Solr (and the Lucene community as a whole) has a long history and a deep "cultural" belief in aggressive backwards compatibility, so there is a lot of resistance to the idea of a release that includes the first "chunk" of a larger feature without strong confidence that the API provided by that chunk is something people are willing to maintain for a long time.

At the end of the day, what gets people motivated to do a release is discussions on solr-dev where someone says: "I think we need to have a release, and I'm willing to be the release manager. I think we should hold off on committing patches X, Y, and Z because they don't seem ready for prime time yet, and I think we should move forward on trying to commit patches A, B, and C because they seem close to done. What does everybody else think?"

-Hoss
Re: Companies Using Solr
It's definitely not immutable. A while back I added DollarDays International. Just remember to be polite and add yourself to the end of the list.

Ryan Grange, IT Manager
DollarDays International, LLC
[EMAIL PROTECTED]
480-922-8155 x106

oleg_gnatovskiy wrote:
Clay Webster wrote:
Hey Folks,
Reminder: http://wiki.apache.org/solr/PublicServers lists the sites using Solr. The listing is a bit thin. I know many people don't know about the list or don't have the time to add themselves to it. I'd like to be able to promote open sourcing more systems (like Solr), and this information would help show it is helping a large community. Feel free to reply directly to me and I can add you. Thanks.
--cw
Clay Webster
Associate VP, Platform Infrastructure
CNET, Inc. (Nasdaq:CNET)

How would you add to that list anyway? It's immutable.
Re: Unparseable date
Solr does use 24-hour dates. Are you positive there are no extraneous characters at the end of your date string, such as carriage returns, spaces, or tabs? I have the same format in the code I've written and have never had a date parsing problem (yet).

Ryan Grange, IT Manager
DollarDays International, LLC
[EMAIL PROTECTED]
480-922-8155 x106

Daniel Andersson wrote:
Hi people,
I've got a date (& time) indexed with every document, defined as: multiValued="false" />
According to the schema.xml file, "The format for this date field is of the form 1995-12-31T23:59:59Z". Yet I'm getting the following error on SOME queries:

Mar 5, 2008 10:32:53 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: java.text.ParseException: Unparseable date: "2008-02-12T15:02:06Z"
    at org.apache.solr.schema.DateField.toObject(DateField.java:173)
    at org.apache.solr.schema.DateField.toObject(DateField.java:83)
    at org.apache.solr.update.DocumentBuilder.loadStoredFields(DocumentBuilder.java:285)
    at com.pjaol.search.solr.component.LocalSolrQueryComponent.luceneDocToSolrDoc(LocalSolrQueryComponent.java:403)
    at com.pjaol.search.solr.component.LocalSolrQueryComponent.mergeResultsDistances(LocalSolrQueryComponent.java:363)
    at com.pjaol.search.solr.component.LocalSolrQueryComponent.process(LocalSolrQueryComponent.java:305)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:158)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:118)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:944)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:326)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:278)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.text.ParseException: Unparseable date: "2008-02-12T15:02:06Z"
    at java.text.DateFormat.parse(DateFormat.java:335)
    at org.apache.solr.schema.DateField.toObject(DateField.java:170)
    ... 27 more

Could this be because we're using 24h instead of 12h? (The example seems to imply that 24h is what should be used, though.)

Thanks in advance!
Kind regards, Daniel
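For what it's worth, the quoted value does match the documented pattern, so the 24h/12h question can be ruled out with a quick check. A sanity-check sketch in Python, where the format string mirrors Solr's documented date form:

```python
from datetime import datetime

raw = "2008-02-12T15:02:06Z"

# %H is the 24-hour field, so 15:02:06 parses cleanly. If Solr still
# rejects a string like this, suspect stray trailing characters
# (whitespace, carriage returns) as suggested in the reply above,
# rather than the hour format itself.
parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%SZ")
```

If the raw string carried a trailing newline, the same strptime call would raise a ValueError, which is the Python analogue of the ParseException in the trace.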
Re: Random search result
Your best bet would probably be to send a query for one record, get the number of matching records, and then repeat several requests for a single record starting at a random number from 1 to the number of records discovered on the first query. So if you do a search for state:NJ with start=0 and rows=1, you can pull out the /response/result[numFound] value. Then you query for the same search criteria with start={random number from 0 to numFound-1} and rows=1, grabbing the result each time. The first set of search results will likely still be in the cache for you to get 10 such queries through quickly. The trick is to make sure you don't pick the same random number twice. Perhaps an associative array using the number as the key, so you can check whether it has already been used.

Ryan Grange, IT Manager
DollarDays International, LLC
[EMAIL PROTECTED]
480-922-8155 x106

Evgeniy Strokin wrote:
I want to get a sample from my search result. Not the first 10, but 10 random (really random, not pseudo-random) documents. For example, if I run a simple query like STATE:NJ with no order-by on any field, just the query, and get the first 10 documents from my result set, will they be a random 10 or pseudo-random, like the first 10 indexed or something like that?
Thank you
Gene
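The two-step approach above (one query for numFound, then repeated single-row queries at random offsets) can be sketched like this. Python's random.sample already guarantees the "don't pick the same number twice" part, and fetch_one stands in for the actual Solr request:

```python
import random

def random_offsets(num_found, wanted=10):
    """Pick distinct random start offsets in [0, num_found)."""
    return random.sample(range(num_found), min(wanted, num_found))

def fetch_random_docs(num_found, fetch_one, wanted=10):
    """fetch_one(start) should rerun the same query with start=<offset>&rows=1."""
    return [fetch_one(start) for start in random_offsets(num_found, wanted)]

# Placeholder fetch function; a real one would hit the Solr select handler.
docs = fetch_random_docs(50, lambda start: {"start": start}, wanted=10)
```

Since every follow-up request repeats the same q, the first query warms the result cache and the ten offset fetches are cheap.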
Re: Transform Update responses with XSLT?
It is absolutely possible to do such a thing. I wish I had more time right now to create even a sample. Unfortunately, while I'm not too bad at XSLT, I haven't used it often enough to whip something up off the top of my head with any hope of it working the first time. I encourage you to read up on XSLT, though, as what you're asking is definitely doable. One possible problem I could see is if you want the XSLT to also generate navigation links for you automatically; basic formatting of the results shouldn't be a problem, though.

Ryan Grange, IT Manager
DollarDays International, LLC
[EMAIL PROTECTED]
480-922-8155 x106

Maximilian Hütter wrote:
Hi,
is there a way to transform a Solr update response with an XSLT stylesheet? It looks like the XSLTResponseWriter is only used for searches.
Best regards,
Max
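As a starting point for that reading, a bare-bones stylesheet over Solr's XML output might look like the untested sketch below; the element names assume the standard <response><result><doc> layout of query responses, and the "id" field name is a placeholder:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <!-- Render each returned document's id field as a list item. -->
  <xsl:template match="/response">
    <ul>
      <xsl:for-each select="result/doc">
        <li><xsl:value-of select="str[@name='id']"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```

An update response has a different (much flatter) structure than a query response, so the match expressions would need adjusting for that case.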
Re: Best practice for storing relational data in Solr
I've found that Solr running on modest hardware (a 2.4 GHz PC running Windows XP Pro for testing changes) is able to index about 23,000 records in under three minutes. Assuming you aren't going to make too many typos in your naming, you should be fine just doing the re-indexing. Try timing your system: make a change to about a thousand records and see how long it takes to index them.

When indexing, I've found it's better to do larger updates in batches. I get up to a few hundred updates ready at a time and commit them at once. That goes much faster than committing each update document individually.

Ryan Grange, IT Manager
DollarDays International, LLC
[EMAIL PROTECTED]
480-922-8155 x106

steve.lillywhite wrote:
Hi all,
This is a (possibly very naive) newbie question regarding Solr best practice. I run a website that displays/stores data on job applicants, together with information on where they came from (e.g. which recruiter), which office they are applying to, etc. This data is stored in a MySQL database. I currently have a basic search facility, but I plan to introduce Solr to improve this, by also storing applicant data in a Solr schema.

My problem is that *related* applicant data can also be updated in the web GUI (e.g. if there was a typo, a recruiter could be changed from “My Rcruiter” to “My Recruiter”), and I don’t know how best to reflect this in the Solr schema. Example: we may have 2 applicants that came from recruiter “My Recruiter”. If the name of this recruiter is altered in the GUI, then I would have to reindex both of those applicants in the Solr schema, which seems like overkill. The alternative would be to not store the recruiter name in the Solr schema, and instead only store its MySQL database identifier. Then I would need to parse any search results from Solr to put in the recruiter name before displaying the data in the GUI.

So I guess I’m asking which of these is the better approach:

1. Use Solr to store the text value of related applicant data that exists in a relational MySQL database. Whenever that data is updated in the database, reindex all dependent entries in the Solr schema. The advantage of this approach, I guess, is that search results can be returned from Solr and displayed as is (if XSLT is used). E.g. a search result for “John Smith” of recruiter “My Recruiter” could be returned in the required HTML format from Solr and displayed in the web GUI without any reformatting or further processing.

2. Use Solr to store database IDs of related applicant data that exists in a relational MySQL database. When that data is updated in the database, there is no need to reindex Solr. However, search results from Solr will need to be parsed before they can be output in the web GUI. E.g. if Solr returns “John Smith” of recruiter with database ID 143, then 143 will need to be mapped back to “My Recruiter” by my application before it can be displayed.

Can anyone offer any guidance here?
Regards
Steve
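If option 1 is chosen, the batching advice from the reply above (queue up a few hundred documents per add request, one commit at the end) can be sketched like this; the batch size of 500 is illustrative, not a Solr requirement:

```python
def batched(docs, size=500):
    """Yield docs in lists of at most `size`, one add request per batch."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

# E.g. 1200 changed records become three add requests (500 + 500 + 200),
# followed by a single commit once all batches are posted.
batches = list(batched(range(1200), size=500))
```

Grouping the adds this way avoids paying commit overhead per document, which is where most of the time goes when records are committed one at a time.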