Re: Omitting norms question

2010-03-19 Thread Marc Sturlese
Should I include (not omit) norms on any fields that I would like to boost via a boost query/function query? You don't have to set norms to use boost queries or functions. You only have to set them when you want to boost docs or fields at indexing time. What about sortable fields? Facetable fields?

Re: good spell dictionary

2010-03-19 Thread michaelnazaruk
Help me, please! Where can I buy a good spell dictionary?

Re: good spell dictionary

2010-03-19 Thread Jeff Zhang
Erick's right, use the terms in the index. On Fri, Mar 19, 2010 at 5:05 PM, michaelnazaruk michaelnaza...@gmail.com wrote: Help me, please! Where I can only buy good spell dictionary?

Re: good spell dictionary

2010-03-19 Thread Markus Jelsma
/usr/share/dict/american-english On Friday 19 March 2010 10:05:50 michaelnazaruk wrote:

Re: some synonym clarifications

2010-03-19 Thread Markus Jelsma
On Thursday 18 March 2010 17:47:45 Mark Fletcher wrote: Hi, Thanks for the mail. I had tried the WIKI. My doubts remaining were mainly:- 1. If we have synonyms specified and they replace your search keyword with the ones specified wouldn't we face a risk of our original keyword

StreamingUpdateSolrServer being inefficient when adding is not as fast as emptying its queue

2010-03-19 Thread Tim Terlegård
StreamingUpdateSolrServer logs "starting runner: ...", sends a POST with <stream>...</stream> and, I guess, also opens a new HTTP connection every time it has managed to empty its queue. In StreamingUpdateSolrServer.java it says this: // info is ok since this should only happen once for each thread

Switching cores dynamically

2010-03-19 Thread muneeb
Hi, I have indexed almost 7 million articles on two separate cores, each with their own conf/ and data/ folder, i.e. they have their individual index. What I normally do is use core0 for querying and core1 for any updates, and once updates are finished I copy the index of core1 to core0's data

Re: Switching cores dynamically

2010-03-19 Thread Michael Kuhlmann
On 03/19/10 11:18, muneeb wrote: Hi, I have indexed almost 7 million articles on two separate cores, each with their own conf/ and data/ folder, i.e. they have their individual index. What I normally do is, use core0 for querying and core1 for any updates and once updates are finished i

Re: Switching cores dynamically

2010-03-19 Thread David Stuart
Using a multicore setup should do the trick; see http://wiki.apache.org/solr/CoreAdmin, specifically the swap option. Cheers David Stuart On 19 Mar 2010, at 10:18, muneeb muneeba...@hotmail.com wrote: Hi, I have indexed almost 7 million articles on two separate cores, each with their own
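
A rough sketch of what the swap option looks like as a request: the CoreAdmin handler takes action=SWAP together with core and other parameters. The base URL below (localhost:8983 and the /solr/admin/cores path) is an assumption about a default multicore setup, and the helper name swap_url is made up for illustration; core0 and core1 come from the question.

```python
from urllib.parse import urlencode

def swap_url(core, other, base="http://localhost:8983/solr/admin/cores"):
    """Build a CoreAdmin SWAP request URL.

    The default base URL is an assumption; adjust host, port and
    path to match the actual deployment.
    """
    params = {"action": "SWAP", "core": core, "other": other}
    return base + "?" + urlencode(params)

# Swap the freshly updated core1 with the live core0.
print(swap_url("core1", "core0"))
```

Issuing a GET against that URL once the updates on core1 are finished would exchange the two cores, so no index files have to be copied by hand.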

Re: some synonym clarifications

2010-03-19 Thread Mark Fletcher
Thanks Marcus! I got it. BR, Mark. On Fri, Mar 19, 2010 at 5:50 AM, Markus Jelsma mar...@buyways.nl wrote: On Thursday 18 March 2010 17:47:45 Mark Fletcher wrote: Hi, Thanks for the mail. I had tried the WIKI. My doubts remaining were mainly:- 1. If we have synonyms specified

SOLR-1316 How To Implement this autosuggest component ???

2010-03-19 Thread stocki
Hello, I am trying to implement the autosuggest component from this link: http://issues.apache.org/jira/browse/SOLR-1316 but I have no idea how to do this. Can anyone give me some tips?

Re: Generating a sitemap

2010-03-19 Thread Erik Hatcher
Jon - Very cool use of VelocityResponseWriter! Would you happen to have a sitemap.vm template to contribute? I realize there'd need to be an external URL configurable, but this would be trivially added as a request parameter and leveraged in the template. Erik p.s. Anyone

Optional terms in dismax query

2010-03-19 Thread Andrea Campi
Hi all, a customer recently asked me an interesting question. They are a travel website, and they would like to support queries like Easter on a tropical island beach. They would like this to match anything that would be matched by on a tropical island beach, but also match e.g. Easter island

Re: SOLR-1316 How To Implement this autosuggest component ???

2010-03-19 Thread Erick Erickson
This link might help. Although it talks about getting the data in with DIH, you can skip that part http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

Re: Return all Facets?

2010-03-19 Thread Smiley, David W.
Jeesh, I really need to check Jira more! This is the second time I re-invented a wheel. At least in this case it was semi-trivial. My implementation includes a regexp that a field must match for inclusion. I suppose an even more thorough implementation would add a regexp to specifically

Re: How many facet values are too many?

2010-03-19 Thread Smiley, David W.
On Mar 18, 2010, at 10:53 PM, Andy wrote: My understanding is that too many facet values will decrease performance How many is too many? Are there any rules of thumb for this? 2 related questions: - I expect a facet field to have many values (values are user generated), any thing I

better results and differentiate my products

2010-03-19 Thread stocki
Hello. I am trying to get better results from my search, so I need some help ;) Here is my problem, or rather how I want my search to work: if I search for TOMTOM and my results are something like this. autosuggestion. MERIAN scout Themenguide - Feinschmecker - für Garmin und TomTom {

Re: Generating a sitemap

2010-03-19 Thread Jon Baer
It's unfortunately actually a pretty domain specific thing (urls, content, etc), there are also limits @ certain points (see ... but we took CNN.com as a model, for example: http://www.cnn.com/video_sitemap_index.xml http://www.cnn.com/sitemap_videos_0001.xml Then you just line up the big 3 w/

Re: Return all Facets?

2010-03-19 Thread homerlex
It's still not clear to me how to use the LukeRequestHandler (from the API) to get a list of all existing facet fields. Is there an example somewhere? Thanks for the help!

RE: PDFBox/Tika Performance Issues

2010-03-19 Thread Giovanni Fernandez-Kincade
Yeah I had tested it previously and that works... -Original Message- From: Mattmann, Chris A (388J) [mailto:chris.a.mattm...@jpl.nasa.gov] Sent: Friday, March 19, 2010 12:04 AM To: solr-user@lucene.apache.org Subject: Re: PDFBox/Tika Performance Issues Hi Giovanni, Let's try and

Re: Omitting norms question

2010-03-19 Thread blargy
OK, so if I wanted to add boost to fields at indexing time then I should include norms. On the other hand, if I just want to boost at query time then it's quite all right to omit norms. Anyone mind explaining what norms are in layman's terms ;) Marc Sturlese wrote: Should I include not

Re: SOLR-1316 How To Implement this autosuggest component ???

2010-03-19 Thread Andrzej Bialecki
On 2010-03-19 13:03, stocki wrote: hello.. i try to implement autosuggest component from these link: http://issues.apache.org/jira/browse/SOLR-1316 but i have no idea how to do this !?? can anyone get me some tipps ? Please follow the instructions outlined in the JIRA issue, in the comment

Re: PDFBox/Tika Performance Issues

2010-03-19 Thread Mattmann, Chris A (388J)
Ah, OK. Let me try and stand up a SolrCell instance and perform the same test you are and see if I can duplicate it. Hopefully I can get back to you today on this... Cheers, Chris On 3/19/10 7:43 AM, Giovanni Fernandez-Kincade gfernandez-kinc...@capitaliq.com wrote: Yeah I had tested it

RE: Omitting norms question

2010-03-19 Thread Steven A Rowe
Hi blargy, Norms are: - a field-specific multiplicative document scoring factor - the product of three factors: user-settable 1) field boost and 2) document boost (both default to 1.0), along with the 3) field length norm, defined in DefaultSimilarity as 1/sqrt(# terms). - encoded as a
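
To make the three factors concrete, here is a small sketch of the product described above. It only mirrors the formula as stated in the message (field boost times document boost times 1/sqrt(# terms)); it leaves out the encoding step, and the function name field_norm is made up for illustration, not a Lucene API.

```python
import math

def field_norm(num_terms, field_boost=1.0, doc_boost=1.0):
    """Unencoded norm for one field of one document.

    Product of field boost, document boost (both default to 1.0)
    and the field length norm 1/sqrt(number of terms).
    """
    if num_terms <= 0:
        raise ValueError("num_terms must be positive")
    length_norm = 1.0 / math.sqrt(num_terms)
    return field_boost * doc_boost * length_norm

# A 4-term field with default boosts, then the same field boosted 2x at index time.
print(field_norm(4))
print(field_norm(4, field_boost=2.0))
```

Shorter fields get a larger norm, which is why a match in a short title tends to count for more than the same match in a long body field.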

Re: Switching cores dynamically

2010-03-19 Thread Henrib
Hi, You could (theoretically) reduce the down-time to zero using a 'swap' command: http://wiki.apache.org/solr/CoreAdmin?highlight=%28swap%29#SWAP Cheers Henrib muneeb wrote: Hi, I have indexed almost 7 million articles on two separate cores, each with their own conf/ and data/ folder,

place of log4j.properties file

2010-03-19 Thread Király Péter
Hi, on page 205 of the Solr 1.4 Enterprise Search Server book there is an example of how to reference the log4j.properties file from Jetty. I tried that and several other methods (like -Dlog4j.properties=path to file), but the only working way was to create a WEB-INF/classes directory inside

Multicore and TermVectors

2010-03-19 Thread Christian Fontana
Hello. I'm new to Solr and I'm trying to set up Solr with a multicore configuration. All cores share the same schema.xml and I'm using index sharding. I need term vectors data to generate tag clouds but when I try to run a search across all cores to retrieve term vectors data the engine raises

Re: PDFBox/Tika Performance Issues

2010-03-19 Thread Grant Ingersoll
On Mar 16, 2010, at 6:55 PM, Giovanni Fernandez-Kincade wrote: 3. I took the resulting tika-app-0.7-SNAPSHOT.jar, copied it to the /Lib folder for my Solr Core, and renamed it to the name of the existing Tika Jar (tika-0.3.jar). What version are you on of Solr? It's been a while

RE: PDFBox/Tika Performance Issues

2010-03-19 Thread Giovanni Fernandez-Kincade
Solr Specification Version: 1.4.0.2009.10.14.08.05.59 Solr Implementation Version: nightly exported - yonik - 2009-10-14 08:05:59 Lucene Specification Version: 2.9.1-dev Lucene Implementation Version: 2.9.1-dev 824988 - 2009-10-13 21:47:13 Current Time: Fri Mar 19 13:11:31 EDT 2010 Server Start

Re: PDFBox/Tika Performance Issues

2010-03-19 Thread Grant Ingersoll
Can you try trunk? On Mar 19, 2010, at 1:12 PM, Giovanni Fernandez-Kincade wrote: Solr Specification Version: 1.4.0.2009.10.14.08.05.59 Solr Implementation Version: nightly exported - yonik - 2009-10-14 08:05:59 Lucene Specification Version: 2.9.1-dev Lucene Implementation Version: 2.9.1-dev

Re: Term Highlighting without store text in index

2010-03-19 Thread dbejean
Thank you, I will test this. Alexey-34 wrote: Hey Dominique, See http://www.lucidimagination.com/search/document/5ea8054ed8348e6f/highlight_arbitrary_text#3799814845ebf002 Although it might not be a good solution for huge texts, wildcard/phrase queries.

Re: [ANN] Zoie Solr Plugin - Zoie Solr Plugin enables real-time update functionality for Apache Solr 1.4+

2010-03-19 Thread brad anderson
Indeed, which is why I'm wondering what is Zoie adding if you still need to commit to search recent documents. Does anyone know? Thanks, Brad On 18 March 2010 19:41, Erik Hatcher erik.hatc...@gmail.com wrote: When I don't do the commit, I cannot search the documents I've indexed. - that's

Multi Select Facets through Java API

2010-03-19 Thread homerlex
I have a facet field called Cars. I want the user to be able to select multiple values (Camaro, Corvette, etc) and the results should include all records with Cars = Camaro OR Cars = Corvette. Are there samples somewhere on how to do this with the Java API? Does anything special need to be set
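
One possible shape for such a request, sketched as plain request parameters rather than client code: the selected values become a single OR filter query, and the filter is tagged and then excluded when faceting so that the counts for unselected values stay visible. The tag name "sel" and the helper multi_select_params are made up for illustration. With SolrJ, the same parameter values could presumably be set on a SolrQuery, for example via addFilterQuery and addFacetField.

```python
from urllib.parse import urlencode

def multi_select_params(field, selected, tag="sel"):
    """Request parameters for a multi-select facet on one field.

    The filter query is tagged, and the facet excludes that tag, so
    facet counts are computed as if the filter were not applied.
    """
    params = [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.field", "{!ex=%s}%s" % (tag, field)),
    ]
    if selected:
        # Quote each value and escape embedded double quotes.
        quoted = " OR ".join('"%s"' % v.replace('"', '\\"') for v in selected)
        params.append(("fq", "{!tag=%s}%s:(%s)" % (tag, field, quoted)))
    return params

print(urlencode(multi_select_params("Cars", ["Camaro", "Corvette"])))
```

If the facet counts should shrink to match the current selection instead, the tag and ex local parameters can simply be dropped.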

Re: Issue with exact matching

2010-03-19 Thread Alex Thurlow
Thanks so much. That works really well now. So this brings up a complaint I have with the Solr documentation. I see very few actual examples. If I had seen any example of searching for a multi-word search, I assume it would have had these parentheses. -Alex On 3/18/2010 5:54 PM,

Re: How many facet values are too many?

2010-03-19 Thread Andy
Are you referring to the facet.limit parameter? If it is set to N, does that mean Solr only has to process N values, or that Solr will still process all the values but only return the top N ones? --- On Fri, 3/19/10, Smiley, David W. dsmi...@mitre.org wrote: From: Smiley, David W.

Re: How many facet values are too many?

2010-03-19 Thread Smiley, David W.
Yes, the limit parameter (facet.limit) is closest in concept to the LIMIT clause in SQL. It defaults to 100 so you'll see no more than that many facet values. There's also facet.mincount, which will establish a threshold if you don't want to see counts less than this number. It's common to set it to 1 so you
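
As a small illustration of the two parameters mentioned above, the following sketch builds the query string for a facet request. The field name "tags" and the helper facet_params are hypothetical; the mincount default of 1 here follows the common setting described in the message, not Solr's own default.

```python
from urllib.parse import urlencode

def facet_params(field, limit=100, mincount=1):
    """Parameters for faceting on one field.

    facet.limit caps how many facet values are returned;
    facet.mincount hides values whose count is below the threshold.
    """
    return {
        "q": "*:*",
        "rows": 0,  # only the facet counts are of interest here
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,
        "facet.mincount": mincount,
    }

# Ask for at most the 20 most frequent values of a hypothetical "tags" field.
print(urlencode(facet_params("tags", limit=20)))
```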

Delta-Import quick question

2010-03-19 Thread blargy
Does the DIH delta-import automatically commit and optimize after it's done? ... <str name="Total Changed Documents">8120</str> <str name="Total Documents Processed">0</str> ... What is the difference between these? Usually I see the Total Documents Processed.

Re: Issue with exact matching

2010-03-19 Thread Erick Erickson
Glad I could help. The wonderful thing about the Wiki is that all you have to do is create an account to edit it. The new folks coming in often have a perspective on things that old timers don't remember as being confusing, regardless of how painful *their* learning curve was. Best, Erick On