Features not present in Solr

2010-03-19 Thread Srikanth B
Hello, We are in the process of researching Solr features. I am looking for two things: 1. Features not available in Solr but present in other products like Endeca. 2. What one shouldn't expect from Solr. Any thoughts? Thanks in advance, Srikanth

Re: Issue with exact matching

2010-03-19 Thread Erick Erickson
Glad I could help. The wonderful thing about the Wiki is that all you have to do is create an account to edit it. The new folks coming in often have a perspective on things that "old timers" don't remember as being confusing, regardless of how painful *their* learning curve was Best Erick On

Delta-Import quick question

2010-03-19 Thread blargy
Does the DIH delta-import automatically commit and optimize after it's done? ... 8120 0 ... What is the difference between these? Usually I see the Total Documents Processed. -- View this message in context: http://old.nabble.com/Delta-Import-quick-question-tp27951022p27951022.html Sent from th

Re: Weired behaviour for certain search terms

2010-03-19 Thread Ahmet Arslan
> I tried adding &hl.maxAnalyzedChars=-1 to my search > query but it didn't > help. > Just wanted to know if there are limitations on certain > search terms. > It's a bit strange that Solr is not behaving properly for > certain terms > (especially returning the excerpts in highlighting > diction

Re: How many facet values are too many?

2010-03-19 Thread Smiley, David W.
Yes, the limit parameter is closest in concept to the LIMIT clause in SQL. It defaults to 100 so you'll see no more than that many facet values. There's also minCount which will establish a threshold if you don't want to see counts less than this number. It's common to set it to "1" so you do
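The two settings described in this reply could be put on a request roughly as follows. This is a minimal sketch that only builds the query string: the host, the `/select` path, and the field name `category` are assumptions for illustration, and the parameters are written in their request-parameter form, `facet.limit` and `facet.mincount`.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, field, limit=100, mincount=1):
    """Build a facet-only query URL (no documents returned, just counts)."""
    params = [
        ("q", "*:*"),                  # match all documents
        ("rows", 0),                   # we only want the facet counts
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", limit),        # at most this many facet values
        ("facet.mincount", mincount),  # hide values with fewer hits than this
    ]
    return base_url + "/select?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host and field name, for illustration only.
    print(facet_query_url("http://localhost:8983/solr", "category"))
```

Setting the minimum count to 1 simply drops facet values that match no documents in the current result set.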

Re: How many facet values are too many?

2010-03-19 Thread Andy
Are you referring to the facet.limit parameter? If it is set to N, does that mean Solr only has to process N values, or that Solr will still process all the values but only return the top N ones? --- On Fri, 3/19/10, Smiley, David W. wrote: From: Smiley, David W. Subject: Re: How many facet

Re: Issue with exact matching

2010-03-19 Thread Alex Thurlow
Thanks so much. That works really well now. So this brings up a complaint I have with the Solr documentation. I see very few actual examples. If I had seen any example of searching for a multi-word search, I assume it would have had these parentheses. -Alex On 3/18/2010 5:54 PM, Erick

Multi Select Facets through Java API

2010-03-19 Thread homerlex
I have a facet field called Cars. I want the user to be able to select multiple values (Camaro, Corvette, etc) and the results should include all records with Cars = Camaro OR Cars = Corvette. Are there samples somewhere on how to do this with the Java API? Does anything special need to be set
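One way to express "Cars = Camaro OR Cars = Corvette" is a single filter query that ORs the selected values. The sketch below only builds the parameter strings, so it is independent of any client library; the helper names are hypothetical, and the `{!tag=...}` / `{!ex=...}` local params are the tagging and excluding mechanism available in Solr 1.4, used here on the assumption that the facet counts for Cars should ignore the user's own Cars selection.

```python
def or_filter(field, values):
    """Return a filter such as Cars:("Camaro" OR "Corvette")."""
    quoted = []
    for value in values:
        # Escape backslashes and double quotes so each value stays one phrase.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"%s"' % escaped)
    return "%s:(%s)" % (field, " OR ".join(quoted))


def multi_select_params(field, values, tag):
    """Build fq and facet.field strings for a multi-select facet."""
    return {
        # Tag the filter so the facet on the same field can exclude it.
        "fq": "{!tag=%s}%s" % (tag, or_filter(field, values)),
        "facet.field": "{!ex=%s}%s" % (tag, field),
    }


if __name__ == "__main__":
    print(multi_select_params("Cars", ["Camaro", "Corvette"], "cars"))
```

The resulting strings are ordinary request parameters, so they can be passed through whatever client is in use.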

Re: [ANN] Zoie Solr Plugin - Zoie Solr Plugin enables real-time update functionality for Apache Solr 1.4+

2010-03-19 Thread brad anderson
Indeed, which is why I'm wondering what is Zoie adding if you still need to commit to search recent documents. Does anyone know? Thanks, Brad On 18 March 2010 19:41, Erik Hatcher wrote: > "When I don't do the commit, I cannot search the documents I've indexed." - > that's exactly how Solr witho

RE: PDFBox/Tika Performance Issues

2010-03-19 Thread Giovanni Fernandez-Kincade
Yeah I've been trying that - I keep getting this error when indexing a PDF with a trunk-build: Apache Tomcat/5.5.27 - Error report HTTP Status 500 - org.apache.solr.handler.ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryRes

Re: place of log4j.properties file

2010-03-19 Thread Király Péter
Thanks David! It works. Even with relative path, like -Dlog4j.configuration=file:etc/log4j.properties. Péter - Original Message - From: "Smiley, David W." To: Cc: "Eric Pugh" Sent: Friday, March 19, 2010 5:43 PM Subject: Re: place of log4j.properties file I believe that should

Re: Term Highlighting without store text in index

2010-03-19 Thread dbejean
Thank you, I will test this. Alexey-34 wrote: > > Hey Dominique, > > See > http://www.lucidimagination.com/search/document/5ea8054ed8348e6f/highlight_arbitrary_text#3799814845ebf002 > > Although it might be not good solution for huge texts, wildcard/phrase > queries. > http://issues.apache.or

Re: PDFBox/Tika Performance Issues

2010-03-19 Thread Grant Ingersoll
Can you try trunk? On Mar 19, 2010, at 1:12 PM, Giovanni Fernandez-Kincade wrote: > Solr Specification Version: 1.4.0.2009.10.14.08.05.59 > Solr Implementation Version: nightly exported - yonik - 2009-10-14 08:05:59 > Lucene Specification Version: 2.9.1-dev > Lucene Implementation Version: 2.9.1-

RE: PDFBox/Tika Performance Issues

2010-03-19 Thread Giovanni Fernandez-Kincade
Solr Specification Version: 1.4.0.2009.10.14.08.05.59 Solr Implementation Version: nightly exported - yonik - 2009-10-14 08:05:59 Lucene Specification Version: 2.9.1-dev Lucene Implementation Version: 2.9.1-dev 824988 - 2009-10-13 21:47:13 Current Time: Fri Mar 19 13:11:31 EDT 2010 Server Start Tim

Re: PDFBox/Tika Performance Issues

2010-03-19 Thread Grant Ingersoll
On Mar 16, 2010, at 6:55 PM, Giovanni Fernandez-Kincade wrote: > > 3. I took the resulting tika-app-0.7-SNAPSHOT.jar, copied it to the /Lib > folder for my Solr Core, and renamed it to the name of the existing Tika Jar > (tika-0.3.jar). What version are you on of Solr? It's been a while s

Re: place of log4j.properties file

2010-03-19 Thread Smiley, David W.
I believe that should have been -Dlog4j.configuration=file:/c:/foo/log4j.properties I've done this sort of thing many times before. I've also found it helpful to add -Dlog4j.debug (no value needed) to debug logging. http://logging.apache.org/log4j/1.2/manual.html ~ David Smiley Author: http

Multicore and TermVectors

2010-03-19 Thread Christian Fontana
Hello. I'm new to Solr and I'm trying to set up Solr with a multicore configuration. All cores share the same schema.xml and I'm using index sharding. I need term vectors data to generate tag clouds but when I try to run a search across all cores to retrieve term vectors data the engine raises m

place of log4j.properties file

2010-03-19 Thread Király Péter
Hi, on page 205 of the Solr 1.4 Enterprise Search Server book there is an example of how to reference the log4j.properties file from Jetty. I tried that and several other methods (like -Dlog4j.properties=), but the only working way was to create a WEB-INF/classes directory inside the solr.war

Re: Switching cores dynamically

2010-03-19 Thread Henrib
Hi, You could (theoretically) reduce the down-time to zero using a 'swap' command: http://wiki.apache.org/solr/CoreAdmin?highlight=%28swap%29#SWAP Cheers Henrib muneeb wrote: > > Hi, > > I have indexed almost 7 million articles on two separate cores, each with > their own conf/ and data/ folde
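The swap suggested here is a single CoreAdmin request. A minimal sketch that builds the URL, assuming a Solr instance at `http://localhost:8983/solr` and the core names `core0` and `core1` from the original question; the host and port are assumptions.

```python
from urllib.parse import urlencode


def swap_url(solr_base, core, other):
    """Build the CoreAdmin URL that swaps two cores by name."""
    query = urlencode({"action": "SWAP", "core": core, "other": other})
    return solr_base + "/admin/cores?" + query


if __name__ == "__main__":
    # Hypothetical host; core names taken from the question being answered.
    print(swap_url("http://localhost:8983/solr", "core0", "core1"))
```

After the swap, requests addressed to `core0` are served by the index that was built under `core1`, so no index files need to be copied by hand.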

RE: Omitting norms question

2010-03-19 Thread Steven A Rowe
Hi blargy, Norms are: - a field-specific multiplicative document scoring factor - the product of three factors: user-settable 1) field boost and 2) document boost (both default to 1.0), along with the 3) field length norm, defined in DefaultSimilarity as 1/sqrt(# terms). - encoded as a positi
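The product described in this reply can be worked through numerically. A small sketch, assuming the three factors are multiplied exactly as listed (field boost, document boost, and a length norm of 1/sqrt(number of terms)); the encoding step mentioned at the end of the reply is not modelled, and the function names are hypothetical.

```python
import math


def length_norm(num_terms):
    """Field length norm as described above: 1 / sqrt(number of terms)."""
    return 1.0 / math.sqrt(num_terms)


def norm(field_boost, doc_boost, num_terms):
    """Unencoded norm: field boost * document boost * length norm."""
    return field_boost * doc_boost * length_norm(num_terms)


if __name__ == "__main__":
    # Default boosts (1.0) on a 4-term field.
    print(norm(1.0, 1.0, 4))
    # A field boost of 2.0 on a 100-term field.
    print(norm(2.0, 1.0, 100))
```

With default boosts the norm depends only on field length, which is why shorter fields score higher for the same match.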

Re: PDFBox/Tika Performance Issues

2010-03-19 Thread Mattmann, Chris A (388J)
Ah, OK. Let me try and stand up a SolrCell instance and perform the same test you are and see if I can duplicate it. Hopefully I can get back to you today on this... Cheers, Chris On 3/19/10 7:43 AM, "Giovanni Fernandez-Kincade" wrote: Yeah I had tested it previously and that works...

Re: SOLR-1316 How To Implement this autosuggest component ???

2010-03-19 Thread Andrzej Bialecki
On 2010-03-19 13:03, stocki wrote: Hello, I am trying to implement the autosuggest component from this link: http://issues.apache.org/jira/browse/SOLR-1316 but I have no idea how to do this. Can anyone give me some tips? Please follow the instructions outlined in the JIRA issue, in the comment

Re: Omitting norms question

2010-03-19 Thread blargy
OK, so if I wanted to add a boost to fields at indexing time then I should include norms. On the other hand, if I just want to boost at query time then it's quite all right to omit norms. Anyone mind explaining what norms are in layman's terms ;) Marc Sturlese wrote: > >>>Should I include not omi

RE: PDFBox/Tika Performance Issues

2010-03-19 Thread Giovanni Fernandez-Kincade
Yeah I had tested it previously and that works... -Original Message- From: Mattmann, Chris A (388J) [mailto:chris.a.mattm...@jpl.nasa.gov] Sent: Friday, March 19, 2010 12:04 AM To: solr-user@lucene.apache.org Subject: Re: PDFBox/Tika Performance Issues Hi Giovanni, Let's try and isolate

Re: Return all Facets?

2010-03-19 Thread homerlex
It's still not clear to me how to use the LukeRequestHandler (from the API) to get a list of all existing facet fields. Is there an example somewhere? Thanks for the help! -- View this message in context: http://old.nabble.com/Return-all-Facets--tp27944999p27950957.html Sent from the Solr - Us

Re: Generating a sitemap

2010-03-19 Thread Jon Baer
It's unfortunately actually a pretty domain specific thing (urls, content, etc), there are also limits @ certain points (see ... but we took CNN.com as a model, for example: http://www.cnn.com/video_sitemap_index.xml http://www.cnn.com/sitemap_videos_0001.xml Then you just line up the big 3 w/

better results and differentiate my products

2010-03-19 Thread stocki
Hello. I am trying to get better results from my search, so I need some help ;) Here is my problem, or rather how I want my search to work: if I search for TOMTOM, my results are something like this. autosuggestion. MERIAN scout Themenguide - Feinschmecker - für Garmin und TomTom { "response":{"numFound":46,"s

Re: How many facet values are too many?

2010-03-19 Thread Smiley, David W.
On Mar 18, 2010, at 10:53 PM, Andy wrote: > My understanding is that too many facet values will decrease performance > > How many is too many? Are there any rules of thumb for this? > > 2 related questions: > > - I expect a facet field to have many values (values are user generated), any > thi

Re: Return all Facets?

2010-03-19 Thread Smiley, David W.
Jeesh, I really need to check Jira more! This is the second time I re-invented a wheel. At least in this case it was semi-trivial. My implementation includes a regexp in which a field must match for inclusion. I suppose an even more thorough implementation would add a regexp to specifically

Re: SOLR-1316 How To Implement this autosuggest component ???

2010-03-19 Thread Erick Erickson
This link might help. Although it talks about getting the data in with DIH, you can skip that part http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

Optional terms in dismax query

2010-03-19 Thread Andrea Campi
Hi all, a customer recently asked me an interesting question. They are a travel website, and they would like to support queries like "Easter on a tropical island beach". They would like this to match anything that would be matched by "on a tropical island beach", but also match e.g. Easter isla

Re: Generating a sitemap

2010-03-19 Thread Erik Hatcher
Jon - Very cool use of VelocityResponseWriter! Would you happen to have a sitemap.vm template to contribute? I realize there'd need to be an external URL configurable, but this would be trivially added as a request parameter and leveraged in the template. Erik p.s. Anyone else

SOLR-1316 How To Implement this autosuggest component ???

2010-03-19 Thread stocki
Hello, I am trying to implement the autosuggest component from this link: http://issues.apache.org/jira/browse/SOLR-1316 but I have no idea how to do this. Can anyone give me some tips? -- View this message in context: http://old.nabble.com/SOLR-1316-How-To-Implement-this-autosuggest-component---

Re: some snynonym clarifications

2010-03-19 Thread Mark Fletcher
Thanks Markus! I got it. BR, Mark. On Fri, Mar 19, 2010 at 5:50 AM, Markus Jelsma wrote: > > On Thursday 18 March 2010 17:47:45 Mark Fletcher wrote: > > Hi, > > > > Thanks for the mail. I had tried the WIKI. > > > > My doubts remaining were mainly:- > > > > 1. > > If we have synonyms specified

Re: Switching cores dynamically

2010-03-19 Thread David Stuart
Using a multicore setup should do the trick; see http://wiki.apache.org/solr/CoreAdmin, specifically the swap option. Cheers David Stuart On 19 Mar 2010, at 10:18, muneeb wrote: Hi, I have indexed almost 7 million articles on two separate cores, each with their own conf/ and data/ folder,

Re: Switching cores dynamically

2010-03-19 Thread Michael Kuhlmann
On 03/19/10 11:18, muneeb wrote: > > Hi, > > I have indexed almost 7 million articles on two separate cores, each with > their own conf/ and data/ folder, i.e. they have their individual index. > > What I normally do is, use core0 for querying and core1 for any updates and > once updates are fin

Switching cores dynamically

2010-03-19 Thread muneeb
Hi, I have indexed almost 7 million articles on two separate cores, each with their own conf/ and data/ folder, i.e. they have their individual index. What I normally do is use core0 for querying and core1 for any updates, and once updates are finished I copy the index of core1 to core0's data f

StreamingUpdateSolrServer being inefficient when adding is not as fast as empying its queue

2010-03-19 Thread Tim Terlegård
StreamingUpdateSolrServer logs "starting runner: ...", sends a POST with ... and I guess also opens a new HTTP connection every time it has managed to empty its queue. In StreamingUpdateSolrServer.java it says this: // info is ok since this should only happen once for each thread log.info(

Re: some snynonym clarifications

2010-03-19 Thread Markus Jelsma
On Thursday 18 March 2010 17:47:45 Mark Fletcher wrote: > Hi, > > Thanks for the mail. I had tried the WIKI. > > My doubts remaining were mainly:- > > 1. > If we have synonyms specified and they replace your search keyword with the > ones specified wouldn't we face a risk of our original ke

Re: good spell dictionary

2010-03-19 Thread Markus Jelsma
/usr/share/dict/american-english On Friday 19 March 2010 10:05:50 michaelnazaruk wrote:

Re: good spell dictionary

2010-03-19 Thread Jeff Zhang
Erick's right, use the terms in the index. On Fri, Mar 19, 2010 at 5:05 PM, michaelnazaruk wrote: > > Help me, please! Where can I buy a good spelling dictionary? > > -- > View this message in context: > http://old.nabble.com/good-spell-dictionary-tp27950854p27950921.html > Sent from the Solr - Use

Re: good spell dictionary

2010-03-19 Thread michaelnazaruk
Help me, please! Where can I buy a good spelling dictionary? -- View this message in context: http://old.nabble.com/good-spell-dictionary-tp27950854p27950921.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Omitting norms question

2010-03-19 Thread Marc Sturlese
>>Should I include not omit-norms on any fields that I would like to boost via a boost-query/function >>query? You don't have to set norms to use boost queries or functions. You only have to set them when you want to boost docs or fields at indexing time. >>What about sortable fields? Facetable field