Re: Highlighting does not work?

2009-01-29 Thread Jarek Zgoda
Added an appropriate amendment to the FAQ, but I'd consider reorganizing the information in the whole wiki, for example by creating a section titled Common Tasks. A bit of redundancy does not hurt when it comes to documentation. Message written on 2009-01-28, at 20:01, by Mike Klaas: Well, both

Re: newbie question --- multiple schemas

2009-01-29 Thread Noble Paul നോബിള്‍ नोब्ळ्
have two different cores and you can have separate schema for each. On Thu, Jan 29, 2009 at 1:20 PM, Cheng Zhang zhangyongji...@yahoo.com wrote: Hello, Is it possible to define more than one schema? I'm reading the example schema.xml. It seems that we can only define one schema? What about

Registration for ApacheCon Europe 2009 is now open!

2009-01-29 Thread Erik Hatcher
Cross-posting this announcement. There are several relevant Lucene/ Solr talks including: Trainings - Lucene Boot Camp (Grant Ingersoll) - Solr Boot Camp (Erik Hatcher) Sessions - Introducing Apache Mahout (Grant) - Lucene Case Studies (Erik) - Advanced Indexing Techniques with

Re: query with stemming, prefix and fuzzy?

2009-01-29 Thread Gert Brinkmann
Gert Brinkmann wrote: A) fuzzy search What can I do to speed up the fuzzy query? Setting ramBufferSizeMB to a higher value seems to speed up the query slightly. I have to continue with tuning though. B) combine stemming, prefix and fuzzy search Is there a way to combine all these three

Re: WebLogic 10 Compatibility Issue - StackOverflowError

2009-01-29 Thread Mark Miller
We should get this on the wiki. - Mark Ilan Rabinovitch wrote: We were able to deploy Solr 1.3 on Weblogic 10.0 earlier today. Doing so required two changes: 1) Creating a weblogic.xml file in solr.war's WEB-INF directory. The weblogic.xml file is required to disable Solr's filter on

Re: WebLogic 10 Compatibility Issue - StackOverflowError

2009-01-29 Thread Alexander Ramos Jardim
Ilan, I had the same problem some months ago and had to remove the quoted line on jsp. But I never got the other problem you said with 1.3 in Weblogic. 2009/1/29 Ilan Rabinovitch i...@fonz.net We were able to deploy Solr 1.3 on Weblogic 10.0 earlier today. Doing so required two changes:

RE: DIH handling of missing files

2009-01-29 Thread Nathan Adams
I'm running the example from the DIH wiki page: http://wiki.apache.org/solr-data/attachments/DataImportHandler/attachments/example-solr-home.jar -Nathan From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.p...@gmail.com] Sent: Wed 01/28/2009 11:32 PM To:

RE: DIH handling of missing files

2009-01-29 Thread Nathan Adams
Which appears to be v1.3, which explains the problem. Thanks! From: Nathan Adams [mailto:na...@umich.edu] Sent: Thu 01/29/2009 8:28 AM To: solr-user@lucene.apache.org Subject: RE: DIH handling of missing files I'm running the example from the DIH wiki page:

fuzzy search and uppercased word. finds moo~ not Moo~

2009-01-29 Thread Julian Davchev
Hi, I am doing fuzzy search, and it works correctly. For some reason, though, it has problems with uppercase words. E.g. if I search moo~ I get results, but if I search Moo~ I don't. I see in the analyzer that LowerCaseFilterFactory is being applied, but I guess with fuzzy it's getting messy. Does anyone have a clue?
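
One possible explanation, not confirmed anywhere in this thread listing, is that the query parser does not run fuzzy terms through the field's analyzer, so the lowercasing applied at index time never reaches Moo~. Under that assumption, a client-side workaround is to lowercase fuzzy terms before sending the query. The helper name below is hypothetical and the sketch only handles simple single-word terms:

```python
import re

# Matches a bare term followed by a fuzzy operator, e.g. "Moo~" or "Moo~0.7".
FUZZY_TERM = re.compile(r"(\w+)(~\d*\.?\d*)")


def lowercase_fuzzy_terms(query: str) -> str:
    """Lowercase only the terms that carry a fuzzy (~) operator.

    Hypothetical helper: assumes the index was built with a lowercasing
    filter and that the server does not analyze fuzzy terms itself.
    """
    return FUZZY_TERM.sub(lambda m: m.group(1).lower() + m.group(2), query)


if __name__ == "__main__":
    print(lowercase_fuzzy_terms("Moo~"))
    print(lowercase_fuzzy_terms("title:Moo~0.7 AND Cow"))
```

Terms without the ~ operator are left untouched, since those would normally still go through the analyzer on the server side.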

Data Directory Sync.

2009-01-29 Thread Kalidoss MM
Hi, I have the following requirement: there is a running Solr with around 10K records indexed in it. Now I have to index another set of 30K records. The 10K records are already live, and I don't have the option of inserting the 30K records into the live instance. Is there any way to run the solr

Re: multilanguage + howto search in all languages?

2009-01-29 Thread Julian Davchev
Thank you both for the points. For now I am handling it with fuzzy search. Let's hope this will do for some time :) Walter Underwood wrote: I've done this. There are five cases for the tokens in the search index: 1. Tokens that are unique after stemming (this is good). 2. Tokens that are common

Re: Data Directory Sync.

2009-01-29 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Thu, Jan 29, 2009 at 7:27 PM, Kalidoss MM mm.kalid...@gmail.com wrote: Hi, I have the following requirement: there is a running Solr with around 10K records indexed in it. Now I have to index another set of 30K records. The 10K records are already live, and I don't have an option to

MASTER / SLAVES numdoc

2009-01-29 Thread sunnyfr
Hi, I have one server and several slaves. If I go to host.name/solr/admin/stat.jsp, is there a way to see the difference in numDocs per server? Thanks a lot -- View this message in context: http://www.nabble.com/MASTER---SLAVES-numdoc-tp21730748p21730748.html

warmupTime : 0

2009-01-29 Thread sunnyfr
Hi, Do you think it's normal to have warmupTime : 0 ?? searcher class: org.apache.solr.search.SolrIndexSearcher version:1.0 description:index searcher stats: searcherName : searc...@6f7cf6b6 main caching : true numDocs : 8207035 maxDoc : 8239991 readerImpl :

Solr Gaze and Multicore?

2009-01-29 Thread Jacob Singh
Sorry if this is the wrong place to ask since Solr Gaze is Lucid's project, but I was trying to install this in a multicore environment, and it doesn't seem to be working. It says to add the plugin to solr.home/lib. Which solr.home? I got to /gaze and of course, it doesn't know where to look.

ranged query on multivalued field doesnt seem to work

2009-01-29 Thread zqzuk
Hi all, in my schema I have two multivalued fields as <field name="start_year" type="sfloat" indexed="true" stored="true" multiValued="true"/> <field name="end_year" type="sfloat" indexed="true" stored="true" multiValued="true"/> and I issued a query as: start_year:[400 TO *], the result seems to be incorrect because

Re: query with stemming, prefix and fuzzy?

2009-01-29 Thread Mark Miller
Truncation queries and stemming are difficult partners. You likely have to accept compromise. You can try using multiple fields like you are, you can try indexing the full term at the same position as the stemmed term, or you can accept the weirdness that comes from matching on a stemmed form

Re: I get SEVERE: Lock obtain timed out

2009-01-29 Thread Jon Drukman
Julian Davchev wrote: Hi, Are there any documents I can read on how locks work, how I can control them, when they occur, etc.? The only way I got out of this mess was restarting Tomcat. SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SingleInstanceLock:

Re: Solr Gaze and Multicore?

2009-01-29 Thread Mark Miller
Jacob Singh wrote: Sorry if this is the wrong place to ask since Solr Gaze is Lucid's project, but I was trying to install this in a multicore environment, and it doesn't seem to be working. It says to add the plugin to solr.home/lib. Which solr.home? I got to /gaze and of course, it doesn't know

Question about rating documents

2009-01-29 Thread Reece
Currently I'm using SOLR 1.2 to index a few million documents. I've been asked to provide a way for users to rate the documents, so that something rated higher would show up higher in search results and vice versa. I've been thinking about it, but can't come up with a good way to do this

Re: permanently setting log level?

2009-01-29 Thread Vannia Rajan
On Thu, Jan 29, 2009 at 11:55 PM, Jon Drukman jdruk...@gmail.com wrote: if i go to /solr/admin/logging, i can set the root log level to WARNING, which is what i want. however, every time solr restarts, it is set back to INFO. Is there a way to get the WARNING level to stick permanently?

Re: Question about rating documents

2009-01-29 Thread Matthew Runo
You could use a boost function to gently boost up items which were marked as more popular. You would send the function query in the bf parameter with your query, and you can find out more about syntax here: http://wiki.apache.org/solr/FunctionQuery Thanks for your time! Matthew Runo
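
As a rough illustration of the bf suggestion above, the sketch below builds a request URL that passes a function query in the bf parameter. The field name rating, the handler name dismax, and the host are assumptions made for the example, not details from the thread; the simplest function query is just the name of an indexed numeric field.

```python
from urllib.parse import urlencode


def build_boosted_query(base_url, user_query, rating_field="rating"):
    """Return a select URL that adds a boost function to a dismax query.

    Assumptions (not from the thread): a dismax handler is reachable via
    qt=dismax, and `rating_field` is an indexed numeric field that holds
    each document's current rating.
    """
    params = {
        "q": user_query,
        "qt": "dismax",
        "bf": rating_field,  # function query: here, simply the field's value
    }
    return base_url + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(build_boosted_query("http://localhost:8983/solr", "ipod"))
```

Because the boost is read from a field at query time, the boost value does not have to be known in advance, but the rating field itself still has to be updated in the index whenever a rating changes.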

Re: How to handle database replication delay when using DataImportHandler?

2009-01-29 Thread Gregg Donovan
Noble, Thanks for the suggestion. The unfortunate thing is that we really don't know ahead of time what sort of replication delay we're going to encounter -- it could be one millisecond or it could be one hour. So, we end up needing to do something like: For delta-import run N: 1. query DB slave

Re: I get SEVERE: Lock obtain timed out

2009-01-29 Thread Yonik Seeley
On Thu, Jan 29, 2009 at 1:16 PM, Jon Drukman jdruk...@gmail.com wrote: Julian, have you had any luck figuring this out? My production instance just started having this problem. It seems to crop up after solr's been running for several hours. Our usage is very light (maybe one query every

Re: Solr Gaze and Multicore?

2009-01-29 Thread Jacob Singh
Hi Mark, Thanks, I've got it working now. Still waiting for the stats to update... This is really cool! I've also been working pretty hard at an automated benchmark suite using jmeter, rightscale and amazon web services. Next time I'm in Boston (March I think), it would be great to show you.

Re: permanently setting log level?

2009-01-29 Thread Jon Drukman
Vannia Rajan wrote: On Thu, Jan 29, 2009 at 11:55 PM, Jon Drukman jdruk...@gmail.com wrote: if i go to /solr/admin/logging, i can set the root log level to WARNING, which is what i want. however, every time solr restarts, it is set back to INFO. Is there a way to get the WARNING level to

Re: Question about rating documents

2009-01-29 Thread Reece
Hmm, I already boost certain fields, but from what I know about it you would need to know the boost value ahead of time, which is not possible as it would be a different boost for each document depending on how it was rated. I did think of one thing though. If I had a field that had a value of

Re: Optimizing Improving results based on user feedback

2009-01-29 Thread Walter Underwood
Thanks, I didn't know there was so much research in this area. Most of the papers at those workshops are about tuning the entire ranking algorithm with machine learning techniques. I am interested in adding one more feature, click data, to an existing ranking algorithm. In my case, I have enough

Re: Question about rating documents

2009-01-29 Thread Erick Erickson
This may not be practical, as it would involve re-indexing all your documents periodically, but here goes anyway... You could think about *index-time* boosts. Somewhere you keep a record of the recommendations, then re-index your corpus adding some suitable boost to each field in your document

Re: Solr 1.3 and spellcheck.onlyMorePopular=true

2009-01-29 Thread Mark Miller
I am not super familiar with the lucene/solr spell checking implementations, but here is my take: By saying to only allow more popular, you are restricting suggestions to only those that have a higher instance frequency in the index. The score is still by edit distance, but only terms with a
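
The behaviour described above can be pictured with a toy model. This is not the Lucene/Solr spell checker implementation; the function names and the sample frequencies are made up for illustration. Candidates are ranked by edit distance, and the only_more_popular flag simply drops any candidate that is not more frequent in the index than the queried word.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def suggest(word, term_freqs, only_more_popular=False):
    """Toy suggester: rank index terms by edit distance to `word`.

    With only_more_popular=True, terms that are not strictly more
    frequent than `word` itself are filtered out before ranking.
    """
    floor = term_freqs.get(word, 0)
    candidates = [
        term for term, freq in term_freqs.items()
        if term != word and (not only_more_popular or freq > floor)
    ]
    return sorted(candidates, key=lambda t: (edit_distance(word, t), t))


if __name__ == "__main__":
    freqs = {"solr": 50, "sole": 5, "solar": 80, "polar": 90}  # made-up counts
    print(suggest("solr", freqs))
    print(suggest("solr", freqs, only_more_popular=True))
```

In this model the flag never changes the ordering, only which candidates are allowed to appear, which is why a correctly spelled but rare word can lose its closest neighbours.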

Re: Question about rating documents

2009-01-29 Thread Reece
Re-indexing so much would be a pretty big pain. I do have a unique ID for each document though that I use for updating them every day as they change. -Reece On Thu, Jan 29, 2009 at 2:40 PM, Erick Erickson erickerick...@gmail.com wrote: This may not be practical, as it would involve

RE: Solr 1.3 and spellcheck.onlyMorePopular=true

2009-01-29 Thread Nicholas Piasecki
Thanks for this lucid explanation. Indeed, turning the option off seems to give more intelligent results. I think that this was more of an example of me seeing onlyMorePopular and thinking hmm, that must be good! without fully understanding the consequences of the setting. The key point in your

Re: permanently setting log level?

2009-01-29 Thread Vannia Rajan
i'm not using tomcat, i'm using the default jetty setup that comes with solr. i grepped through the entire solr installation for 'INFO' but i don't see it. i don't really know anything about jetty other than i have to run java -jar start.jar to get it to run solr. If you are not using

got background_merge_hit_exception during optimization

2009-01-29 Thread Qingdi
We got the following background_merge_hit_exception during optimization: exception:

Re: warmupTime : 0

2009-01-29 Thread Yonik Seeley
On Thu, Jan 29, 2009 at 12:12 PM, sunnyfr johanna...@gmail.com wrote: Do you think it's normal to have warmupTime : 0 ?? Sure, if the caches were empty or almost empty (say on startup). -Yonik

RE: warmupTime : 0

2009-01-29 Thread Feak, Todd
This usually represents anything less than 8 ms if you are on a Windows system. The granularity of timing on Windows systems is around 16 ms. -Todd Feak -Original Message- From: sunnyfr [mailto:johanna...@gmail.com] Sent: Thursday, January 29, 2009 9:13 AM To: solr-user@lucene.apache.org
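
To see why a coarse clock can report 0, here is a small simulation. It assumes a clock that only advances in fixed steps (16 ms here, following the figure quoted above); the function name is hypothetical and nothing in it calls a real system timer.

```python
def measured_ms(start_ms, end_ms, tick_ms=16):
    """Elapsed time as seen by a clock that only advances every tick_ms.

    Both timestamps are rounded down to the last tick before subtracting,
    which is what reading a coarse system clock twice amounts to.
    """
    return (end_ms // tick_ms) * tick_ms - (start_ms // tick_ms) * tick_ms


if __name__ == "__main__":
    print(measured_ms(3, 10))   # both readings fall in the same tick
    print(measured_ms(10, 20))  # the readings straddle a tick boundary
```

A short warmup that starts and ends inside the same tick is reported as 0, while one of similar length that happens to cross a tick boundary is reported as a full tick.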

Re: got background_merge_hit_exception during optimization

2009-01-29 Thread Otis Gospodnetic
Hi, I didn't look into this deeply, but you didn't say which version of Solr you are using (looks like it might be 1.3). If using a nightly build is an option, you might try that instead - Yonik updated the Lucene jars recently and that might be enough to solve this problem. Otis --

Re: Solr Gaze and Multicore?

2009-01-29 Thread Mark Miller
Jacob Singh wrote: This is really cool! I've also been working pretty hard at an automated benchmark suite using jmeter, rightscale and amazon web services. Next time I'm in Boston (March I think), it would be great to show you. That sounds excellent! One problem with Solr's efficiency is

Re: Highlighting does not work?

2009-01-29 Thread Mike Klaas
Thanks, Jarek. -Mike On 29-Jan-09, at 12:20 AM, Jarek Zgoda wrote: Added an appropriate amendment to the FAQ, but I'd consider reorganizing the information in the whole wiki, for example by creating a section titled Common Tasks. A bit of redundancy does not hurt when it comes to documentation. Message

Re: Optimizing Improving results based on user feedback

2009-01-29 Thread Walter Underwood
A Decision Theoretic Framework for Ranking using Implicit Feedback uses clicks, but the best part of that paper is all the side comments about difficulties in evaluation. For example, if someone clicks on three results, is that three times as good or two failures and a success? We have to know the

Re: How to handle database replication delay when using DataImportHandler?

2009-01-29 Thread Noble Paul നോബിള്‍ नोब्ळ्
Yeah that is an option. On Fri, Jan 30, 2009 at 12:27 AM, Gregg Donovan gregg...@gmail.com wrote: Noble, Thanks for the suggestion. The unfortunate thing is that we really don't know ahead of time what sort of replication delay we're going to encounter -- it could be one millisecond or it