IndexSearcher and Caches

2010-05-21 Thread Rahul R
Hello all, I have a few questions w.r.t the caches and the IndexSearcher available in solr. I am using solr 1.3. - The solr wiki states that the caches are per IndexSearcher object i.e if I set my filterCache size to 1000 it means that 1000 entries can be assigned for every IndexSearcher object.

Re: How real-time are Solr/Lucene queries?

2010-05-21 Thread Ben Eliott
You may wish to look at Lucandra: http://github.com/tjake/Lucandra On 21 May 2010, at 06:12, Walter Underwood wrote: Solr is a very good engine, but it is not real-time. You can turn off the caches and reduce the delays, but it is fundamentally not real-time. I work at MarkLogic, and we

Re: How real-time are Solr/Lucene queries?

2010-05-21 Thread Ben Eliott
Further to earlier note re Lucandra. I note that Cassandra, which Lucandra backs onto, is 'eventually consistent', so given your real-time requirements, you may want to review this in the first instance, if Lucandra is of interest. On 21 May 2010, at 06:12, Walter Underwood wrote:

Re: How real-time are Solr/Lucene queries?

2010-05-21 Thread Thomas J. Buhr
Thanks for the new information. It's really great to see so many options for Lucene. In my scenario there are the following pieces: 1 - A local Java client with an embedded Solr instance and its own local index/s. 2 - A remote server running Solr with index/s that are more like a repository

Re: IndexSearcher and Caches

2010-05-21 Thread MitchK
Rahul, the IndexSearcher of Solr is shared by every request between two commits. That means one IndexSearcher and its caches have a lifetime of one commit. After every commit, a new one is created. Having the caches does not mean that they are applied automatically. They mean that a filter

Re: Solr 1.4 Enterprise Search Server book examples

2010-05-21 Thread Stefan Moises
Hi, everybody who owns the book can now download the source code examples again, the zip file is fixed now - just got a message from Packt! :) https://www.packtpub.com/support?nid=4191 Have fun :) Cheers, Stefan On 06.05.2010 16:15, Antonello Mangone wrote: I had the same problem and I

Re: Solr 1.4 Enterprise Search Server book examples

2010-05-21 Thread Johan Cwiklinski
Hello, On 21/05/2010 13:29, Stefan Moises wrote: Hi, everybody who owns the book can now download the source code examples again, the zip file is fixed now - just got a message from Packt! :) https://www.packtpub.com/support?nid=4191 Have fun :) Cheers, Stefan I've received the

Wildcard queries

2010-05-21 Thread Sascha Szott
Hi folks, what's the idea behind the fact that no text analysis (e.g. lowercasing) is performed on wildcarded search terms? In my context this behaviour seems to be counter-intuitive (I guess that's the case in the majority of applications) and my application needs to lowercase any input

Re: Wildcard queries

2010-05-21 Thread Robert Muir
we can use stemming as an example: let's say your query is c?ns?st?nt?y how will this match consistently, which the porter stemmer transforms to 'consistent'. furthermore, note that i replaced the vowels with ?'s here. The porter stemmer doesn't just rip stuff off the end, but attempts to guess

Re: Wildcard queries

2010-05-21 Thread Smiley, David W.
I absolutely consider this a bug too. Cast your vote: https://issues.apache.org/jira/browse/SOLR-219 ~ David On May 21, 2010, at 10:11 AM, Sascha Szott wrote: Hi folks, what's the idea behind the fact that no text analysis (e.g. lowercasing) is performed on wildcarded search terms? In

Re: Wildcard queries

2010-05-21 Thread Sascha Szott
Hi Robert, thanks, you're absolutely right. I should better refine my initial question to: What's the idea behind the fact that no *lowercasing* is performed on wildcarded search terms if the field in question contains a LowercaseFilter in its associated field type definition? -Sascha

Re: Wildcard queries

2010-05-21 Thread Robert Muir
this lowercasing can 'sort of work' (depending on your analysis, and even language, not all case folding is as simple as english). But the more general problem cannot be a bug, as it's mathematically not possible to do with queries like wildcard that allow an infinite language, and non-reversible

Re: Wildcard queries

2010-05-21 Thread Robert Muir
I honestly do not know the rationale behind this in Solr, except to say similar problems exist even if you reduce the scope to just casing: For example, if you are using a German stemmer, it will case-fold ß to 'ss' (such that it will match SS). So doing some lowercasing at query-time will not
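To make the casing point above concrete, here is a minimal Java sketch. It is not Solr's query parser, and `CaseFoldingDemo` and `lowercaseTerm` are hypothetical names introduced only for illustration. It assumes the client lowercases a wildcard term itself before sending it, and shows why plain lowercasing is not the same as case folding: `ß` uppercases to `SS`, but `SS` lowercases to `ss`, not back to `ß`.

```java
import java.util.Locale;

// Hypothetical demo class; nothing here is part of Solr or Lucene.
final class CaseFoldingDemo {
    private CaseFoldingDemo() {}

    // Assumed client-side workaround: lowercase the raw wildcard term.
    // Wildcard characters such as '*' and '?' are unaffected by lowercasing.
    static String lowercaseTerm(String wildcardTerm) {
        return wildcardTerm.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Simple English case: lowercasing is enough.
        System.out.println(lowercaseTerm("Consist*"));          // consist*

        // German sharp s (U+00DF) has no single-character uppercase form here.
        System.out.println("\u00DF".toUpperCase(Locale.ROOT));   // SS

        // Lowercasing does not reverse that mapping, so this term would not
        // match an index term that still contains the sharp s.
        System.out.println(lowercaseTerm("STRASSE*"));           // strasse*
    }
}
```

Whether lowercasing at query time is sufficient therefore depends on what the index-time analysis chain did to the terms.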

Re: Wildcard queries

2010-05-21 Thread Smiley, David W.
On May 21, 2010, at 10:35 AM, Robert Muir wrote: I honestly do not know the rationale behind this in Solr, except to say similar problems exist even if you reduce the scope to just casing: Then why are you talking about stemming in the following example? We know stemming is problematic with

Re: Wildcard queries

2010-05-21 Thread Robert Muir
On Fri, May 21, 2010 at 10:40 AM, Smiley, David W. dsmi...@mitre.org wrote: Then why are you talking about stemming in the following example?  We know stemming is problematic with wildcard searching.  But casing... I argue not. I just mentioned an example stemmer that properly case-folds

Re: How real-time are Solr/Lucene queries?

2010-05-21 Thread Dennis Gearon
Did your successor choose Solr? I seem to have read an article or seen a 'mobcast' where the Search Engine Guy (SEG) @ Netflix used Solr. (Or, maybe it was another video chain) Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat,

Re: How real-time are Solr/Lucene queries?

2010-05-21 Thread Walter Underwood
I chose it, and it doesn't look like they've replaced it in the eight months since I left. At the time, I was the entire search engineering department, so it was me. wunder On May 21, 2010, at 8:49 AM, Dennis Gearon wrote: Did your successor choose Solr? I seem to have read an article or

Re: Personalized Search

2010-05-21 Thread Rih
It will likely be what you suggested, one or two multi value fields. But with 10,000+ members, does Solr scale with this schema? On Thu, May 20, 2010 at 6:27 PM, findbestopensource findbestopensou...@gmail.com wrote: Hi Rih, You going to include either of the two field bought or like to

Re: Personalized Search

2010-05-21 Thread Rih
Well, it's not really a recommendation engine per se but more of a filter for the user. Say, I already own some stuff from the result set, I just want to exclude them from the results. What I'm concerned with is reindexing the document every time someone marks/votes/likes/buys. On Thu, May 20,

Re: Personalized Search

2010-05-21 Thread Geert-Jan Brits
Just want to throw this in: If you're worried about scaling, etc. you could take a look at item-based collaborative filtering instead of user based. i.e: DO NIGHTLY/ BATCH: - calculate the similarity between items based on their properties DO ON EACH REQUEST - have a user store/update its

Re: Which Solr to use?

2010-05-21 Thread Jim Blomo
On Tue, May 18, 2010 at 12:31 PM, Sixten Otto six...@sfko.com wrote: So features are being actively added to / code rearranged in trunk/4.0, with some of the work being back-ported to this branch to form a stable 3.1 release? Is that accurate? Is there any thinking about when that might drop

Re: Which Solr to use?

2010-05-21 Thread Chris Hostetter
: Is there any thinking about when that might drop (beyond the quite : understandable when it's done)? Or, perhaps more reasonably, when it : might freeze? FWIW: I have no idea ... it's all a question of when someone takes charge on the release process -- quite frankly, so much is in flux

Re: Special Circumstances for embedded Solr

2010-05-21 Thread Ryan McKinley
Any other commonly compelling reasons to use SolrJ? The most compelling reason (I think) is that if you program against the Solrj API, you can switch between embedded/http/streaming implementations without changing anything. This is great for our app that is either run as a small local

Re: Moving from Lucene to Solr?

2010-05-21 Thread Ryan McKinley
On Wed, May 19, 2010 at 6:38 AM, Peter Karich peat...@yahoo.de wrote: Hi all, while asking a question on stackoverflow [1] some other questions appear: Is SolrJ a recommended way to access Solr or should I prefer the HTTP interface? solrj vs HTTP interface? That will just be a matter of

DataImportHandler and running out of disk space

2010-05-21 Thread wojtekpia
I'm noticing some data differences between my database and Solr. About a week ago my Solr server ran out of disk space, so now I'm observing how the DataImportHandler behaves when Solr runs out of disk space. In a word, I'd say it behaves badly! It looks like out-of-disk-space exceptions are

Re: DIH post import event listener for errors

2010-05-21 Thread Robert Zotter
I have a similar need so I've opened up a ticket: http://issues.apache.org/jira/browse/SOLR-1922 Should be pretty trivial to add. -- View this message in context: http://lucene.472066.n3.nabble.com/DIH-post-import-event-listener-for-errors-tp834645p835132.html Sent from the Solr - User

Re: Personalized Search

2010-05-21 Thread dc tech
In our specific case, we would get the user's folders and then do a function query that provides a boost if the document.folder is in {my folder list}. Another approach that will work for our intranet use is to add the userids in a multi-valued field as others have suggested. On 5/20/10,

Re: Tipps for develop a own RequestHandler ?!

2010-05-21 Thread Chris Hostetter
: I would write an own RH for my system. is an howto in the www ? i didnt : found anythin about it. I would start by looking at how existing RequestHandlers are implemented -- the ones that ship with Solr are heavily refactored to reuse a lot of functionality, which can sometimes make it hard

Re: Statistics exposed as JSON

2010-05-21 Thread Chris Hostetter
: Are the Solr 1.4 statistics like #docs, #docsPending etc. exposed in : JSON format? if you are referring to the output from stats.jsp, then no -- that is not available in JSON format in Solr 1.4. In future versions of Solr a new RequestHandler will replace stats.jsp (and registry.jsp) making

No hits returned from shard search on multi-core setup

2010-05-21 Thread TonyBray
I cannot get hits back and do not get a correct total number of records when using shard searching. I have 2 cores, core0 and core1. Both have the same schema.xml and solrconfig.xml (different datadirs in solrconfig.xml). Our id field contains globally unique id's across both cores, but they use

RE: Non-English query via Solr Example Admin corrupts text

2010-05-21 Thread Chris Hostetter
: I wanted to improve the documentation in the solr wiki by adding in my : findings. However, when I try to log in and create a new account, I : receive this error message: : : You are not allowed to do newaccount on this page. Login and try again. : : Does anyone know how I can get permission

Re: Personalized Search

2010-05-21 Thread dc tech
Excluding favorited items is an easier problem - get the results - get exclude list from db - scan results and exclude the items in the item list You'd have to do some code to manage 'holes' in the result list ie fetch more etc. You could marry this with the solr batch based approach to reduce
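The steps above can be sketched roughly as follows in Java. This is only an illustration under stated assumptions: `ExcludeFilter` and `PageSource` are hypothetical names, `PageSource` stands in for whatever actually issues the Solr query (for example with `start` and `rows` parameters), and the exclude set is assumed to have been loaded from the db already. The loop handles the 'holes' by fetching further batches until the page is full or the results run out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper; not part of Solr or SolrJ.
final class ExcludeFilter {
    private ExcludeFilter() {}

    // Stand-in for the code that runs the search and returns document ids
    // for the window [start, start + rows). Returns an empty list when done.
    interface PageSource {
        List<String> fetch(int start, int rows);
    }

    // Builds the first page of results with excluded ids removed,
    // fetching more batches to fill the gaps left by exclusions.
    static List<String> firstPage(PageSource source, Set<String> excluded, int pageSize) {
        List<String> page = new ArrayList<>();
        int start = 0;
        while (page.size() < pageSize) {
            List<String> batch = source.fetch(start, pageSize);
            if (batch.isEmpty()) {
                break; // no more results to fill the page with
            }
            for (String id : batch) {
                if (!excluded.contains(id) && page.size() < pageSize) {
                    page.add(id);
                }
            }
            start += batch.size();
        }
        return page;
    }
}
```

The trade-off is extra round trips when many hits are excluded; a larger batch size than the page size would reduce those at the cost of fetching more per request.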

Re: TikaEntityProcessor on Solr 1.4?

2010-05-21 Thread Sixten Otto
2010/5/19 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@gmail.com: I guess it should work because Tika Entityprocessor does not use any new 1.4 APIs On Wed, May 19, 2010 at 1:17 AM, Sixten Otto six...@sfko.com wrote: The TikaEntityProcessor class that enables DataImportHandler to process business

RE: Non-English query via Solr Example Admin corrupts text

2010-05-21 Thread Chris Hostetter
This should be fixed now -- please update the Jira issue if you have any other problems creating an account. : Hmmm... yes, there definitely seems to be a problem with creating new wiki : accounts on wiki.apache.org -- i've opened an issue with INFRA... : :

RE: seemingly impossible query

2010-05-21 Thread Nagelberg, Kallin
I just realized something that may make the fieldcollapsing strategy insufficient. My 'ids' field is multi-valued. From what I've read you cannot field collapse on a multi-valued field. Any other ideas? Thanks, -Kallin Nagelberg -Original Message- From: Geert-Jan Brits

Full Import failed

2010-05-21 Thread Mohamed Parvez
I am getting this error, any hint as where i should look SEVERE: Full Import failed org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NoSuchMethodError: isEmpty at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:424) at

Re: Full Import failed

2010-05-21 Thread Paul Libbrecht
Last I encountered that exception was with the usage of String.isEmpty which is new in Java 1.6. Can it be you've been running Java 1.5? paul On 21 May 2010, at 22:44, Mohamed Parvez wrote: SEVERE: Full Import failed org.apache.solr.handler.dataimport.DataImportHandlerException:

Re: TikaEntityProcessor on Solr 1.4?

2010-05-21 Thread Chris Harris
You are right that TikaEntityProcessor has a couple of other prereqs beyond stock Solr 1.4. I think the main point is that they're relatively minor. I've merged TikaEntityProcessor (and some prereqs) and its dependencies into my Solr 1.4 tree and it compiles fine, though I haven't yet tested that

Re: DataImportHandler and running out of disk space

2010-05-21 Thread wojtekpia
I ran through some more failure scenarios (scenarios and results below). The concerning ones in my deployment are when data does not get updated, but the DIH's .properties file does. I could only simulate that scenario when I ran out of disk space (all disk space issues behaved consistently).

Re: Full Import failed

2010-05-21 Thread Mohamed Parvez
Yes, I am running Java 1.5. Any idea how we can run Solr 1.4 using Java 1.5? --- Thanks/Regards, Parvez On Fri, May 21, 2010 at 4:17 PM, Paul Libbrecht p...@activemath.org wrote: Last I encountered that exception was with the usage of String.isEmpty which is a 1.6 novelty. Can it be you've been

Re: TikaEntityProcessor on Solr 1.4?

2010-05-21 Thread Sixten Otto
On Fri, May 21, 2010 at 5:30 PM, Chris Harris rygu...@gmail.com wrote: Actually, rather than cherry-pick just the changes from SOLR-1358 and SOLR-1583 what I did was to merge in all DataImportHandler-related changes from between the 1.4 release up through Solr trunk r890679 (inclusive). I'm

Re: Full Import failed

2010-05-21 Thread Paul Libbrecht
Fixing that precise line is very easy, and recompiling is easy as well. But I am absolutely not sure this will be the only occurrence of a 1.6 dependency. paul On 21 May 2010, at 23:40, Mohamed Parvez wrote: yes i am running 1.5, Any idea how we can run Solr 1.4 using Java 1.5 ---
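As an illustration of how small that one-line fix can be, here is a minimal Java sketch of an emptiness check that avoids `String.isEmpty()` and so also compiles and runs on Java 1.5. `Strings15` is a hypothetical helper name, not something in the Solr source, and the sketch assumes the failing call is a plain emptiness check on a non-null String, as the `NoSuchMethodError: isEmpty` in the stack trace suggests.

```java
// Hypothetical helper; the name Strings15 is illustrative and not part of Solr.
final class Strings15 {
    private Strings15() {}

    // Same result as s.isEmpty() for a non-null String, but uses only
    // String.length(), which exists in every Java version.
    static boolean isEmpty(String s) {
        return s.length() == 0;
    }

    public static void main(String[] args) {
        System.out.println(isEmpty(""));      // true
        System.out.println(isEmpty("solr"));  // false
    }
}
```

As noted above, this only addresses the one call site; other Java 1.6 dependencies may still exist elsewhere in the code.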

field collapsing on multi-valued field

2010-05-21 Thread Nagelberg, Kallin
As I understand from looking at https://issues.apache.org/jira/login.jsp?os_destination=/browse/SOLR-236 field collapsing has been disabled on multi-valued fields. Is this really necessary? Let's say I have a multi-valued field, 'my-mv-field'. I have a query like (my-mv-field:1 OR

RE: Solr 1.4 Enterprise Search Server book examples

2010-05-21 Thread Robert Risley
I downloaded the examples and unzipped into C:\Examples C:\Examples\3 C:\Examples\7 C:\Examples\8 C:\Examples\9 C:\Examples\cores C:\Examples\solr Starting in the C:\Examples\solr folder run command 'java -jar start.jar' and it starts ok, but all the URI's return 404. I can get Solr running with

SolrJ/EmbeddedSolrServer

2010-05-21 Thread Ken Krugler
I've got a situation where my data directory (a) needs to live elsewhere besides inside of Solr home, (b) moves to a different location when updating indexes, and (c) setting up a symlink from solr_home/data isn't a great option. So what's the best approach to making this work with SolrJ?

Re: DIH post import event listener for errors

2010-05-21 Thread Robert Zotter
Added a patch on the latest trunk: http://issues.apache.org/jira/browse/SOLR-1922

Re: DIH post import event listener for errors

2010-05-21 Thread David Smiley (@MITRE.org)
I'd consider using the logging framework. I do this with Log4j in other apps. It's a generic approach that works for just about any system. ~ David Smiley - Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book -- View this message in context:

RE: Solr 1.4 Enterprise Search Server book examples

2010-05-21 Thread David Smiley (@MITRE.org)
Hello Rob, Thank you for buying the book. I'm the lead author. There is a README.txt file in the root of the zip which includes a rather full invocation of java to kick off Solr that is to be used for the example data. The options as part of the invocation should elucidate what's going on.