Replication in 1.4: Replicate Now works, but scheduled rep. does not

2010-08-30 Thread Leanid
Hello, I am upgrading from 1.3 to 1.4 and setting up the new replication method. On the master I added this section: <requestHandler name="/replication" class="solr.ReplicationHandler"> <lst name="master"> <str name="replicateAfter">commit</str> <str name="replicateAfter">startup</str> <str

Re: Is there any stress test tool for testing Solr?

2010-08-30 Thread 朱炎詹
Thanks to both Gora and Amit. A little information for people following this discussion: I found there's a SolrMeter open source project on Google Code - http://code.google.com/p/solrmeter/ - which is specifically for load testing Solr. I'll evaluate the following tools and pick one for my testing:

Re: Custom filter implementation, advice needed

2010-08-30 Thread Ingo Renner
On 26.08.2010 at 21:07, Ingo Renner wrote: Hi again, I implemented a custom filter and am using it through a QParserPlugin. I'm wondering, however, whether my implementation is that clever yet... Here's my QParser; I'm wondering whether I should apply the filter to all documents in the

Filter Query question

2010-08-30 Thread Eric Grobler
Hi Solr Community If you use a filter like: q=*:* fq=make:Volkswagen and then the next query is: q=blue fq=make:Volkswagen will Solr use the filter cache before the main query, or only after a blue subset? In other words will this query make more sense? q=(blue) AND (make:Volkswagen)

Capturing 'SQLException' in Solr-DataImport for my Java code

2010-08-30 Thread kishan
Hi all, I am using Solr 1.4.0 with Java. Recently I observed in my Solr logs that, because of an invalid user name, I got java.sql.SQLException: Access denied for user '1234'@'localhost'. I resolved this, but I am not able to capture this error in my code so that I can show a proper message to the user.

DIH - deleting documents, high performance (delta) imports, and passing parameters

2010-08-30 Thread Ephraim Ofir
After wasting a few days navigating the somewhat uncharted and murky waters of DIH, thought I'd share my insights with the community to save other newbies time, so here goes... First off, this is not to say DIH is bad, I think it's great and it works really well for my uses, but it has a few

Re: Filter Query question

2010-08-30 Thread Grant Ingersoll
On Aug 30, 2010, at 7:20 AM, Eric Grobler wrote: Hi Solr Community If you use a filter like: q=*:* fq=make:Volkswagen and then the next query is: q=blue fq=make:Volkswagen will Solr use the filter cache before the main query, or only after a blue subset? The first query will
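The idea behind the question can be sketched with a toy model. This is not Solr code: the class `ToyIndex`, its methods, and the sample documents are all hypothetical, made up for illustration. It only assumes the general behaviour that the document set for an fq clause is cached independently of the main query, so a second request with the same fq reuses the cached set and intersects it with the new query's matches.

```python
class ToyIndex:
    """Hypothetical in-memory stand-in for an index with a filter cache."""

    def __init__(self, docs):
        self.docs = docs                # doc id -> dict of field values
        self.filter_cache = {}          # (field, value) -> frozenset of doc ids
        self.filter_evaluations = 0     # how often a filter was actually computed

    def _filter_docset(self, field, value):
        key = (field, value)
        if key not in self.filter_cache:
            # Cache miss: compute the set of matching doc ids once and keep it.
            self.filter_evaluations += 1
            self.filter_cache[key] = frozenset(
                doc_id for doc_id, doc in self.docs.items()
                if doc.get(field) == value
            )
        return self.filter_cache[key]

    def search(self, q, fq):
        allowed = self._filter_docset(*fq)
        if q == "*:*":
            matched = set(self.docs)
        else:
            # Very naive "main query": whole-word match in the text field.
            matched = {
                doc_id for doc_id, doc in self.docs.items()
                if q in doc.get("text", "").split()
            }
        # The main query result is intersected with the cached filter set.
        return sorted(matched & allowed)


index = ToyIndex({
    1: {"make": "Volkswagen", "text": "blue golf"},
    2: {"make": "Volkswagen", "text": "red polo"},
    3: {"make": "Audi", "text": "blue a4"},
})

all_vw = index.search("*:*", ("make", "Volkswagen"))
blue_vw = index.search("blue", ("make", "Volkswagen"))
```

In this sketch the filter for make:Volkswagen is computed only on the first request; the second request reuses it regardless of what the main query is.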

Re: JVM GC is very frequent.

2010-08-30 Thread Grant Ingersoll
Some of it will also depend on things like your caches, heap size, etc. -Grant On Aug 26, 2010, at 12:37 AM, Chengyang wrote: We have about 500 million documents indexed. The index size is about 10G. Running on a 32-bit box. During the pressure testing, we observed that the JVM GC is

Affinity ranking

2010-08-30 Thread Ukyo Virgden
Hi, Is there any implementation in Solr or Lucene for affinity ranking? I've been doing some research on content-based ranking models and came across the paper "Improving search results using affinity graph" http://research.microsoft.com/apps/pubs/default.aspx?id=67818 Any thoughts? Cheers Ukyo

Expanded Synonyms + phrase search

2010-08-30 Thread Xavier Schepler
Hi, several documents from my index contain the phrase "PS et". However, "PS" is expanded to "parti socialiste" and a phrase search for "PS et" fails. A phrase search for "parti socialiste et" succeeds. Can I have both queries working? Here's the field type: <fieldtype name="SyFR"

Re: Filter Query question

2010-08-30 Thread Eric Grobler
Hi Grant, Thanks for the explanation. Regards ericz On Mon, Aug 30, 2010 at 3:22 PM, Grant Ingersoll gsing...@apache.org wrote: On Aug 30, 2010, at 7:20 AM, Eric Grobler wrote: Hi Solr Community If you use a filter like: q=*:* fq=make:Volkswagen and then the next query is:

Spatial query

2010-08-30 Thread Anthony Brazton
Hello everyone, I installed the JTeam Solr spatial plugin into Solr 1.4. It seems to work fine except that I am unable to get the calculated distance field back. q={!spatial lat=49.294854 long=8.36869 radius=100 unit=km calc=arc threadCount=2}*:* fl=geo_distance Any help would greatly be

Re: Multiple passes with WordDelimiterFilterFactory

2010-08-30 Thread Shawn Heisey
On 8/29/2010 2:17 PM, Erick Erickson wrote: charFilters are applied even before the tokenizer Try putting this after any instances of, say, WhiteSpaceTokenizerFactory in your analyzer definition, and I believe you'll see that this is not true. At least looking at this in the analysis page from

Re: Multiple passes with WordDelimiterFilterFactory

2010-08-30 Thread Shawn Heisey
On 8/30/2010 9:01 AM, Shawn Heisey wrote: On 8/29/2010 2:17 PM, Erick Erickson wrote: charFilters are applied even before the tokenizer Try putting this after any instances of, say, WhiteSpaceTokenizerFactory in your analyzer definition, and I believe you'll see that this is not true. At

Re: Updating document without removing fields

2010-08-30 Thread Max Lynch
Thanks Lance. I have decided to just put all of my processing on a bigger server along with solr. It's too bad, but I can manage. -Max On Sun, Aug 29, 2010 at 9:59 PM, Lance Norskog goks...@gmail.com wrote: No. Document creation is all-or-nothing, fields are not updateable. I think you

Re: DIH - deleting documents, high performance (delta) imports, and passing parameters

2010-08-30 Thread Tommy Chheng
Thanks for the section on passing parameters to the DIH config. I'm going to try the parameter passing to allow the DIH to index different DBs based on the system environment (local dev machine or production machine). @tommychheng Programmer and UC Irvine Graduate Student Find a great grad school

RTP Apache Lucene/Solr Meetup Sept. 21

2010-08-30 Thread Grant Ingersoll
I'm pleased to announce the very first ever RTP area (Raleigh, Durham, Chapel Hill NC) Lucene/Solr meetup on Sept. 21. The event will be held at Lulu Press and co-sponsored by Lucid Imagination. To learn more and RSVP, please see http://www.meetup.com/RTP-Apache-Solr-Lucene-Meetup/ Hope to

Re: RTP Apache Lucene/Solr Meetup Sept. 21

2010-08-30 Thread Moises Muratalla
Please come to the Southern California area. On Mon, Aug 30, 2010 at 1:14 PM, Grant Ingersoll gsing...@apache.org wrote: I'm pleased to announce the very first ever RTP area (Raleigh, Durham, Chapel Hill NC) Lucene/Solr meetup on Sept. 21. The event will be held at Lulu Press and co-sponsored

Distance sorting with spatial filtering

2010-08-30 Thread Scott K
The new spatial filtering (SOLR-1586) works great and is much faster than fq={!frange. However, I am having problems sorting by distance. If I try GET 'http://localhost:8983/solr/select/?q=*:*&sort=dist(2,latitude,longitude,0,0)+asc' I get an error: Error 400 can not sort on unindexed field:

Custom scoring

2010-08-30 Thread Brad Kellett
Hi all, I'm looking for examples or pointers to some info on implementing custom scoring in solr/lucene. Basically, what we're looking at doing is to augment the score from a dismax query with some custom signals based on data in fields from the row initially matched. There will be several of

Re: Custom filter implementation, advice needed

2010-08-30 Thread Ingo Renner
On 26.08.2010 at 21:07, Ingo Renner wrote: For those interested and for the Google, I found a working solution myself. The QParser is now down to this: public AccessFilterQParser(String qstr, SolrParams localParams, SolrParams params, SolrQueryRequest req) {

Re: Search Results optimization

2010-08-30 Thread Lance Norskog
Hi- Here is how it works: Lucene uses TF/DF as the relevance formula. This means term frequency divided by document frequency, or the number of times a term appears in one document over the number of documents that term appears in. This is the basic idea: suppose there are 10 documents say
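The ratio described above can be sketched in a few lines. This is only an illustration of the simplified "term frequency over document frequency" idea as stated in the message, not Lucene's actual scoring code (Lucene's default similarity additionally dampens both factors and applies further normalization). The function name `tf_over_df` and the sample corpus are assumptions made up for the example.

```python
def tf_over_df(term, doc, corpus):
    """Simplified relevance: occurrences of `term` in `doc`, divided by
    the number of documents in `corpus` that contain `term` at all."""
    tf = doc.count(term)                            # term frequency in this document
    df = sum(1 for d in corpus if term in d)        # document frequency across the corpus
    return tf / df if df else 0.0


# Hypothetical corpus: each document is a list of already-tokenized terms.
corpus = [
    ["solr", "search", "solr"],
    ["lucene", "search"],
    ["search", "engine"],
]

rare_term_score = tf_over_df("solr", corpus[0], corpus)      # tf=2, df=1
common_term_score = tf_over_df("search", corpus[0], corpus)  # tf=1, df=3
```

A term that is frequent in one document but rare across the corpus ("solr" here) scores higher than a term that appears in every document ("search"), which is the intuition the message describes.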

Re: anybody using solr with Cassandra?

2010-08-30 Thread nickdos
Yes, we are using Cassandra. There is nothing much to say really, it just works. Note we are generating Solr indexes using Java SolrJ (embedded mode) and reading data out of Cassandra with Java. Index generation is fast.

Hardware Specs Question

2010-08-30 Thread Amit Nithian
Hi all, I am curious to get some opinions on at what point having more CPU cores shows diminishing returns in terms of QPS. Our index size is about 8GB and we have 16GB of RAM on a quad core 4 x 2.4 GHz AMD Opteron 2216. Currently I have the heap set to 8GB. We are looking to get more servers

Re: Hardware Specs Question

2010-08-30 Thread Lance Norskog
The price-performance knee for small servers is 32G RAM, 2-6 SATA disks on a RAID, 8/16 cores. You can buy these servers and half-fill them, leaving room for expansion. I have not done benchmarks about the max # of processors that can be kept busy during indexing or querying, and the total

Re: edismax pf2 and ps

2010-08-30 Thread Ron Mayer
Short summary: * Multiple simultaneous phrase boosts with different ps2 parameters are working very nicely for me on a few million doc QA system. * I've submitted an updated patch to Jira incorporating feedback from the jira comments. Will be testing it more this week.

Re: Hardware Specs Question

2010-08-30 Thread Amit Nithian
Lance, Thanks for your help. What do you mean when you say that the OS can keep the index in memory better than Solr? Do you mean that you should use another means to keep the index in memory (i.e. ramdisk)? Is there a generally accepted heap size/index size that you follow? Thanks Amit On Mon, Aug 30,

Re: Hardware Specs Question

2010-08-30 Thread Lance Norskog
It generally works best to tune the Solr caches and allocate enough RAM to run comfortably. Linux, Windows, et al. have their own cache of disk blocks. They use very good algorithms for managing this cache. Also, they do not make long garbage collection passes. On Mon, Aug 30, 2010 at 5:48 PM,

Re: Highlighting, return the matched terms only

2010-08-30 Thread Chris Hostetter
: how could I have the highlighting component return only the terms that were : matched, without any surrounding text ? I'm not a Highlighter expert, but this is something that certainly *sounds* like it should be easy. I took a shot at it and this is the best I could come up with...

Re: Affinity ranking

2010-08-30 Thread Lance Norskog
This is a mass batch-processing task, rather than a search task. Mahout is the right Apache project for implementing this. It would then create a set of (document-document list). You could then add this to a Solr index. (And invert the graph and add those lists.) It might be possible to do this

Re: Hardware Specs Question

2010-08-30 Thread 朱炎詹
I am as curious as Amit is. Can you give an example of the garbage collection problem you mentioned? - Original Message - From: Lance Norskog goks...@gmail.com To: solr-user@lucene.apache.org Sent: Tuesday, August 31, 2010 9:14 AM Subject: Re: Hardware Specs Question It

Re: Hardware Specs Question

2010-08-30 Thread Amit Nithian
Lance, that makes sense. I have heard about the long GC times on large heaps, but I personally haven't experienced a slowdown, though that doesn't mean anything either :-). Agreed that tuning the Solr caching is the way to go. I haven't followed all the Solr/Lucene changes but from what I remember

Re: Hardware Specs Question

2010-08-30 Thread Lance Norskog
There are synchronization points, which become chokepoints at some number of cores. I don't know where they cause Lucene to top out. Lucene apps are generally disk-bound, not CPU-bound, but yours will be. There are so many variables that it's really not possible to give any numbers. Lance On