spell-checker on solr

2008-07-01 Thread dudes dudes
Hello all, Is spell-checking done automatically under Solr, or does it need to be coded in? Are there any docs on this topic? Thanks for your valuable time. ak

Re: Rsyncd start and stop for multiple instances

2008-07-01 Thread Jacob Singh
Hi Bill and Others: Bill Au wrote: The rsyncd-start script gets the data_dir path from the command line and creates an rsyncd.conf on the fly, exporting the path as the rsync module named solr. The slaves need the data_dir path on the master to look for the latest snapshot. But the rsync

First version of solr javascript client to review

2008-07-01 Thread Matthias Epheser
Hi community, as described here http://www.nabble.com/Announcement-of-Solr-Javascript-Client-to17462581.html I started to work on a javascript widget library for solr. I've now finished the basic framework work: - creating a jquery environment - creating helpers for jquery inheritance -

Re: Limit Porter stemmer to plural stemming only?

2008-07-01 Thread Guillaume Smet
Hi Cuong, On Tue, Jul 1, 2008 at 4:45 AM, climbingrose [EMAIL PROTECTED] wrote: I modified the original English Stemmer written in the Snowball language and regenerated the Java implementation using the Snowball compiler. It's been working for me so far. I certainly can share the modified Snowball

Re: spell-checker on solr

2008-07-01 Thread Shalin Shekhar Mangar
You asked a similar question a few days ago and you got a few links in the reply. Did you try looking into them? On Tue, Jul 1, 2008 at 1:04 PM, dudes dudes [EMAIL PROTECTED] wrote: Hello all, Is spell-checking done automatically under Solr, or does it need to be coded in? Any docs on this

RE: spell-checker on solr

2008-07-01 Thread dudes dudes
Yes, you are quite right! Please accept my apologies; too much going on in my head! Thanks anyway. Date: Tue, 1 Jul 2008 14:33:42 +0530 From: [EMAIL PROTECTED] To: solr-user@lucene.apache.org Subject: Re: spell-checker on solr You asked a similar

Distribution and restarting jetty

2008-07-01 Thread Jacob Singh
Hi, I just managed to hack the distribution scripts a little so that I can specify a different rsyncd module, which lets me have multiple indexes rsyncing from the same server on the same port! Yeah! Okay, so I'm very excited that it's almost working, but I have one pretty huge issue. Every time I

Best practices for permissions in DistributionScripts

2008-07-01 Thread Jacob Singh
Hey, Sorry to bug everyone again in my newbieness, but this is a quick one, I promise :) I'm running a master and a slave, both on Debian using jetty6 (from deb). jetty6 runs under user jetty, which has no group. It writes files as jetty.nogroup 664. This means my data directory is 664. jetty

Re: Limit Porter stemmer to plural stemming only?

2008-07-01 Thread climbingrose
Attached is the modified Snowball source code for a plural-only English stemmer. You need to compile it to Java using the instructions here: http://snowball.tartarus.org/runtime/use.html. Essentially, you need to: 1) Download (Snowball, algorithms, and libstemmer

Re: Rsyncd start and stop for multiple instances

2008-07-01 Thread Bill Au
You can either use a dedicated rsync port for each instance or hack the existing scripts to support multiple rsync modules. Both ways should work. Bill On Tue, Jul 1, 2008 at 3:49 AM, Jacob Singh [EMAIL PROTECTED] wrote: Hi Bill and Others: Bill Au wrote: The rsyncd-start script gets
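The "multiple rsync modules" option mentioned above amounts to exporting one named module per index from a single rsync daemon. A minimal sketch of rendering such an rsyncd.conf, assuming hypothetical module names and paths (solr_products, /var/solr/products/data, and so on are illustrative, not taken from the thread or from the stock scripts):

```python
from typing import Mapping


def render_rsyncd_conf(modules: Mapping[str, str]) -> str:
    """Render a minimal rsyncd.conf that exports one module per index.

    `modules` maps an rsync module name to the data directory it exports.
    The names and paths are placeholders chosen for illustration only.
    """
    sections = []
    for name, path in modules.items():
        # Each [module] section exports one directory, read-only for slaves.
        sections.append(f"[{name}]\n    path = {path}\n    read only = yes\n")
    return "\n".join(sections)


if __name__ == "__main__":
    print(render_rsyncd_conf({
        "solr_products": "/var/solr/products/data",  # hypothetical index 1
        "solr_reviews": "/var/solr/reviews/data",    # hypothetical index 2
    }))
```

Each slave would then pull from its own module name on the shared port; how the stock distribution scripts need to be changed to pass that name along is not shown here.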

Slow deleteById request

2008-07-01 Thread Renaud Delbru
Hi, We are experiencing very slow deletes, each taking more than 10 seconds. A delete is executed using deleteById (from SolrJ or from curl) while documents are being added. Looking at the log (below), it seems that a delete-by-ID request is only executed during the next commit (done

Solr / Tomcat bottleneck while parsing http headers

2008-07-01 Thread Christophe Fondacci
Hello all, I've searched the web and this forum without finding any answer to the problem I have... So here it is : Our application performs queries to solr. Our application is deployed on a Tomcat 6.0.14 on machine A. Solr is deployed on a dedicated Tomcat 6.0.14 server running on a distinct

starting solr hangs

2008-07-01 Thread Umar Shah
Hi, I am experiencing strange behavior while using the default example Solr/Jetty container on a hosted Ubuntu 7.10 Linux machine. After running java -jar start.jar from the example folder, it just outputs 2 lines and hangs: 2008-07-01 09:46:35.299::INFO: Logging to STDERR via

DataImportHandler - combined DataSource possible?

2008-07-01 Thread Jon Baer
Hi, Is it currently possible to define a db-data-config.xml that includes both an HttpDataSource and a JdbcDataSource at all? I can't tell if this is possible or not (although it seems that dataConfig might only take a single dataSource child element). Thanks. - Jon

Re: DataImportHandler - combined DataSource possible?

2008-07-01 Thread Shalin Shekhar Mangar
Hi Jon, Yes it is possible. Define two dataSources in the data config file and use them like this: <entity name="one" dataSource="datasource-1"> .. </entity> <entity name="two" dataSource="datasource-2"> .. </entity> On Tue, Jul 1, 2008 at 7:49 PM, Jon Baer [EMAIL PROTECTED] wrote: Hi, Is it currently
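For context, a rough sketch of what the whole config file might look like with two named dataSources, generated here with Python's ElementTree so the structure can be checked. The names datasource-1 and datasource-2 come from the reply above; the JDBC driver and url values are placeholders, and the HTTP source's connection attributes are deliberately left out rather than guessed:

```python
import xml.etree.ElementTree as ET


def build_data_config() -> str:
    """Build a skeleton DataImportHandler config with two named dataSources."""
    root = ET.Element("dataConfig")

    # Placeholder JDBC settings (assumed values, replace with real ones).
    ET.SubElement(root, "dataSource", {
        "name": "datasource-1",
        "type": "JdbcDataSource",
        "driver": "org.example.Driver",    # hypothetical driver class
        "url": "jdbc:example://dbhost/db",  # hypothetical JDBC URL
    })
    # HTTP source; its connection attributes are omitted in this sketch.
    ET.SubElement(root, "dataSource", {
        "name": "datasource-2",
        "type": "HttpDataSource",
    })

    # Each entity picks its source by name through the dataSource attribute.
    document = ET.SubElement(root, "document")
    ET.SubElement(document, "entity", {"name": "one", "dataSource": "datasource-1"})
    ET.SubElement(document, "entity", {"name": "two", "dataSource": "datasource-2"})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_data_config())
```

The point of the sketch is only the wiring: every dataSource gets a name, and each entity refers to one of those names.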

Re: Slow deleteById request

2008-07-01 Thread Renaud Delbru
A small clarification: we are using a nightly build of Solr 1.3 (one of the nightly builds just before the integration of Lucene 2.4). -- Renaud Delbru Renaud Delbru wrote: Hi, We are experiencing very slow deletes, each taking more than 10 seconds. A delete is executed using deleteById (from SolrJ or from

Re: DataImportHandler - combined DataSource possible?

2008-07-01 Thread Noble Paul നോബിള്‍ नोब्ळ्
There is a section with this information http://wiki.apache.org/solr/DataImportHandler#head-138482af9d5c5e9600e60b4135c3eb41d8b34098 --Noble On Tue, Jul 1, 2008 at 8:08 PM, Shalin Shekhar Mangar [EMAIL PROTECTED] wrote: Hi Jon, Yes it is possible. Define two dataSources in the data config file

Re: DataImportHandler - combined DataSource possible?

2008-07-01 Thread Lucas F. A. Teixeira
DIH, aka, thank-you-god-this-dih-saved-my-life []s, Lucas Lucas Frare A. Teixeira [EMAIL PROTECTED] Tel: +55 11 3660.1622 - R3018 Shalin Shekhar Mangar escreveu: Hi Jon, Yes it is possible. Define two dataSources in the data config file and use them like this:

Solr Capabilities/Limitations

2008-07-01 Thread Willie Wong
Hi y’all, I’m a newbie to Solr, and was looking for advice on whether Solr is the best choice for this project. I need to be able to search through terabytes of existing data. Documents vary in size from 20 KB to 10 MB. At some point I’ll also need to feed in approximately

negative boosting / analysis?

2008-07-01 Thread Ryan McKinley
Hi- I'm working on a case where we have review text that may include words that describe what the item is *not*. Given the text "the kitten is not clean", searching for "clean" should not return the kitten (at least not at the top). The approach I am considering is to copy the text to a

Re: Slow deleteById request

2008-07-01 Thread Yonik Seeley
That's very strange... are you sending a commit with the delete perhaps? If so, the whole request would block until a new searcher is registered. -Yonik On Tue, Jul 1, 2008 at 8:54 AM, Renaud Delbru [EMAIL PROTECTED] wrote: Hi, We experience very slow delete, taking more than 10 seconds. A

dataimporter last_index_something

2008-07-01 Thread Jeremy Hinegardner
Hi all, I'm using the dataimport patch, and it is working wonderfully. I had one question though. Is there a way to pick some other last_index value that could be selected on a per entity basis? Right now it is just last_index_time which is great, except for a table that doesn't have modified

Slow performance using MatchAllDocsQuery with filter query

2008-07-01 Thread Guangwei Yuan
Hi, I've noticed some bad performance in faceted browsing, when the query is empty (so the MatchAllDocsQuery is used) and there are only filter queries. An example of the search url is: http://hostname:8080/solr/select/?q=&qt=dismax&fq=color:%2300 One idea is to switch to the StandardRequest
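The request described above (an empty q, the dismax handler, and a single filter query) can be assembled programmatically so that the separators and escaping are not lost. A small sketch, assuming the host and path from the example URL; the helper name build_select_url is made up for illustration:

```python
from urllib.parse import urlencode


def build_select_url(base_url, params):
    """Append URL-encoded query parameters to a Solr select URL.

    `params` is a list of (name, value) pairs so that repeated
    parameters such as multiple fq entries keep their order.
    """
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    url = build_select_url(
        "http://hostname:8080/solr/select/",
        [
            ("q", ""),            # empty main query, as in the report
            ("qt", "dismax"),     # request handler named in the example
            ("fq", "color:#00"),  # filter query; '#' is sent as %23
        ],
    )
    print(url)
```

Note that urlencode also escapes the colon in the filter query, which decodes to the same value on the server side.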

Re: Slow deleteById request

2008-07-01 Thread Renaud Delbru
Hi Yonik, We are not sending a commit with a delete. It happens when using the following command: curl http://mydomain.net:8080/index/update -s -H 'Content-type:text/xml; charset=utf-8' -d '<delete><id>http://example.org/</id></delete>' or using the SolrJ deleteById method (that does not execute a
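The delete-by-id message posted in that command is a small XML document, and building it in code makes it easy to confirm that no commit element rides along with it. A minimal sketch using only the Python standard library; the function names are illustrative, and nothing is actually sent over the network here:

```python
import urllib.request
from xml.sax.saxutils import escape


def build_delete_by_id(doc_id: str) -> str:
    """Return the XML update message that deletes a single document by id."""
    # escape() protects ids containing &, < or > (URLs often contain '&').
    return f"<delete><id>{escape(doc_id)}</id></delete>"


def make_delete_request(update_url: str, doc_id: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for a delete by id."""
    body = build_delete_by_id(doc_id).encode("utf-8")
    return urllib.request.Request(
        update_url,
        data=body,
        headers={"Content-type": "text/xml; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    req = make_delete_request(
        "http://mydomain.net:8080/index/update", "http://example.org/"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the prepared request to urllib.request.urlopen would send it; timing that call separately from any later commit is one way to see where the 10 seconds are spent.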

Re: Slow deleteById request

2008-07-01 Thread Yonik Seeley
On Tue, Jul 1, 2008 at 4:05 PM, Renaud Delbru [EMAIL PROTECTED] wrote: We are not sending a commit with a delete. It happens when using the following command: curl http://mydomain.net:8080/index/update -s -H 'Content-type:text/xml; charset=utf-8' -d '<delete><id>http://example.org/</id></delete>' or

Proposed Solr architecture - does this make sense?

2008-07-01 Thread Todd Breiholz
Hi all, I'm new to Solr/Lucene. Our current search is done with Verity and we are looking to move towards open-source products. Our first application would have less than 500,000 documents indexed at the outset. Additions/updates to the index would occur at 2,000-3,000 per minute. We are currently

Re: Slow performance using MatchAllDocsQuery with filter query

2008-07-01 Thread Mike Klaas
On 1-Jul-08, at 12:25 PM, Guangwei Yuan wrote: I've noticed some bad performance in faceted browsing, when the query is empty (so the MatchAllDocsQuery is used) and there are only filter queries. An example of the search url is:

Re: Solr Capabilities/Limitations

2008-07-01 Thread Mike Klaas
On 1-Jul-08, at 8:37 AM, Willie Wong wrote: I need to be able to search through terabytes of existing data. Documents vary in size from 20 KB to 10 MB. At some point I’ll also need to feed in approximately 1-5 million new documents a day. This depends

Re: Slow deleteById request

2008-07-01 Thread Renaud Delbru
Yonik Seeley wrote: I'd try the latest nightly solr build... it now lets Lucene manage the deletes. Yes, updating to a newer version of nightly Solr build could solve the problem, but I am a little afraid to do it since solr-trunk has switched to lucene 2.4-dev. Thanks for your answers,

Solr* != solr*

2008-07-01 Thread George Aroush
Hi Folks, Can someone tell me what I might have set up wrong? After indexing my data, I can search just fine on, let's say, sol* but not on Sol* (note upper case 'S' vs. lower case 's'); I get 0 hits. Here is my customized schema.xml setting: <fieldType name="text" class="solr.TextField"

Re: Solr* != solr*

2008-07-01 Thread Erik Hatcher
George - wildcard expressions, in Lucene/Solr's QueryParser, are not analyzed. There is one trick in the API that isn't yet wired to Solr's configuration, and that is setLowercaseExpandedTerms(true). This would solve the Sol* issue because when indexed all terms for the text field are
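Until that option is exposed in Solr's configuration, one client-side workaround is to lowercase wildcard terms before the query is sent. This is a sketch of that idea, not the QueryParser's own behavior, and it assumes the field's index analyzer lowercases terms; the function name is made up for illustration:

```python
import re

# Any whitespace-delimited token that contains a wildcard character.
_WILDCARD_TOKEN = re.compile(r"\S*[*?]\S*")


def lowercase_wildcard_terms(query: str) -> str:
    """Lowercase only the wildcard terms of a query string.

    Field names (the part before the last ':') and non-wildcard tokens,
    including operators such as AND/OR, are left untouched.
    """
    def fix(match):
        token = match.group(0)
        field, sep, term = token.rpartition(":")
        return field + sep + term.lower()

    return _WILDCARD_TOKEN.sub(fix, query)


if __name__ == "__main__":
    print(lowercase_wildcard_terms("Sol*"))
    print(lowercase_wildcard_terms("Title:Sol* AND Lucene"))
```

This only handles simple whitespace-separated queries; quoted phrases, escaped characters, and nested syntax would need a real parser.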

Re: dataimporter last_index_something

2008-07-01 Thread Noble Paul നോബിള്‍ नोब्ळ्
Currently there is nothing. There is a hackish way to achieve it, though. DIH allows you to read values from request params and use them in the templates, e.g.: query=select * from atable where id ${dataimporter.request.last_id} So DIH must be invoked with the extra request param last_id, like this
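A sketch of how such an invocation URL could be put together, with the extra last_id parameter that the template reads as ${dataimporter.request.last_id}. The handler URL, the full-import command value, and the helper name are assumptions made for illustration; the original message's own example is cut off above:

```python
from urllib.parse import urlencode


def build_dih_url(handler_url: str, command: str, **request_params) -> str:
    """Build a DataImportHandler URL carrying extra request parameters.

    Extra keyword arguments become plain request parameters, which the
    config templates can reference as ${dataimporter.request.<name>}.
    """
    params = {"command": command, **request_params}
    return f"{handler_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical handler location and last_id value.
    print(build_dih_url(
        "http://localhost:8983/solr/dataimport", "full-import", last_id=100
    ))
```

The caller is then responsible for remembering the highest id it has imported and passing it on the next run, since DIH itself only tracks last_index_time.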

Re: Slow deleteById request

2008-07-01 Thread Chris Hostetter
: Yes, updating to a newer version of nightly Solr build could solve the : problem, but I am a little afraid to do it since solr-trunk has switched to : lucene 2.4-dev. But did you check whether or not you have maxPendingDeletes configured, as Yonik asked? That would explain exactly what you are