Re: Searching with Wildcards

2008-09-24 Thread Brian Carmalt
Hello all, Sorry I have taken so long to get back to Erik's reply. I used the technique of inserting a ? before the * to get a prototype working. However, if 1.3 does not support this anymore, then I really need to look into alternatives. What would be the scope of the work to implement Erik's

Shingles , min size?

2008-09-24 Thread Norberto Meijome
hi guys, I may have missed it, but is it possible to tell the solr.ShingleFilterFactory the minimum number of grams to generate per shingle? Similar to NGramTokenizerFactory's minGramSize="3" maxGramSize="3" thanks! B _ {Beto|Norberto|Numard} Meijome "Ask not what's in

Re: Dismax , "query phrases"

2008-09-24 Thread Norberto Meijome
On Wed, 24 Sep 2008 08:34:57 -0700 (PDT) Otis Gospodnetic <[EMAIL PROTECTED]> wrote: > What happens if you change ps from 100 to 1 and comment out that ord function? > > Otis, I think what I am after is what Hoss described in his last paragraph in his reply to your email last year : http://ww

Re: Defining custom schema

2008-09-24 Thread Otis Gospodnetic
You need to create your schema manually. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: con <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Thursday, September 25, 2008 1:38:48 AM > Subject: Re: Defining custom schema > > >

Re: Defining custom schema

2008-09-24 Thread con
Ya I know, for a startup I am using that default schema. But can I develop a schema from the table automatically, or do I need to manually create a schema? Thanks con -- View this message in context: http://www.nabble.com/Defining-custom-schema-tp19645491p19663339.html Sent from the Solr - Us

RE: Solr Using

2008-09-24 Thread Lance Norskog
Do these JSP pages compile under another servlet container? If the JSP pages have Java 1.5 or Java 1.6 syntax features, they will not compile under JBoss 4.0.2. The JBoss 4.0.2 JSP compiler handles the Java 1.4 language. I ran into this problem moving from a new Tomcat to an older JBoss. -Origin

RE: Snappuller taking up CPU on master

2008-09-24 Thread Lance Norskog
rsync has an option to limit the transfer rate. You give a maximum bandwidth for it to use in the transfer. (Please do not post the same thing if you don't get a response.) -Original Message- From: rahul_k123 [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 24, 2008 10:57 AM To: solr
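The rsync option referred to here is, as far as I know, `--bwlimit`, which caps the transfer rate in kilobytes per second. A minimal sketch of building such a command line follows; the host name, paths, and the 2000 KB/s figure are hypothetical, and whether the snappuller script passes extra flags through to rsync is not stated in the thread, so the script may need editing by hand.

```python
import shlex

def rsync_command(source, dest, bwlimit_kbps):
    """Build an rsync argument list with a bandwidth cap (KB/s)."""
    if bwlimit_kbps <= 0:
        raise ValueError("bwlimit_kbps must be positive")
    # -a is archive mode; --bwlimit caps the transfer rate.
    return ["rsync", "-a", f"--bwlimit={bwlimit_kbps}", source, dest]

# Hypothetical master host and snapshot directories, for illustration only.
cmd = rsync_command("master:/data/solr/snapshots/", "/data/solr/snapshots/", 2000)
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` without going through a shell.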

Re: help required: how to design a large scale solr system

2008-09-24 Thread Jon Drukman
Martin Iwanowski wrote: How can I setup to run Solr as a service, so I don't need to have a SSH connection open? The advice that I was given on this very list was to use daemontools. I set it up and it is really great - starts when the machine boots, auto-restart on failures, easy to bring u

Re: java.io.IOException: cannot read directory org.apache.lucene.store.FSDirectory@/home/solr/src/apache-solr-nightly/example/solr/data/index: list() returned null

2008-09-24 Thread Erik Holstad
That is exactly what we are doing now: we add all the documents to the server in the Map phase of the job and send them all to one reducer, which commits them all. Seems to be working. Thanks Erik On Wed, Sep 24, 2008 at 2:27 PM, Otis Gospodnetic < [EMAIL PROTECTED]> wrote: > Erik, > There is littl

Re: java.io.IOException: cannot read directory org.apache.lucene.store.FSDirectory@/home/solr/src/apache-solr-nightly/example/solr/data/index: list() returned null

2008-09-24 Thread Erik Holstad
Hi. The changes in solrconfig were mostly made after the failure, for example increasing the Lucene buffer size. Upgraded today to 1.3.0, but the old version was from 9/1, so a couple of weeks old. Will send in a full traceback asap, just running a job right now, so in a couple of mins. Got th

Re: java.io.IOException: cannot read directory org.apache.lucene.store.FSDirectory@/home/solr/src/apache-solr-nightly/example/solr/data/index: list() returned null

2008-09-24 Thread Otis Gospodnetic
Erik, There is little benefit from having more indexer threads than cores. You have multiple indexers calling commit? I suggest you make only one of them call commit. Or use autoCommit. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Erik
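The "only one caller commits" advice can be sketched with Solr's XML update messages (`<add>` and `<commit/>`). This is an illustration under assumptions, not the poster's code: the field names are made up, and actually posting the messages to the update handler is left out.

```python
from xml.sax.saxutils import escape, quoteattr

def add_message(doc):
    """Render one document as a Solr XML <add> message."""
    fields = "".join(
        f"<field name={quoteattr(name)}>{escape(str(value))}</field>"
        for name, value in doc.items()
    )
    return f"<add><doc>{fields}</doc></add>"

def update_messages(docs):
    """Yield one <add> per document, then a single <commit/> at the end."""
    for doc in docs:
        yield add_message(doc)
    yield "<commit/>"

# Hypothetical documents; "id" and "title" are assumed field names.
messages = list(update_messages([{"id": 1, "title": "a & b"}, {"id": 2, "title": "c"}]))
for message in messages:
    print(message)
```

Many workers can send the `<add>` messages in parallel while a single coordinator sends the final `<commit/>`, which avoids each worker opening a new searcher.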

Re: using spellcheckcomponent via solrj

2008-09-24 Thread Grant Ingersoll
Yep. That's exactly it. The spellCheckCompRH was merely an example of how to do the necessary configuration without screwing up the other examples. On Sep 24, 2008, at 4:22 PM, Jason Rennie wrote: On Wed, Sep 24, 2008 at 4:07 PM, Grant Ingersoll <[EMAIL PROTECTED]>wrote: Just mimic th

Pre-processing text in custom FilterFactory / TokenizerFactory

2008-09-24 Thread Jaco
Hello, I need to work with an external stemmer, which is accessible as a COM object. I managed to integrate this using the com4j library. I tried two scenarios: 1. Create a custom FilterFactory and Filter class for this. The external stemmer is then invoked for every token 2. Create a custom Toke

Re: using spellcheckcomponent via solrj

2008-09-24 Thread Jason Rennie
On Wed, Sep 24, 2008 at 4:07 PM, Grant Ingersoll <[EMAIL PROTECTED]>wrote: > Just mimic the configuration for the spellCheckCompRH in the handler that > you use for querying. Sounds even better. Let me make sure I'm reading you correctly. Is the idea to add lines like this to the requestHandle

Re: java.io.IOException: cannot read directory org.apache.lucene.store.FSDirectory@/home/solr/src/apache-solr-nightly/example/solr/data/index: list() returned null

2008-09-24 Thread Grant Ingersoll
Can you share more about your setup, specifically what you changed in your solrconfig file? What version of Solr (looks like a nightly, but from when)? What did you set auto-commit to be? Can you provide the full stack trace? Also, were you starting fresh when you got the second excepti

Re: java.io.IOException: cannot read directory org.apache.lucene.store.FSDirectory@/home/solr/src/apache-solr-nightly/example/solr/data/index: list() returned null

2008-09-24 Thread Erik Holstad
Otis, The machine we are running on has 4 cores, and that seems to make sense, since running four inserters also failed. So what you are saying is that one inserter uses 1 core? So we can only have as many methods calling the commit() as we have cores? Regards Erik On Wed, Sep 24, 2008 at 12:48 P

Re: using spellcheckcomponent via solrj

2008-09-24 Thread Grant Ingersoll
Your other option is to just add the component to your normal request handler (i.e. select). That was the main goal of writing it as a SearchComponent. This way, you don't have to do a separate query to get spelling results. Just mimic the configuration for the spellCheckCompRH in the h

Re: Indexing Multiple Fields with the Same Name

2008-09-24 Thread Shalin Shekhar Mangar
Is that a mis-spelling? mulitValued="true" On Thu, Sep 25, 2008 at 12:12 AM, KyleMorrison <[EMAIL PROTECTED]> wrote: > > I'm trying to index fields as such: >6100966 >375010 >2338917 >1943701 >1357528 >3301821 >2450046 >8940112 >6251457 >293 >62627

Re: using spellcheckcomponent via solrj

2008-09-24 Thread Jason Rennie
On Wed, Sep 24, 2008 at 3:43 PM, Erik Hatcher <[EMAIL PROTECTED]>wrote: > query.setQueryType("/spellCheckCompRH") > That's the trick I needed. Thanks! Jason

Re: java.io.IOException: cannot read directory org.apache.lucene.store.FSDirectory@/home/solr/src/apache-solr-nightly/example/solr/data/index: list() returned null

2008-09-24 Thread Otis Gospodnetic
Erik, Not answering your question directly, but how many cores does your Solr machine have? If it has 2 cores, for example, then running 6 indexers against it likely doesn't make indexing faster. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message >

Re: using spellcheckcomponent via solrj

2008-09-24 Thread Erik Hatcher
On Sep 24, 2008, at 3:22 PM, Jason Rennie wrote: I've got SpellCheckComponent working on my index using queries like so: /solr/spellCheckCompRH?q=shart&spellcheck.q=shart&spellcheck=true&qt=sfdismax But, I haven't had any luck getting solrj to produce such queries. I can't find any way

java.io.IOException: cannot read directory org.apache.lucene.store.FSDirectory@/home/solr/src/apache-solr-nightly/example/solr/data/index: list() returned null

2008-09-24 Thread Erik Holstad
We are using Solr out of the box, with only a couple of changes in the solrconfig file. We are running a MapReduce job to import into Solr. Every map creates one document and used to add and commit it to Solr. We got org.apache.solr.common.SolrException: Error_opening_new_searcher_exceeded_limit_of

using spellcheckcomponent via solrj

2008-09-24 Thread Jason Rennie
I've got SpellCheckComponent working on my index using queries like so: /solr/spellCheckCompRH?q=shart&spellcheck.q=shart&spellcheck=true&qt=sfdismax But, I haven't had any luck getting solrj to produce such queries. I can't find any way to change the url from /solr/select to /solr/spellCheckCom
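For reference, the query string quoted above can be assembled programmatically. The sketch below uses only Python's standard library rather than solrj; the base URL `http://localhost:8983/solr` is an assumption (the example server's usual address), and `sfdismax` is simply the handler name taken from the message.

```python
from urllib.parse import urlencode

def spellcheck_url(base, term, qt="sfdismax"):
    """Build the spellCheckCompRH request URL for a single search term."""
    params = [
        ("q", term),
        ("spellcheck.q", term),
        ("spellcheck", "true"),
        ("qt", qt),
    ]
    return f"{base}/spellCheckCompRH?{urlencode(params)}"

# Assumed base URL; adjust host, port, and webapp path to your deployment.
url = spellcheck_url("http://localhost:8983/solr", "shart")
print(url)
```

`urlencode` takes care of escaping, so terms containing spaces or ampersands stay intact in the query string.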

Re: Querying multicore

2008-09-24 Thread Erik Hatcher
Quite possible to do cross-core querying in a custom request handler (or search component). Distributed search is one example of this, but I have encountered applications being architected with cross-core querying in other ways as well. Erik On Sep 24, 2008, at 2:38 PM, Jérôme E

Indexing Multiple Fields with the Same Name

2008-09-24 Thread KyleMorrison
I'm trying to index fields as such: 6100966 375010 2338917 1943701 1357528 3301821 2450046 8940112 6251457 293 6262769 2693214 2839489 6283093 2666401 6343085 1721838 6377309 3882429 6302075 And in the xml schema

Querying multicore

2008-09-24 Thread Jérôme Etévé
Hi everyone, I'm planning to use multicore because it seems more convenient than having multiple instances of solr in the same container. I'm wondering if it's possible to query different cores ( hence different schemas / searchers ... indices !) from a customized SolrRequestHandler to buil

Re: Snappuller taking up CPU on master

2008-09-24 Thread rahul_k123
Any Ideas??? rahul_k123 wrote: > > Hi, > > Thanks for the reply. > > I am not using SOLR for indexing and serving search requests, i am using > only the scripts for replication. > > Yes it looks like I/O, but my question is how to handle this problem and > is there any optimal way to achieve

Re: Refresh of synonyms.txt without reload

2008-09-24 Thread Walter Underwood
More details on index-time vs. query-time synonyms are here: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#SynonymFilter wunder On 9/23/08 7:47 AM, "Walter Underwood" <[EMAIL PROTECTED]> wrote: > This is probably not useful because synonyms work better at index time > than at quer

Re: updating synonyms file

2008-09-24 Thread Otis Gospodnetic
Steve, with Solr 1.2 you have to restart. And if you use index-time synonyms you really ought to reindex. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Stephen Weiss <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Wednesday

Re: help required: how to design a large scale solr system

2008-09-24 Thread Otis Gospodnetic
Yatir, I actually think you may be OK with a single machine for 60M docs, though. You should be able to quickly do a test where you use SolrJ to post to Solr and get docs/second. Use Solr 1.3. Use 2-3 indexing threads going against a single Solr instance. Increase the buffer size param and in

Re: updating synonyms file

2008-09-24 Thread Walter Underwood
I replied to this exact same question yesterday from another Solr user. Please check the mailing list archives. http://www.nabble.com/Refresh-of-synonyms.txt-without-reload-to19629361.html wunder On 9/24/08 8:55 AM, "Stephen Weiss" <[EMAIL PROTECTED]> wrote: > Hi, > > I'm running Solr 1.2, we

Re: Defining custom schema

2008-09-24 Thread Norberto Meijome
On Wed, 24 Sep 2008 04:42:42 -0700 (PDT) con <[EMAIL PROTECTED]> wrote: > In the table we will be having various column names like CUSTOMER_NAME, > CUSTOMER_PHONE etc. If we use the default schema.xml, we have to map these > values to some of the default values like cat, features etc. this will cause

Re: help required: how to design a large scale solr system

2008-09-24 Thread Norberto Meijome
On Wed, 24 Sep 2008 11:45:34 -0400 Mark Miller <[EMAIL PROTECTED]> wrote: > Nothing to stop you from breaking up the tsv/csv files into multiple > tsv/csv files. Absolutely agreeing with you ... in one system where I implemented SOLR, I have a process run through the file system and lazily pick

updating synonyms file

2008-09-24 Thread Stephen Weiss
Hi, I'm running Solr 1.2, we are not able to upgrade yet. We've started using synonyms to make the search results better but a few of the synonyms turn out to have unexpected results. We have modifications to our synonyms file that need to go in very quickly but I can't seem to figure o

Re: Dismax , "query phrases"

2008-09-24 Thread Norberto Meijome
On Wed, 24 Sep 2008 08:34:57 -0700 (PDT) Otis Gospodnetic <[EMAIL PROTECTED]> wrote: > What happens if you change ps from 100 to 1 and comment out that ord function? > > > Otis Hi Otis, no luck - without " " : smashing pumpkins smashing pumpkins +((DisjunctionMaxQuery((genre:smash^0.2 | title

Re: help required: how to design a large scale solr system

2008-09-24 Thread Mark Miller
Norberto Meijome wrote: On Wed, 24 Sep 2008 07:46:57 -0400 Mark Miller <[EMAIL PROTECTED]> wrote: Yes. You will def see a speed increase by avoiding http (especially doc at a time http) and using the direct csv loader. http://wiki.apache.org/solr/UpdateCSV and the obvious reason t

RE: using BoostingTermQuery

2008-09-24 Thread Ensdorf Ken
> I'm no QueryParser expert, but I would probably start w/ the default > query parser in Solr (LuceneQParser), and then progress a bit to the > DisMax one. I'd ask specific questions based on what you see there. > If you get far enough along, you may consider asking for help on the > java-user li

Re: commit

2008-09-24 Thread sunnyfr
Hello, thanks a lot for your answer :) So it should look like : snapshooter true arg1 arg2 MYVAR=val1 snapshooter true and my scripts.conf data_dir=/data/solr/book/data what you think :) Shalin Shekhar Mang

Re: help required: how to design a large scale solr system

2008-09-24 Thread Norberto Meijome
On Wed, 24 Sep 2008 07:46:57 -0400 Mark Miller <[EMAIL PROTECTED]> wrote: > Yes. You will def see a speed increase by avoiding http (especially > doc at a time http) and using the direct csv loader. > > http://wiki.apache.org/solr/UpdateCSV and the obvious reason that if, for whatever reason,

Re: Dismax , "query phrases"

2008-09-24 Thread Otis Gospodnetic
What happens if you change ps from 100 to 1 and comment out that ord function? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Norberto Meijome <[EMAIL PROTECTED]> > To: SOLR-Usr-ML > Sent: Wednesday, September 24, 2008 11:23:18 AM > Subje

Dismax , "query phrases"

2008-09-24 Thread Norberto Meijome
Hello, I've seen references to this in the list, but not completely explained...my apologies if this is FAQ (and for the length of the email). I am using dismax across a number of fields on an index with data about music albums & songs - the fields are quite full of stop words. I am trying to

Re: Using Shingles to Increase Phrase Search Performance

2008-09-24 Thread Norberto Meijome
On Sat, 16 Aug 2008 15:39:44 -0700 "Chris Harris" <[EMAIL PROTECTED]> wrote: [...] > So finally I modified the Lucene ShingleFilter class to add an > "outputUnigramIfNoNgram" option. Basically, if you set that option, > and also set outputUnigrams=false, then the filter will tokenize just > as in

Re: LHS page not found

2008-09-24 Thread Shalin Shekhar Mangar
To use the DataImportHandler, you must add it to your solrconfig.xml. Look at the DataImportHandler wiki page for how to do this -- http://wiki.apache.org/solr/DataImportHandler On Wed, Sep 24, 2008 at 7:12 PM, Dinesh Gupta <[EMAIL PROTECTED]>wrote: > > Hi ALL, > > When open the > > http://192.16

Re: Lucene index

2008-09-24 Thread Shalin Shekhar Mangar
Hi Dinesh, There are two ways in which you can import data from databases. 1. Use your custom code with the Solrj client library to upload documents to Solr -- http://wiki.apache.org/solr/Solrj 2. Use DataImportHandler and write data-config.xml and custom Transformers -- http://wiki.apache.org/so

Re: commit not fired and no new snapshot ???

2008-09-24 Thread sunnyfr
Any Idea? sunnyfr wrote: > > Hi, > > When I check my commit.log, nothing has run, > but my config file seems OK to activate my commit: > > > 1 > 1000 > > > My snapshooter too, but no log in snapshooter.log > > > ./data/solr/book/logs/snapshooter

RE: Lucene index

2008-09-24 Thread Dinesh Gupta
Hi Shalin Shekhar, First of all, thanks for the quick reply. I have done the things that you have explained here. I am creating indexes in multiple threads, and it takes 6-10 hours to create them for approx. 3 lac products. I am using Hibernate to access the DB & applying custom logic to p

Re: Error running query inside data-config.xml

2008-09-24 Thread Noble Paul നോബിള്‍ नोब्ळ्
just paste the fields in your schema so that we can help you better On Wed, Sep 24, 2008 at 12:33 PM, con <[EMAIL PROTECTED]> wrote: > > > Hi > I haven't changed the schema. For the time being I am simply following the > default schema.xml inside conf directory. > By error I meant no output values.

Re: Getting started with Solr

2008-09-24 Thread Mark Miller
How can I setup to run Solr as a service, so I don't need to have a SSH connection open? Sorry for being stupid here btw. This is kind of independent from solr. You have to look how to do it for the OS you are running on. With Ubuntu, you could just launch solr with nohup to keep it from stop

Getting started with Solr

2008-09-24 Thread Martin Iwanowski
Hi, I'm very new to search engines in general. I've been using Zend_Search_Lucene PHP class before to try Lucene in general and though it surely works it's not what I'm looking for performance wise. I recently installed Solr on a newly installed Ubuntu (Hardy Heron) machine. I have about 207k d

Re: help required: how to design a large scale solr system

2008-09-24 Thread Mark Miller
Yes. You will def see a speed increase by avoiding http (especially doc at a time http) and using the direct csv loader. http://wiki.apache.org/solr/UpdateCSV - Mark Ben Shlomo, Yatir wrote: Thanks Mark!. Do you have any comment regarding the performance differences between indexing TSV fil

Re: Defining custom schema

2008-09-24 Thread con
In the table we will be having various column names like CUSTOMER_NAME, CUSTOMER_PHONE etc. If we use the default schema.xml, we have to map these values to some of the default values like cat, features etc. this will cause difficulty when we need to process the output. Instead can we set the column

Re: help required: how to design a large scale solr system

2008-09-24 Thread Martin Iwanowski
Hi, I'm very new to search engines in general. I've been using Zend_Search_Lucene class before to try Lucene in general and though it surely works it's not what I'm looking for performance wise. I recently installed Solr on a newly installed Ubuntu (Hardy Heron) machine. I have about 20

RE: help required: how to design a large scale solr system

2008-09-24 Thread Ben Shlomo, Yatir
Thanks Mark!. Do you have any comment regarding the performance differences between indexing TSV files as opposed to directly indexing each document via http post? -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 24, 2008 2:12 PM To: solr-user@luce

Re: Defining custom schema

2008-09-24 Thread Mark Miller
con wrote: Hi guys How can I define a custom schema based on my table. Thanks con Give us more info. Is your table oak or pine? Custom how what?

Re: help required: how to design a large scale solr system

2008-09-24 Thread Mark Miller
From my limited experience: I think you might have a bit of trouble getting 60 mil docs on a single machine. Cached queries will probably still be *very* fast, but non cached queries are going to be very slow in many cases. Is that 5 seconds for all queries? You will never meet that on first r

Re: solr score

2008-09-24 Thread Neeti Raj
Hi Santhanaraj Just search for boost on Solr wiki and see if the boost feature meets your requirement. As for highlighting, this explains how to implement solr highlighting http://wiki.apache.org/solr/HighlightingParameters - neeti On Wed, Sep 24, 2008 at 10:31 AM, sanraj25 <[EMAIL PROTECTED]> wr

Defining custom schema

2008-09-24 Thread con
Hi guys How can I define a custom schema based on my table. Thanks con -- View this message in context: http://www.nabble.com/Defining-custom-schema-tp19645491p19645491.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Error running query inside data-config.xml

2008-09-24 Thread Shalin Shekhar Mangar
Looking at your data-config.xml, you are trying to index two columns both of which are being sent to "features" field. The output from the data-config.xml shows that it added 152 documents. Try using the match all query *:* which should show all documents in the index. You will need to modify the
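The match-all query mentioned here can be issued against the standard select handler. A small sketch of building that URL follows; the base URL is an assumption borrowed from the `http://localhost:8983/solr/...` address used elsewhere in this thread, and `rows` is the standard parameter for how many documents to return.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def select_url(base, query="*:*", rows=10):
    """Build a select URL; the default query *:* matches every document."""
    return f"{base}/select?{urlencode({'q': query, 'rows': rows})}"

# Assumed base URL; adjust to your deployment.
url = select_url("http://localhost:8983/solr")
print(url)

# Decode the query string again to confirm the parameters survive escaping.
print(parse_qs(urlsplit(url).query))
```

If the response reports a non-zero result count but the documents look empty, the fields were indexed but may not be stored, which is a schema matter rather than an import failure.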

Re: Solr Using

2008-09-24 Thread Shalin Shekhar Mangar
What is the syntax error? Which JSP? Please give the stack trace too. On Wed, Sep 24, 2008 at 12:24 PM, Dinesh Gupta <[EMAIL PROTECTED]>wrote: > > Which version of tomcat required. > > I installed jboss4.0.2 which have tomcat5.5.9. > > JSP pages are not going to compile. > > Its giving syntax er

Re: EmbeddedSolrServer and the MultiCore functionality

2008-09-24 Thread Aleksander M. Stensby
Okay, sounds fair. Well, why I would have multiple shards was based on the presumption that it would be more effective to be able to search in single shards when needed (if each shard contains let's say 30 million entries) and then when time comes, migrate one of the shards to a different nod

commit not fired

2008-09-24 Thread sunnyfr
Hi, When I check my commit.log, nothing has run, but my config file seems OK to activate my commit: 1 1000 My snapshooter too, but no log in snapshooter.log ./data/solr/book/logs/snapshooter data/solr/book/bin true arg1 arg2

postOptimize not firing in 1.3

2008-09-24 Thread Jarek Zgoda
I have a postOptimize hook defined as follows: /home/jzgoda/solr-master/solr/bin/snapshooter /home/jzgoda/solr-master/solr/bin true But I can't see it firing, neither when I issue a request from my application nor when running the optimize script from the solr distribution.

AW: AW: Date field mystery

2008-09-24 Thread Kolodziej Christian
Hi Chris, that's even more cloak-and-dagger... In the meantime we edited our index and use a unix timestamp, that's working without any problems :-) Thanks for your help and have a nice day, Christian

Re: Error running query inside data-config.xml

2008-09-24 Thread con
Hi I haven't changed the schema. For the time being I am simply following the default schema.xml inside conf directory. By error I meant no output values. But when I run http://localhost:8983/solr/dataimport?command=full-import, it shows that: 0 0