Re: Explicitly tell Solr the analyzed value when indexing a document

2011-11-17 Thread Ahmet Arslan
> I have a couple of string fields. For some of them I want from my application to be able to index a lowercased string but store the original value. Is there some way to do this? Or would I have to come up with a new field type and implement an analyzer? If you have stored="true" in y

Re: Explicitly tell Solr the analyzed value when indexing a document

2011-11-17 Thread Tim Terlegård
>> I have a couple of string fields. For some of them I want from my application to be able to index a lowercased string but store the original value. Is there some way to do this? Or would I have to come up with a new field type and implement an analyzer? > If you have stored="

Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Jak Akdemir
Hi, I was trying to configure a Solr instance with near real-time search and auto-complete capabilities. I got stuck on the NRT feature. There are 15 new records per second inserted into the database (MySQL), and I indexed them with DIH. First, I tried to manage autoCommits from solrconfig.xml

Re: Explicitly tell Solr the analyzed value when indexing a document

2011-11-17 Thread Tim Terlegård
> I have a couple of string fields. For some of them I want from my > application to be able to index a lowercased string but store the > original value. Is there some way to do this? Or would I have to come > up with a new field type and implement an analyzer? I think I should be able to do what

Re: Explicitly tell Solr the analyzed value when indexing a document

2011-11-17 Thread Ahmet Arslan
> I want faceting on a string field to return "monkey" while the original value is "Monkey". So I want the indexed value to be lowercase and the stored value to be the original. That is, I want to do the analyzing in my application and tell Solr what to use for indexed and what to use for stored. Sorry

Re: Highlighting apostrophe

2011-11-17 Thread rychu
Hi, have you found the solution to your "highlighting apostrophe" problem? -- View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-apostrophe-tp731155p3515139.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: delta-import of rich documents like word and pdf files!

2011-11-17 Thread neuron005
Thank you for your replies, guys. That helped a lot. Thanks "iorixxx", that was the command that worked. I also tried my Solr with MySQL and that worked too. Congo! :) Now, I want to index my files according to their size and facet them according to their size ranges. I know that t

Re: to prevent number-of-matching-terms in contributing score

2011-11-17 Thread Samarendra Pratap
On Thu, Nov 17, 2011 at 6:59 AM, Chris Hostetter wrote: > > : 1. "omitTermFreqAndPositions" is very straightforward but if I avoid > : positions I'll refuse to serve phrase queries. I had searched for this in > > but do you really need phrase queries on your "cat" field? i thought the > point wa

Re: delta-import of rich documents like word and pdf files!

2011-11-17 Thread Ahmet Arslan
> Now, I want to index my files according to their size and facet them according to their size ranges. I know that there is an option of "fileSize" in FileListEntityProcessor but I am not getting any way to perform this. Is fileSize a metadata? You don't need a dynamic field for this.

Re: delta-import of rich documents like word and pdf files!

2011-11-17 Thread neuron005
Thanks for your reply, I performed these steps. in data-config.xml : in schema.xml : -- But still there is no response in the browse section. I edited facet_r

Re: delta-import of rich documents like word and pdf files!

2011-11-17 Thread neuron005
I also set my fileSize to type long. "String" will not work, I think! Size cannot be a string... it shows an error when using string as the type.

Re: delta-import of rich documents like word and pdf files!

2011-11-17 Thread neuron005
I ran this command and can see the size of my files: http://localhost:8080/solr/select?q=user&f.fileSize.facet.range.start=100 Great, thanks... string worked... I don't know why that did not work last time. But when I do that in the browse section, I saw the following output in my logs: SEVERE: Exception during

Solr Master High Availability

2011-11-17 Thread KARHU Toni
Hi, I'm looking into high-availability Solr master configurations. Does anybody have a good solution for this? The things I'm looking into are: * Using Solr replication to keep a second backup master. * Indexing on separate machine(s), the problem here being that the index will be differen

ISO8601 Date format

2011-11-17 Thread Gerke, Axel
Hello, due to a different bug in another system, we stored a date in a date field with a value like "999-12-31T23:00:00Z". As you can see in the schema browser below, Solr stores it correctly with four digits, but in a response the leading zero is missing. My question is: is a three digit year a valid

Re: Aggregated indexing of updating RSS feeds

2011-11-17 Thread sbarriba
Thanks Chris. (Bell rings) The 'params' logging pointer was what I needed. So for reference, it's not a good idea to use a 'wget' command directly in a crontab. I was using: wget http://localhost/solr/myfeed?command=full-import&rows=5000&clean=false ...but moving this into a separate shell script

Re: Aggregated indexing of updating RSS feeds

2011-11-17 Thread Michael Kuhlmann
On 17.11.2011 11:53, sbarriba wrote: The 'params' logging pointer was what I needed. So for reference, it's not a good idea to use a 'wget' command directly in a crontab. I was using: wget http://localhost/solr/myfeed?command=full-import&rows=5000&clean=false :)) I think the shell handled the
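
The crontab problem in this thread comes down to a POSIX shell treating an unquoted `&` as a command separator, so everything after `command=full-import` never reaches wget as part of the URL. A minimal Python sketch of one way to avoid that, assuming the URL and parameters quoted in the message (the helper name `build_import_url` is made up for illustration): build the query string programmatically and quote the whole URL before it is handed to a shell.

```python
import shlex
from urllib.parse import urlencode


def build_import_url(base, params):
    """Join a base URL and a dict of query parameters into one URL."""
    return f"{base}?{urlencode(params)}"


# Values taken from the message above; build_import_url is a hypothetical helper.
url = build_import_url(
    "http://localhost/solr/myfeed",
    {"command": "full-import", "rows": 5000, "clean": "false"},
)

# shlex.quote wraps the URL in single quotes so a POSIX shell passes it
# to wget as one argument instead of splitting the command at each "&".
crontab_command = f"wget {shlex.quote(url)}"
print(crontab_command)
```

The printed command can be pasted into a crontab entry or a wrapper script; the quoting is the part that matters, not the choice of Python.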

Re: Problems with AutoSuggest feature(Terms Components)

2011-11-17 Thread Erick Erickson
TermsComponent only reacts to what you send it. How are these requests getting to the TermsComponent? That's where you should look. As far as terms.limit, your requesthandler for TermsComponent in solrconfig.xml has a section and you can set whatever you want in there and then override it as you

Re: Phrase between quotes with dismax edismax

2011-11-17 Thread Jean-Claude Dauphin
Thanks Erick for your prompt response. I am not sure but I think I found why the phrase "chef de projet" is not found by dismax and edismax. The following terms are indexed and can be seen with Luke: chef projet chef de projet When searching for the phrase "chef de projet", the terms 'che

Re: delta-import of rich documents like word and pdf files!

2011-11-17 Thread neuron005
Sorry for disturbing you all. Actually, I had to use plong instead of type string. My problem is solved. Be ready for a new thread. CHEERS

Re: ISO8601 Date format

2011-11-17 Thread Gora Mohanty
On Thu, Nov 17, 2011 at 6:06 PM, Gerke, Axel wrote: > Hello, due to a different bug in another system, we stored a date in a date field with a value like "999-12-31T23:00:00Z". As you can see in the schema browser below, Solr stores it correctly with four digits, but in a response the leading

FunctionQuery score=0

2011-11-17 Thread John
Hi, I am using a function query that based on the query of the user gives a score for the results I am presenting. Some of the results are receiving score=0 in my function and I would like them not to appear in the search results. How can I achieve that? Thanks in advance.

Re: strange behavior of scores and term proximity use

2011-11-17 Thread Erick Erickson
Hmmm, I'm not seeing similar behavior on a trunk from today, when did you get your copy? Erick On Wed, Nov 16, 2011 at 2:06 PM, Ariel Zerbib wrote: > Hi, > > For this term proximity query: ab_main_title_l0:"to be or not to be"~1000 > > http://localhost:/solr/select?q=ab_main_title_l0%3A%22og

Re: Phrase between quotes with dismax edismax

2011-11-17 Thread Erick Erickson
OK, looks like you're mixing fieldTypes. That is, you have some "string" types, which are completely unanalyzed and some analyzed fields. The analyzed fields have stopwords removed at index time. Then it looks like your query chain does NOT remove stopwords or some such. So it's probably a schema

What is the best approach to do reindexing on the fly?

2011-11-17 Thread erolagnab
Hi all, I'm using Solr 3.2 with DataImportHandler periodically updating the index every 5 min. There's a housekeeping script running weekly which deletes some data in the database. I'd like to incorporate the reindexing strategy with this housekeeping script by: 1. Locking the DataImportHandler - not

Re: FunctionQuery score=0

2011-11-17 Thread Andre Bois-Crettez
John wrote: > Some of the results are receiving score=0 in my function and I would like them not to appear in the search results. You can use frange and filter by score: q=ipod&fq={!frange l=0 incl=false}query($q) -- André Bois-Crettez Search technology, Kelkoo http://www.kelkoo.com/
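
The filter suggested in this reply can be sent as an ordinary request parameter. A small Python sketch, assuming the `q` and `fq` values from the message and a hypothetical helper named `build_select_params`; it only builds the URL-encoded query string and does not contact a Solr server.

```python
from urllib.parse import urlencode, parse_qs


def build_select_params(user_query):
    """Return an encoded query string that filters out score=0 hits."""
    params = {
        "q": user_query,
        # l=0 sets the lower bound and incl=false excludes it, so only
        # documents whose value for query($q) is strictly above 0 pass.
        "fq": "{!frange l=0 incl=false}query($q)",
    }
    return urlencode(params)


query_string = build_select_params("ipod")
print(query_string)

# Round-trip check: decoding gives back the original parameter values.
decoded = parse_qs(query_string)
print(decoded["fq"][0])
```

Encoding the braces, spaces and `$` this way avoids having to escape them by hand when the request is assembled in application code.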

Re: FunctionQuery score=0

2011-11-17 Thread John
Doesn't seem to work. I thought that filter queries work before the search is performed and not after... no? Debug doesn't include the filter query, only the below (changed a bit): BoostedQuery(boost(+fieldName:"",boostedFunction(ord(fieldName),query))) On Thu, Nov 17, 2011 at 5:04 PM, Andre Bois-

RE: memory usage keep increase

2011-11-17 Thread Yongtao Liu
Erick, Thanks for your reply. Yes, "virtual memory" does not mean physical memory. But when "virtual memory" >> physical memory, the system will become slow, since lots of paging requests happen. Yongtao -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sen

Doubts in Shards concept

2011-11-17 Thread mechravi25
Hi, I have implemented the shards concept. After sending the request, this is what appears in the logs: Nov 15, 2011 10:38:24 PM org.apache.solr.core.SolrCore execute INFO: [core2] webapp=/solr path=/select params={fl=uid,score&start=0&q=abc&isShard=true&wt=javabin&fsv=true&rows=1410&version=1} hits

Migrating from Hibernate Search to Solr

2011-11-17 Thread Ari King
I'm considering migrating from Hibernate Search to Solr, but in order to make that decision, I'd appreciate insight on the following: 1. How difficult is getting Solr up and running? With Hibernate I had to annotate a few classes and setup a config file; so it was pretty easy. 2. How can/should on

Re: Highlighting with a default copy field with EdgeNGramFilterFactory

2011-11-17 Thread João Nelas
I found the solution! I needed to also add an EdgeNGramFilterFactory to the fields that are the source of the copyField. That got the highlighting working again.

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Erick Erickson
I guess my first question is what evidence you have that Solr is unable to index fast enough? It's quite possible that your database connection is the thing that's unable to process fast enough. That's certainly a guess, but unless your documents are quite complex, 15 records/second isn't likely t

Re: Migrating from Hibernate Search to Solr

2011-11-17 Thread Erik Hatcher
On Nov 17, 2011, at 10:38 , Ari King wrote: > I'm considering migrating from Hibernate Search to Solr, but in order > to make that decision, I'd appreciate insight on the following: > > 1. How difficult is getting Solr up and running? With Hibernate I had > to annotate a few classes and setup a

Highlighting and regex

2011-11-17 Thread Peter Sturge
Hi, Been wrestling with a question on highlighting (or not) - perhaps someone can help? The question is this: Is it possible, using highlighting or perhaps another more suited component, to return words/tokens from a stored field based on a regular expression's capture groups? What I was kind of

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Jak Akdemir
Erick, Thank you for your response. 1) I tried 2 new records (records have only 5 fields in one table) per second, in a 6 sec interval too. It should be quite easy for MySQL. But I will check query responses per second as you suggested. 2) I am sure the delta-queries are configured well. Full-Import

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Yonik Seeley
On Thu, Nov 17, 2011 at 11:48 AM, Jak Akdemir wrote: > 2) I am sure about delta-queries configured well. Full-Import is completed > in 40 secs for 40 docs. And delta's are in 1 sec for 15 new records. > Also I checked it. There is no problem in it. That's 10,000 docs/sec. If you configure a

Implications of setting catenateAll=1

2011-11-17 Thread Brendan Grainger
Hi, The default for catenateAll is 0 which we've been using on the WordDelimiterFilter. What would be the possibly negative implications of setting this to 1? So that: wi-fi-800 would produce the tokens: wi, fi, wifi, 800, wifi800 for example? Thanks
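
To make the token list in this question concrete, here is a toy Python model of the splitting and catenation described. It is an approximation written for illustration only, not Solr's WordDelimiterFilter: it assumes parts are split on any non-alphanumeric character, that word-part catenation is switched on alongside catenateAll, and the function name and flags are hypothetical.

```python
import re


def word_delimiter_tokens(text, catenate_words=True, catenate_all=True):
    """Toy approximation of splitting on delimiters and catenating parts."""
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", text) if p]
    tokens = list(parts)

    if catenate_words:
        # Join each run of adjacent alphabetic parts, e.g. wi + fi -> wifi.
        run = []
        for part in parts + [None]:
            if part is not None and part.isalpha():
                run.append(part)
            else:
                if len(run) > 1:
                    tokens.append("".join(run))
                run = []

    if catenate_all and len(parts) > 1:
        # Join every part regardless of type, e.g. wi + fi + 800 -> wifi800.
        tokens.append("".join(parts))

    # Drop duplicates while preserving first-seen order.
    return list(dict.fromkeys(tokens))


print(word_delimiter_tokens("wi-fi-800"))
print(word_delimiter_tokens("wi-fi-800", catenate_all=False))
```

In this toy model the only difference catenate_all makes for "wi-fi-800" is the extra term wifi800, which is one way to think about the cost side of the question: more terms per multi-part token.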

Re: ISO8601 Date format

2011-11-17 Thread Chris Hostetter
: My question is: is a three digit year a valid ISO-8601 date format for : the response or is this a bug? Because other languages (f.e. python) are : throwing an exception with a three digit year?! There are some known bugs with esoteric years, but i think the one that's burning you here has bee
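
For clients that choke on the three-digit year in the meantime, a client-side workaround is possible. The sketch below is an assumption-laden illustration, not a statement about how Solr formats dates: it assumes the response value has the shape shown in the question and left-pads a short leading year to four digits before parsing. The helper name `pad_year` is made up.

```python
import re
from datetime import datetime


def pad_year(timestamp):
    """Left-pad a 1-3 digit leading year with zeros to four digits."""
    match = re.match(r"^(\d{1,3})(-.*)$", timestamp)
    if match is None:
        # Year already has four or more digits, or the shape is unknown.
        return timestamp
    year, rest = match.groups()
    return year.zfill(4) + rest


# Value taken from the question above.
fixed = pad_year("999-12-31T23:00:00Z")
parsed = datetime.strptime(fixed, "%Y-%m-%dT%H:%M:%SZ")
print(fixed, parsed.year)
```

Values that already carry a four-digit year pass through unchanged, so the helper can be applied to every date in a response.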

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Jak Akdemir
Yonik, I updated my solrconfig time based only as follows. 30 1000 And changed my soft commit script to the first case. while [ 1 ]; do echo "Soft commit applied!" wget -O /dev/null ' http://localhost:8080/solr-jak/dataimport?command=delta-import&commit=fa

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Erick Erickson
Hmmm. It is suspicious that your index files change every second. If you change your cron task to update every 10 seconds, do the index files change every 10 seconds? Regarding your question about "After a server restart last query results reserved. (In NRT they would disappear, right?)" not necess

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Jak Akdemir
1- There is an improvement on the issue. I added a 10-second time interval to the delta query in data-config.xml, which will cover records that were already indexed. "revision_time > DATE_SUB('${dataimporter.last_index_time}', INTERVAL 10 SECOND);" In this case 1369 new records inserted with 7 records per sec

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Yonik Seeley
On Thu, Nov 17, 2011 at 1:34 PM, Erick Erickson wrote: > Hmmm. It is suspicious that your index files change every > second. Why is this suspicious? A soft commit still writes out some files currently... it just doesn't fsync them. -Yonik http://www.lucidimagination.com

Boosting is slow

2011-11-17 Thread Brian Lamb
Hi all, I have about 20 million records in my solr index. I'm running into a problem now where doing a boost drastically slows down my search application. A typical query for me looks something like: http://localhost:8983/solr/mycore/search/?q=test {!boost b=product(sum(log(sum(myfield,1)),1),rec

Re: Boosting is slow

2011-11-17 Thread Brian Lamb
Sorry, the query is actually: http://localhost:8983/solr/mycore/search/?q=test{!boost b=product(sum(log(sum(myfield,1)),1),recip(ms(NOW,mydate_field),3.16e-11,1,8))}&start=&sort=score+desc,mydate_field+desc&wt=xslt&tr=mysite.xsl On Thu, Nov 17, 2011 at 2:59 PM, Brian Lamb wrote: > Hi all, > > I
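
To see what the boost expression in this query computes for each matching document, the arithmetic can be mirrored in a few lines of Python. This is a sketch for illustration, not Solr code, and it says nothing about why the query is slow. It relies on `recip(x,m,a,b)` being `a / (m*x + b)` and on Solr's `log` being base 10; the function names and sample values are made up.

```python
import math

MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000  # about 3.156e10


def recip(x, m, a, b):
    """Solr-style recip: a / (m * x + b)."""
    return a / (m * x + b)


def boost(myfield, age_ms):
    """Mirror product(sum(log(sum(myfield,1)),1),
    recip(ms(NOW,mydate_field),3.16e-11,1,8))."""
    popularity = math.log10(myfield + 1) + 1  # log(sum(myfield,1)) + 1
    recency = recip(age_ms, 3.16e-11, 1, 8)   # decays as the document ages
    return popularity * recency


print(boost(9, 0))            # brand-new document
print(boost(9, MS_PER_YEAR))  # one-year-old document
```

The constant 3.16e-11 is roughly one divided by the number of milliseconds in a year, so the recency factor falls from 1/8 for a new document to about 1/9 after one year.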

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Jak Akdemir
Yonik, Is it ok to see soft committed records after server restart, too? If it is, there is no problem left at all. I added changing files and 1 sec of log at the end of the e-mail. One significant line says softCommit=true, so Solr recognizes our softCommit request. INFO: start commit(optimize=fa

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Yonik Seeley
On Thu, Nov 17, 2011 at 3:56 PM, Jak Akdemir wrote: > Is it ok to see soft committed records after server restart, too? Yes... we currently have Jetty configured to call some cleanups on exit (such as closing the index writer). -Yonik http://www.lucidimagination.com

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Jak Akdemir
This is great! I guess, there is nothing left to worry about for a while. Erick & Yonik, thank you again for your great responses. Bests, Jak On Thu, Nov 17, 2011 at 4:01 PM, Yonik Seeley wrote: > On Thu, Nov 17, 2011 at 3:56 PM, Jak Akdemir wrote: > > Is it ok to see soft committed records af

Re: Multiple solr webapps

2011-11-17 Thread Chris Hostetter
: According to solr wiki, an instruction to use single war file and : multiple context files (solr[1-2].xml). ... : I wonder why following structure is not enough. I think this is : the simplest way (disk space is a bit more necessary, of course): ...there's nothing stopping you from actu

Re: two word phrase search using dismax

2011-11-17 Thread Chris Hostetter
: After putting the same score for title and content in qf field, docs : with both words in content moved to fifth place. The doc in the first, : third and fourth places still have only one of the words in content and : title. The doc in the second place has one of the words in title and : bot

Re: FunctionQuery score=0

2011-11-17 Thread Chris Hostetter
: I am using a function query that based on the query of the user gives a : score for the results I am presenting. please be specific -- it's not at all clear what the structure of your query is, and the details matter. : Some of the results are receiving score=0 in my function and I would like

Re: Solr Master High Availability

2011-11-17 Thread Erick Erickson
Look at the repeater setup on the replication page, and instead of "repeater", think "backup master". But you don't really need to even do this. You can simply provision yourself an extra slave. Now, if your master goes south, you can reconfigure any slave as the new master by just putting the confi

Re: What is the best approach to do reindexing on the fly?

2011-11-17 Thread Erick Erickson
Hmmm, the master/slave setup takes about a day to get completely running assuming that you don't have any experience to start with, so you may be able to fit that in your schedule. Otherwise, you won't be able to avoid the memory and CPU spikes. But there's another option. It's actually quite easy

Re: ExtractingRequestHandler HTTP GET Problem

2011-11-17 Thread Chris Hostetter
: indexed file. The CommonsHttpSolrServer sends the parameters as a HTTP : GET request. Because of that I'll get a "socket write error". If I : change the CommonsHttpSolrServer to send the parameters as HTTP POST : sending will work, but the ExtractingRequestHandler will not recognize : the parame

Re: Migrating from Hibernate Search to Solr

2011-11-17 Thread Ari King
> So no Hibernate/Solr glue out there already? It'd be nice if you could use Hibernate as you do, but instead of working with the Lucene API directly it would use SolrJ. If this type of glue doesn't already exist, then that'd be the first step I think. > > Otherwise, you could us

[ANNOUNCEMENT] Second Edition of the First Book on Solr

2011-11-17 Thread Smiley, David W.
Fellow Solr users, I am proud to announce that the book "Apache Solr 3 Enterprise Search Server" is officially published!  This is the second edition of the first book on Solr by me, David Smiley, and my co-author Eric Pugh.  You can find full details about the book, download a free chapter, an