Re: MoreLikeThis throwing NPE

2007-09-10 Thread George L
Looks like the query field has to be stored for MLT. It was failing when I had both the query field and the similarity fields unstored before. MLT is working fine with this configuration: query_field - indexed and stored; similarity_field - indexed, unstored, and term vectors stored. But why should the que
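
A minimal schema.xml sketch of the working configuration described above (the field names come from the message; the "text" field type is a placeholder assumption):

    <!-- MLT query source field: must be stored so MLT can read its content -->
    <field name="query_field" type="text" indexed="true" stored="true"/>
    <!-- similarity field: term vectors let MLT get terms without stored content -->
    <field name="similarity_field" type="text" indexed="true" stored="false" termVectors="true"/>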

Solr commit takes too long

2007-09-10 Thread Marius Hanganu
Hi, We're having a problem when committing to SOLR. Our application commits right after each update - we need the data to be available instantaneously. The index's size is about 166M, and Solr has 1024M on a dual quad. The update takes a few milliseconds, but the commit takes about 1 minute. Coul
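
One common way to avoid paying a full commit after every single add (a sketch only, and not necessarily the right fix for this installation) is to let Solr commit on a threshold via autoCommit in solrconfig.xml; maxTime may not be available in older releases:

    <updateHandler class="solr.DirectUpdateHandler2">
      <autoCommit>
        <maxDocs>1000</maxDocs>   <!-- example threshold; tune to the update rate -->
        <maxTime>60000</maxTime>  <!-- milliseconds; omit if unsupported in this version -->
      </autoCommit>
    </updateHandler>

The trade-off is that documents only become searchable once the auto-commit fires, so truly instantaneous visibility per update still requires an explicit commit.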

Re: DirectSolrConnection, write.lock and Too Many Open Files

2007-09-10 Thread Brian Whitman
On Sep 10, 2007, at 1:33 AM, Adrian Sutton wrote: After a while we start getting exceptions thrown because of a timeout in acquiring write.lock. It's quite possible that this occurs whenever two updates are attempted at the same time - is DirectSolrConnection intended to be thread safe?

Re: caching query result

2007-09-10 Thread Jae Joo
Here is the response XML faceted by multiple fields including state. 0 1782 -1 10 0 score desc true 1 duns_number,company_name,phys_state, phys_city, score phys_country:"United States" 2.2 sales_range total_emp_range company_type phys_state sic1 on On 9/6/07, Y
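
For reference, a request producing a response like this would look roughly as follows (field names are taken from the message; the host, port, and remaining parameter values are illustrative, and spaces and quotes would be URL-encoded in an actual request):

    http://localhost:8983/solr/select?q=phys_country:"United States"
      &fl=duns_number,company_name,phys_state,phys_city,score
      &rows=10&sort=score desc
      &facet=true&facet.field=phys_state&facet.field=sales_range
      &facet.field=total_emp_range&facet.field=company_type&facet.field=sic1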

RE: Solr and KStem

2007-09-10 Thread Wagner,Harry
Yes, I don't think the licensing will be a problem as KStem already includes a wrapper for Lucene. Cheers! harry -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Friday, September 07, 2007 4:40 PM To: solr-user@lucene.apache.org Subject: Re: Solr and KStem Look

quirks with sorting

2007-09-10 Thread David Whalen
Hi All. I'm seeing a weird problem with sorting that I can't figure out. I have a query that uses two fields -- a "source" column and a date column. I search on the source and I sort by the date descending. What I'm seeing is that depending on the value in the source, the date sort works in rev

Re: quirks with sorting

2007-09-10 Thread Yonik Seeley
On 9/10/07, David Whalen <[EMAIL PROTECTED]> wrote: > I'm seeing a weird problem with sorting that I can't figure out. > > I have a query that uses two fields -- a "source" column and a > date column. I search on the source and I sort by the date > descending. > > What I'm seeing is that depending

RE: quirks with sorting

2007-09-10 Thread David Whalen
You know, I must have looked at that date 10 times and I never noticed the year. Sorry everyone! > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf > Of Yonik Seeley > Sent: Monday, September 10, 2007 11:23 AM > To: solr-user@lucene.apache.org > Su

Re: Distribution Information?

2007-09-10 Thread Bill Au
I guess your solr home isn't configured correctly. FYI, you can set master_status_dir to use a full path name (i.e., /opt/solr/logs/clients in your case). Bill On 9/7/07, Matthew Runo <[EMAIL PROTECTED]> wrote: > > OK. I made the change, but it seemed not to pick up the files. > > When I changed dist
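
For reference, that setting lives in the replication scripts' configuration (typically conf/scripts.conf under the Solr home; the exact location may vary by release). A sketch with the absolute path suggested above, other entries omitted:

    # conf/scripts.conf (excerpt)
    master_status_dir=/opt/solr/logs/clients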

My Solr index keeps growing

2007-09-10 Thread Robin Bonin
I had created a new index over the weekend, and the final size was a few hundred megs. I just checked and now the index folder is up to 1.7 Gig. Is this due to results being cached? Can I set a limit to how large the index will grow? Is there anything else that could be affecting this file size? T

Re: My Solr index keeps growing

2007-09-10 Thread Yonik Seeley
On 9/10/07, Robin Bonin <[EMAIL PROTECTED]> wrote: > I had created a new index over the weekend, and the final size was a > few hundred megs. > I just checked and now the index folder is up to 1.7 Gig. Is this due > to results being cached? can I set a limit to how large the index will > grow? is t

Re: My Solr index keeps growing

2007-09-10 Thread Robin Bonin
Yes I am talking about the files in the solr/data/index folder. So that folder should stay the same size unless documents are added, and I guess commit and optimize are run. I'll have to watch my app and make sure it is not adding some extra stuff to the index I am not aware of. On 9/10/07, Yonik
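
If the extra space turns out to be old segment files left behind by adds and deletes, an explicit optimize merges the index back down to a minimal set of files. A sketch, assuming the example Jetty setup on the default port:

    curl 'http://localhost:8983/solr/update' \
         -H 'Content-type:text/xml' --data-binary '<optimize/>'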

Re: How to patch

2007-09-10 Thread Mike Klaas
On 9-Sep-07, at 8:57 PM, James liu wrote: I want to try this patch: https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel and I downloaded the solr1.2 release. patch < SOLR-269*.patch (when in '/tmp/apache-solr-1.2.0/src/test/org/apache/solr/u
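
For reference, the usual way to apply a JIRA patch is to run the patch utility from the root of the source tree rather than from a subdirectory; the exact file name below is illustrative:

    cd /tmp/apache-solr-1.2.0
    patch -p0 < SOLR-269.patch
    ant dist    # rebuild the war after patching

Note that patches attached to JIRA are generally made against trunk, so they may not apply cleanly to the 1.2 release sources.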

RE: adding without overriding dups - DirectUpdateHandler2.java does not implement?

2007-09-10 Thread Lance Norskog
I was unclear. Our use case is that for some data sources we submit the same thing over and over. Overwriting deletes the first one and we end up with long commit times, and also we lose the earliest known date for the document. We would like to have the second update attempt dropped. So we would

Re: Solr and KStem

2007-09-10 Thread Mike Klaas
Hi Harry, Thanks for your contribution! Unfortunately, we can't include it in Solr unless the necessary legal hurdles are cleared. An issue needs to be opened on http://issues.apache.org/jira/browse/SOLR and you have to attach the file and check the "Grant License to ASF" button. It is

Re: New user question: How to show all stored fields in a result

2007-09-10 Thread melkink
Well, I figured out my problem. User error of course ;-) I was processing documents in two separate steps. The first step added the id and the doctext fields. The second step did an update to add the metadata. I didn't realize that an update command replaced the whole document rather than jus
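
In other words, each add with the same unique key replaces the previous document, so the second step has to resubmit every field, not just the new ones. A sketch (the field values and the metadata field name are placeholders):

    <add>
      <doc>
        <field name="id">doc-1</field>
        <field name="doctext">full text of the document, resubmitted</field>
        <field name="author">example metadata value</field> <!-- hypothetical metadata field -->
      </doc>
    </add>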

Re: New user question: How to show all stored fields in a result

2007-09-10 Thread Mike Klaas
On 10-Sep-07, at 11:54 AM, melkink wrote: The other change I made (which may or may not have contributed to the solution) was to remove all line breaks from the text being submitted to the doctext field. The line breaks were causing solr to interpret the text as having multiple values and

Re: Solr and KStem

2007-09-10 Thread Yonik Seeley
Some other notes: I just read the license... it's nice and short, and appears to be ASL-compatible to me. We could either include the source in Solr and build it, or add it as a pre-compiled jar into lib. The FilterFactory should probably have its package changed to org.apache.solr.analysis (defin

RE: Solr and KStem

2007-09-10 Thread Wagner,Harry
Hi Yonik and Mike, No problem regarding my employer. I've checked and they are happy to contribute it. I'm not sure what to do about the KStem code though. It was originally written by Bob Krovetz and then modified for Lucene by Sergio Guzman-Lara (both from UMASS Amherst). I modified the Guzma

Re: DirectSolrConnection, write.lock and Too Many Open Files

2007-09-10 Thread Adrian Sutton
We use DirectSolrConnection via JNI in a couple of client apps that sometimes have 100s of thousands of new docs as fast as Solr will have them. It would crash relentlessly if I didn't force all calls to update or query to be on the same thread using objc's @synchronized and a message queue
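
A rough Java equivalent of that workaround, funneling every update and query through a single lock so two calls never reach the connection at once (only DirectSolrConnection.request is a real API here; the wrapper class itself is just an illustration):

    import org.apache.solr.servlet.DirectSolrConnection;

    /** Serializes all access to one DirectSolrConnection instance. */
    public class SerializedSolr {
        private final DirectSolrConnection conn;

        public SerializedSolr(DirectSolrConnection conn) {
            this.conn = conn;
        }

        /** path is e.g. "/update" or "/select?q=...", body may be null for queries. */
        public synchronized String request(String path, String body) throws Exception {
            return conn.request(path, body);  // one caller at a time
        }
    }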

Re: DirectSolrConnection, write.lock and Too Many Open Files

2007-09-10 Thread Yonik Seeley
On 9/10/07, Adrian Sutton <[EMAIL PROTECTED]> wrote: > Can Solr as a web app handle multiple updates at > once or does it synchronize to avoid it? Yep... things aren't synchronized at the top level and are designed to be thread-safe. -Yonik

Re: DirectSolrConnection, write.lock and Too Many Open Files

2007-09-10 Thread Mike Klaas
On 10-Sep-07, at 1:50 PM, Adrian Sutton wrote: We use DirectSolrConnection via JNI in a couple of client apps that sometimes have 100s of thousands of new docs as fast as Solr will have them. It would crash relentlessly if I didn't force all calls to update or query to be on the same thread

Re: DirectSolrConnection, write.lock and Too Many Open Files

2007-09-10 Thread Brian Whitman
On Sep 10, 2007, at 5:00 PM, Mike Klaas wrote: On 10-Sep-07, at 1:50 PM, Adrian Sutton wrote: We use DirectSolrConnection via JNI in a couple of client apps that sometimes have 100s of thousands of new docs as fast as Solr will have them. It would crash relentlessly if I didn't force all

Re: DirectSolrConnection, write.lock and Too Many Open Files

2007-09-10 Thread Ryan McKinley
The other problem is that after some time we get a "Too Many Open Files" error when autocommit fires. Have you checked your ulimit settings? http://wiki.apache.org/lucene-java/LuceneFAQ#head-48921635adf2c968f7936dc07d51dfb40d638b82 ulimit -n . As Mike mentioned, you may also want to use 's
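
For reference, the limit that matters here is the per-process open-file limit, not what bare ulimit reports (a distinction that comes up later in this thread). A sketch of checking and raising it in the shell that launches Solr; 8192 is just an example value:

    ulimit -n          # current per-process file descriptor limit (often 1024)
    ulimit -n 8192     # raise it for this shell, then start Solr from the same shell
    java -jar start.jar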

Re: DirectSolrConnection, write.lock and Too Many Open Files

2007-09-10 Thread Adrian Sutton
On 11/09/2007, at 7:21 AM, Ryan McKinley wrote: The other problem is that after some time we get a "Too Many Open Files" error when autocommit fires. Have you checked your ulimit settings? http://wiki.apache.org/lucene-java/LuceneFAQ#head-48921635adf2c968f7936dc07d51dfb40d638b82 ulimit -

Removing lengthNorm from the calculation

2007-09-10 Thread Kyle Banerjee
I know I'm missing something really obvious, but I'm spinning my wheels figuring out how to eliminate lengthNorm from the calculations. The specific problem I'm trying to solve is that naive queries are resulting in crummy short records near the top of the list. The reality is that the longer reco

Re: Removing lengthNorm from the calculation

2007-09-10 Thread Yonik Seeley
If you aren't using index-time document boosting, or field boosting for that field specifically, then set omitNorms="true" for that field in the schema, shut down solr, completely remove the index, and then re-index. The norms for each field consist of the index-time boost multiplied by the length
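
A sketch of the schema change being described (the field name and type are placeholders); after changing it, the index has to be rebuilt from scratch as noted above:

    <!-- omitNorms drops length normalization and index-time boosts for this field -->
    <field name="body" type="text" indexed="true" stored="true" omitNorms="true"/>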

Re: DirectSolrConnection, write.lock and Too Many Open Files

2007-09-10 Thread Ryan McKinley
Adrian Sutton wrote: On 11/09/2007, at 7:21 AM, Ryan McKinley wrote: The other problem is that after some time we get a "Too Many Open Files" error when autocommit fires. Have you checked your ulimit settings? http://wiki.apache.org/lucene-java/LuceneFAQ#head-48921635adf2c968f7936dc07d51dfb4

Re: DirectSolrConnection, write.lock and Too Many Open Files

2007-09-10 Thread Adrian Sutton
On 11/09/2007, at 8:46 AM, Ryan McKinley wrote: lucene opens a lot of files. It can easily get beyond 1024. (I think the default). I'm no expert on how the file handling works, but I think more files are open if you are searching and writing at the same time. If you can't increase the li

Re: Removing lengthNorm from the calculation

2007-09-10 Thread Mike Klaas
On 10-Sep-07, at 3:31 PM, Kyle Banerjee wrote: I know I'm missing something really obvious, but I'm spinning my wheels figuring out how to eliminate lengthNorm from the calculations. The specific problem I'm trying to solve is that naive queries are resulting in crummy short records near the to

Re: DirectSolrConnection, write.lock and Too Many Open Files

2007-09-10 Thread Ryan McKinley
I've done a bit of poking on the server and ulimit doesn't seem to be the problem: e2wiki:~$ ulimit unlimited e2wiki:~$ cat /proc/sys/fs/file-max 170355 try: ulimit -n ulimit on its own is something else. On my machine I get: [EMAIL PROTECTED]:~$ ulimit unlimited [EMAIL PROTECTED]:~$ cat /

Re: New user question: How to show all stored fields in a result

2007-09-10 Thread Erik Hatcher
On Sep 10, 2007, at 3:07 PM, Mike Klaas wrote: On 10-Sep-07, at 11:54 AM, melkink wrote: The other change I made (which may or may not have contributed to the solution) was to remove all line breaks from the text being submitted to the doctext field. The line breaks were causing solr to in

Re: DirectSolrConnection, write.lock and Too Many Open Files

2007-09-10 Thread Adrian Sutton
On 11/09/2007, at 9:48 AM, Ryan McKinley wrote: try: ulimit -n ulimit on its own is something else. On my machine I get: [EMAIL PROTECTED]:~$ ulimit unlimited [EMAIL PROTECTED]:~$ cat /proc/sys/fs/file-max 364770 [EMAIL PROTECTED]:~$ ulimit -n 1024 I have to run: ulimit -n 2 to get luce

Re: Solr and KStem

2007-09-10 Thread Bill Fowler
Hello, I would like to test this and have a few questions (please excuse what may seem naive questions). I would like to verify that this is purely a configuration feature -- since the schema.xml defines the analysis/tokenizer chain, no other changes are required. Also, the source seems to say th
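
If the filter does land as a drop-in factory, the configuration-only change would look roughly like this in schema.xml (the KStem factory class name is an assumption based on the packaging discussion above):

    <fieldType name="text_kstem" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="org.apache.solr.analysis.KStemFilterFactory"/> <!-- hypothetical class name -->
      </analyzer>
    </fieldType>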