Function queries question

2009-11-20 Thread Oliver Beattie
Hi all, I'm a relative newcomer to Solr, and I'm trying to use it in a project of mine. I need to do a function query (I believe) to filter the results so they are within a certain distance of a point. For this, I understand I should use something like sqedist or hsin, and from the documentation
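
For readers unfamiliar with the two functions named in this post: hsin refers to a haversine (great-circle) distance and sqedist to a squared Euclidean distance. The following is a minimal Python sketch of that math only, not Solr's implementation; the function names, the 6371 km Earth radius default, and the within_radius helper are assumptions made for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))


def squared_euclidean(p, q):
    """Squared straight-line distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def within_radius(points, center, max_km):
    """Keep only the (lat, lon) points no farther than max_km from center."""
    return [p for p in points
            if haversine_km(p[0], p[1], center[0], center[1]) <= max_km]
```

The squared Euclidean form avoids a square root, so it is cheaper when only a threshold comparison is needed, but it treats coordinates as flat and is only a rough proxy for distance on the globe.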

RE: schema-based Index-time field boosting

2009-11-20 Thread Ian Smith
Hi David, thanks for replying. The field boost attribute was put there by me back in the 1.3 days, when I somehow gained the mistaken impression that it was supposed to work! Of course, despite a lot of searching I haven't been able to find anything to back up my position ;) Unfortunately our

Re: Solr - Load Increasing.

2009-11-20 Thread kalidoss
Thank you all. I have increased the heap size from 1 GB to 1.5 GB. The command is now java -Xms512M -Xmx1536M -jar start.jar. My CPU load is normal and Solr is not restarting frequently. I also increased autocommit maxDocs to 200. For the last 24 hours there have been no load or restart issues. Thanks, guys.

field type definition

2009-11-20 Thread revas
Hello, If I define a field like this in the schema, is this correct? <fieldType name="text_match_phrase" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter

Re: Function queries question

2009-11-20 Thread Grant Ingersoll
On Nov 20, 2009, at 3:15 AM, Oliver Beattie wrote: Hi all, I'm a relative newcomer to Solr, and I'm trying to use it in a project of mine. I need to do a function query (I believe) to filter the results so they are within a certain distance of a point. For this, I understand I should use

Re: field type definition

2009-11-20 Thread Grant Ingersoll
On Nov 20, 2009, at 7:22 AM, revas wrote: Hello, If I define a field like this in the schema, is this correct? <fieldType name="text_match_phrase" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer

Re: Filtering query results

2009-11-20 Thread Grant Ingersoll
On Nov 19, 2009, at 4:59 PM, aseem cheema wrote: Hey Guys, I need to filter out some results based on who is performing the search. In other words, if a document is not accessible to a user performing search, I don't want it to be in the result set. What is the best/easiest way to do this

Solr index on multiple drives.

2009-11-20 Thread swatkatz
Hi, Can I have one instance of Solr write the index and data to multiple drives? E.g., can I configure Solr to do something like <dataDir>c:\data</dataDir> <dataDir>d:\data</dataDir> <dataDir>e:\data</dataDir>? Or is the suggested way to use multiple Solr cores and have the application shard the index

RE: Multi word synonym problem

2009-11-20 Thread Nair, Manas
Hi, I tried using the recommended approach but to no benefit. The multiword synonyms are still not appearing in the result. My schema.xml has the following fieldType: <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer

Re: Upgrade to solr 1.4

2009-11-20 Thread kalidoss
I also want to upgrade from v1.3 to 1.4. I replaced the 1.4 index directory with the 1.3 one and made the associated schema changes. It's throwing a lot of exceptions, such as datatype mismatches with Integer, String, Date, etc. Even the results come back with errors, for example: <str

Solr Cell text extraction

2009-11-20 Thread Ian Smith
Hi Guys, I am trying to use Solr Cell to extract body content from documents, and also to pass along some literal field values. Trouble is, some of the literal fields contain spaces, colons etc. which cause a bad request exception in the server. However, if I URL encode these fields the

Re: Upgrade to solr 1.4

2009-11-20 Thread kalidoss
In version 1.3 the EventDate field type is date, and in 1.4 it is also date, but we are getting the following error: <str name="EventDate">ERROR:SCHEMA-INDEX-MISMATCH,stringValue=2008-05-16T07:19:28</str> -kalidoss.m, kalidoss wrote: Even i want to upgrade from v1.3 to 1.4 I did 1.3 index directory

creating Lucene document from an external XML file.

2009-11-20 Thread Phanindra Reva
Hello All, I am a newbie using Solr and Lucene. In my task, I have to create org.apache.lucene.document.Document objects from external valid Solr XML files. To be brief, depending on the names of the fields, I need to modify the corresponding values, which is specific to our project. So I

RE: Solr Cell text extraction - non-issue

2009-11-20 Thread Ian Smith
Sorry guys, the bad request seemed to be caused elsewhere, no need to URL encode now. Ian. -Original Message- From: Ian Smith [mailto:ian.sm...@gossinteractive.com] Sent: 20 November 2009 15:26 To: solr-user@lucene.apache.org Subject: Solr Cell text extraction Hi Guys, I am trying to

Re: How to use DataImportHandler with ExtractingRequestHandler?

2009-11-20 Thread javaxmlsoapdev
Did you extend DIH to do this work? Can you share code samples? I have a similar requirement where I need to index database records, and each record has a column with a document path, so I need to create another index for documents (we allow users to search both indexes separately) in parallel with reading

RE: Index documents with Solr

2009-11-20 Thread javaxmlsoapdev
Glock, did you get this approach to work? let me know. Thanks, Glock, Thomas wrote: I have a similar situation but not expecting any easy setup. Currently the tables contain both a url to the file and quite a bit of additional metadata about the file. I'm planning one initial load to

RE: Filtering query results

2009-11-20 Thread Glock, Thomas
Hi Aseem - I had a similar challenge. The solution that works for my case was to add role as a repeating string value in the solr schema. Each piece of content contains 1 or more roles and these values are supplied to solr for indexing. Users also have one or more roles (which correspond
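
The role-based approach described here can be sketched as a filter query built from the searching user's roles. This is a minimal Python sketch under stated assumptions: the field name role follows the post, while the helper names and the choice to OR the roles together are illustrative and not taken from the thread.

```python
from urllib.parse import urlencode


def role_filter(roles):
    """Build a filter-query clause matching documents that carry any given role."""
    if not roles:
        raise ValueError("at least one role is required")
    # Quote each role and escape backslashes and double quotes inside it.
    quoted = ['"%s"' % r.replace("\\", "\\\\").replace('"', '\\"') for r in roles]
    return "role:(" + " OR ".join(quoted) + ")"


def build_params(user_query, roles):
    """URL-encode the user's query plus the role restriction as an fq parameter."""
    return urlencode({"q": user_query, "fq": role_filter(roles)})
```

Keeping the restriction in fq rather than appending it to q leaves the relevance scoring of the user's query untouched.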

Re: Upgrade to solr 1.4

2009-11-20 Thread Yonik Seeley
On Fri, Nov 20, 2009 at 10:26 AM, kalidoss kalidoss.muthuramalin...@sifycorp.com wrote: In version 1.3 EventDate field type is date, In 1.4 also its date But we are getting the following error. Use the schema you had with 1.3 and it should work. The example schemas are not backward compatible

Default sort order for filter query

2009-11-20 Thread Mike
When I do a search using q=*:* and then narrow down the result set using a filter query, are there rules that are used for the sort order in the result set? In my results I have a name field that appears to be sorted descending in lexicographical order. For example: <doc><str

Re: Default sort order for filter query

2009-11-20 Thread Mike
Mike wrote: When I do a search using q=*:* and then narrow down the result set using a filter query, are there rules that are used for the sort order in the result set? In my results I have a name field that appears to be sorted descending in lexicographical order. For example: <doc><str

Re: Default sort order for filter query

2009-11-20 Thread Yonik Seeley
On Fri, Nov 20, 2009 at 11:15 AM, Mike mpiluson...@comcast.net wrote: Sorry for the noise - I think I have just answered my own question. The order in which docs are indexed determines the result sort order unless overridden via sort query parameters :) Correct. The internal Lucene document id

comparing index-time boost and sort in the case of a date field

2009-11-20 Thread Anil Cherian
Hi, I have a requirement to get results ordered by the latest date of a field called approval_dt, i.e. results with the latest approval date should appear first in the Solr results XML. Sorting desc on approval_dt gave me this. Can index-time boost be of use here to improve performance? Could

Re: Default sort order for filter query

2009-11-20 Thread Mike
Yonik Seeley wrote: On Fri, Nov 20, 2009 at 11:15 AM, Mike mpiluson...@comcast.net wrote: Sorry for the noise - I think I have just answered my own question. The order in which docs are indexed determines the result sort order unless overridden via sort query parameters :) Correct.

Re: Default sort order for filter query

2009-11-20 Thread Yonik Seeley
On Fri, Nov 20, 2009 at 11:28 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Fri, Nov 20, 2009 at 11:15 AM, Mike mpiluson...@comcast.net wrote: Sorry for the noise - I think I have just answered my own question. The order in which docs are indexed determines the result sort order unless

Re: Solr 1.3 query and index perf tank during optimize

2009-11-20 Thread Michael
Hoss, Using Solr 1.4, I see constant index growth until an optimize. I commit (hundreds of updates) every 5 minutes and have a mergefactor of 10, but every 50 minutes I don't see the index collapse down to its original size -- it's slightly larger. Over the course of a week, the index grew from

Re: Solr 1.3 query and index perf tank during optimize

2009-11-20 Thread Yonik Seeley
On Fri, Nov 20, 2009 at 12:24 PM, Michael solrco...@gmail.com wrote: So -- I thought I understood you to mean that if I frequently merge, it's basically the same as an optimize, and cruft will get purged.  Am I misunderstanding you? That only applies to the segments involved in the merge. The

Re: Filtering query results

2009-11-20 Thread aseem cheema
Thank you much for your responses guys. I do not have ACL. I need to make a web service call to find out if a user has access to a document. I was hoping to get search results, call the web service with the IDs from the search results telling me what IDs the user has access to, and then filter
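
The post-filtering idea described in this message might be sketched as follows. This is a hypothetical Python outline: allowed_ids_for stands in for the poster's web service, which the thread does not describe, and the id key on each result is an assumption.

```python
def filter_accessible(results, allowed_ids_for):
    """Drop search hits the user may not see.

    results: list of dicts, each with an "id" key (assumed shape).
    allowed_ids_for: callable taking a list of ids and returning the
        subset the user can access (hypothetical stand-in for the
        access-check web service).
    """
    ids = [doc["id"] for doc in results]
    # One batched call instead of one call per document.
    allowed = set(allowed_ids_for(ids))
    # Preserve the original ranking order of the surviving hits.
    return [doc for doc in results if doc["id"] in allowed]
```

One consequence of filtering after the search is that hit counts and page sizes reported by the search no longer match what the user sees, so a page may come back with fewer results than requested.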

Re: Solr 1.3 query and index perf tank during optimize

2009-11-20 Thread Michael
On Fri, Nov 20, 2009 at 12:35 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Fri, Nov 20, 2009 at 12:24 PM, Michael solrco...@gmail.com wrote: So -- I thought I understood you to mean that if I frequently merge, it's basically the same as an optimize, and cruft will get purged.  Am I

Re: Solr 1.3 query and index perf tank during optimize

2009-11-20 Thread Yonik Seeley
On Fri, Nov 20, 2009 at 2:32 PM, Michael solrco...@gmail.com wrote: On Fri, Nov 20, 2009 at 12:35 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Fri, Nov 20, 2009 at 12:24 PM, Michael solrco...@gmail.com wrote: So -- I thought I understood you to mean that if I frequently merge, it's

Huge load and long response times during search

2009-11-20 Thread Tomasz Kępski
Hi, I'm using Solr (1.4) to search among about 3,500,000 documents. After the server kernel was updated to 64-bit, the system started to suffer. Our server has 8 GB of RAM and two Intel Core 2 Duo processors. We used to have average loads of around 2-2.5. It was not as good as it should be, but as long HTTP

Re: creating Lucene document from an external XML file.

2009-11-20 Thread Otis Gospodnetic
Hi, If I understand you correctly, you really want to be constructing SolrInputDocuments (not Lucene's Documents) and indexing those with SolrJ. I don't think there is anything in the API that can read in an XML file and convert it into a SolrInputDocument instance, but aren't there libraries
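
The parse-then-modify step discussed in this reply could look roughly like the sketch below. It is written in Python rather than with SolrJ, so it only illustrates the idea: it assumes the usual add/doc/field layout of a Solr update XML file, and the function name and the transforms mapping are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_add_xml(xml_text, transforms=None):
    """Read Solr update XML into a list of {field name: [values]} dicts.

    transforms: optional mapping of field name to a function applied to
        each value of that field (illustrative hook for per-field changes).
    """
    transforms = transforms or {}
    docs = []
    root = ET.fromstring(xml_text)
    for doc_el in root.iter("doc"):
        doc = {}
        for field_el in doc_el.findall("field"):
            name = field_el.get("name")
            value = field_el.text or ""
            if name in transforms:
                value = transforms[name](value)
            # Fields may repeat, so every field maps to a list of values.
            doc.setdefault(name, []).append(value)
        docs.append(doc)
    return docs
```

The resulting dictionaries would then be handed to whatever client does the indexing; in a Java application that would mean copying each name and value into a SolrInputDocument.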

Re: Solr index on multiple drives.

2009-11-20 Thread Otis Gospodnetic
Hi, No, dataDir is a single directory, so limited to a single partition on a single drive. But, you can always have disks in RAID, and then it could be spread over multiple drives. Yes, if you have multiple Solr cores and multiple drives, you could put them on different drives for performance

Re: Using DirectSolrConnection with Solrj

2009-11-20 Thread Lance Norskog
DirectSolrConnection is older and has not been changed in a year. SolrJ is the preferred way to code an app against Solr. SolrJ with the Embedded server will have the same performance characteristics as DirectSolrConnection. On Thu, Nov 19, 2009 at 5:55 AM, dipti khullar dipti.khul...@gmail.com

Re: Control DIH from PHP

2009-11-20 Thread Lance Norskog
Nice! I didn't notice that before. Very useful. 2009/11/19 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com: you can pass the uniqueId as a param and use it in a sql query http://wiki.apache.org/solr/DataImportHandler#Accessing_request_parameters. --Noble On Thu, Nov 19, 2009 at 3:53 PM,

Re: Problem with SolrJ driver for Solr 1.4

2009-11-20 Thread Lance Norskog
Yes, these are both bugs. SolrJ should do field lists right, and distributed search should work exactly the same as normal search. Please file these in the JIRA. On Thu, Nov 19, 2009 at 8:32 AM, Asaf work a...@dapper.net wrote: Hi, I'm using the SolrJ 1.4 client driver in a sharded Solr

Re: getting total index size last update date/time from query

2009-11-20 Thread Lance Norskog
solr/admin/stats.jsp gives a much larger XML dump and also includes these two data items. Note that Luke can walk the entire index data structures, so if you have a large index it's like playing with fire. On Thu, Nov 19, 2009 at 8:54 AM, Binkley, Peter peter.bink...@ualberta.ca wrote: The Luke

Re: index-time boost ... query

2009-11-20 Thread Lance Norskog
No, the reverse is true. Sorting is very very fast in Lucene. The first sort operation spends a lot of time making a data structure and then following sort calls use it. On Thu, Nov 19, 2009 at 1:52 PM, Anil Cherian cherian.anil2...@gmail.com wrote: Hi David, I just now tried a sorting on the
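
The behaviour described in this reply, where the first sort pays to build a data structure and later sorts reuse it, can be illustrated with a toy cache. This Python sketch is not Lucene's implementation; the class name, the builds counter, and the assumption that every document has a comparable value for the field are all illustrative.

```python
class FieldSortCache:
    """Toy per-field value cache: built on first sort, reused afterwards."""

    def __init__(self, docs):
        self._docs = docs      # list of dicts, indexed by internal doc id
        self._columns = {}     # field name -> list of values by doc id
        self.builds = 0        # how many times a column had to be built

    def _column(self, field):
        if field not in self._columns:
            # Expensive step: one pass over every document in the index.
            self._columns[field] = [doc[field] for doc in self._docs]
            self.builds += 1
        return self._columns[field]

    def sort(self, doc_ids, field, descending=False):
        """Order the matching doc ids by the cached values of one field."""
        column = self._column(field)
        return sorted(doc_ids, key=lambda i: column[i], reverse=descending)
```

For example, sorting on approval_dt twice against the same cache walks the documents only once; the second call is just an ordinary in-memory sort of the matching ids.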

Re: Solr 1.3 query and index perf tank during optimize

2009-11-20 Thread Lance Norskog
And, terms whose documents have been deleted are not purged. So, you can merge all you like and the index will not shrink back completely. Only an optimize will remove the orphan terms. This is important because the orphan terms affect relevance calculations. So you really want to purge them with

Index time boosts, payloads, and long query strings

2009-11-20 Thread Girish Redekar
Hi, I'm relatively new to Solr/Lucene, and am using Solr (and not Lucene directly) primarily because I can use it without writing Java code (the rest of my project is coded in Python). My application has the following requirements: (a) ability to search over multiple fields, each with different weight