Re: how to suppress result

2008-04-07 Thread Evgeniy Strokin
Subject: Re: how to suppress result Hi Evgeniy +) delete the documents if you really don't need them +) create a field "ignored" and build an appropriate query to exclude the documents where 'ignored' is true Cheers, Siegfried Goeschl Evgeniy Strokin wrote: >
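The second suggestion above (an "ignored" field plus an excluding query) could look roughly like the sketch below. It only builds the request parameters; the use of a negative filter query (`fq=-ignored:true`) is an assumption about how the exclusion might be expressed, not something stated in the thread, and the helper names are hypothetical.

```python
from urllib.parse import urlencode


def build_params(user_query, ignored_field="ignored"):
    """Return Solr request parameters that hide flagged documents.

    Assumption: documents to suppress carry ignored_field=true, and a
    purely negative filter query is accepted by the Solr version in use.
    """
    return [
        ("q", user_query),                  # the user's original query, unchanged
        ("fq", f"-{ignored_field}:true"),   # drop documents flagged as ignored
    ]


def build_query_string(user_query):
    """URL-encode the parameters for a select request."""
    return urlencode(build_params(user_query))
```

Documents that do not have the field at all are not matched by the negative clause, so they stay in the results.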

how to suppress result

2008-04-07 Thread Evgeniy Strokin
Hello, I have an odd problem. I use Solr for regular search, and it works fine for my task, but my client has a list of IDs in a separate flat file (possibly a huge number of IDs, up to 1M) and he wants to exclude those IDs from the search results. What is the right way to do this? Any t

Is number of stored fields affects query performance?

2008-03-31 Thread Evgeniy Strokin
I have two questions related to the subject: 1. If I have 100 fields in my document, all indexed, will my queries run slower if I store all 100 fields rather than just 10? 2. If I have 100 fields in my documents, all stored, will my queries run slower if I index all 100 fields rather than just 10? Thanks in

Re: Survey: How do you store your fields?

2008-03-21 Thread Evgeniy Strokin
We store all needed fields in Solr, but we have only 20 stored fields out of 100+ indexed. Our requirement is to show 20 fields after searching, and when clients are happy with the result (usually after several searches), we append all the others from the DB. Of course it takes a while, because our DB

Re: Does emty fields affect index size?

2008-03-20 Thread Evgeniy Strokin
should only be marginally bigger. -Yonik On Thu, Mar 20, 2008 at 3:20 PM, Evgeniy Strokin <[EMAIL PROTECTED]> wrote: > Hello, let's say I have 10 fields and usually some 5 of them are present in > each document. And the size of my index is 100Mb. > I want to change my schema and

Re: Does emty fields affect index size?

2008-03-20 Thread Evgeniy Strokin
fields affect index size? Make sure you omit norms for those fields if possible. If you do that, the index should only be marginally bigger. -Yonik On Thu, Mar 20, 2008 at 3:20 PM, Evgeniy Strokin <[EMAIL PROTECTED]> wrote: > Hello, let's say I have 10 fields and usually some 5 of them

Random search result

2008-03-04 Thread Evgeniy Strokin
I want to get a sample from my search results: not the first 10 but 10 random (really random, not pseudo-random) documents. For example, if I run a simple query like STATE:NJ, with no ordering by any field, and take the first 10 documents from my result set, will they be a random 10 or pseudo random, like fi

Re: Threads in Solr

2008-02-26 Thread Evgeniy Strokin
I'm running my tests on a server with 4 dual-core CPUs. I was expecting a good improvement from the multithreaded solution, but the speed is 10 times worse. Here is how I run those threads; I think I'm doing something wrong, please advise: -- .

Shared index base

2008-02-26 Thread Evgeniy Strokin
I know there have been discussions on this subject, but I want to ask again in case somebody can share more information. We are planning to have several separate servers for our search engine. One of them will be the index/search server, and all the others will be search-only. We want to use SAN (BTW: should w

Re: Threads in Solr

2008-02-25 Thread Evgeniy Strokin
Yes, I am computing the same DocSet. Could that be the problem? Is there any way to solve it? In general, in each thread I run the same query and add a different filter query. - Original Message From: Chris Hostetter <[EMAIL PROTECTED]> To: Solr User Sent: Monday, February 25, 2008 2:19:02 AM S

Threads in Solr

2008-02-20 Thread Evgeniy Strokin
Hello, I'm overriding the getFacetInfo(...) method of the standard request handler (BTW: thanks for making a separate method for faceting :-)) What I need is to run the original query several times with a filter query which I generate based on the result of the original query. Below is part of my code. I was thi

Filter Query

2008-02-12 Thread Evgeniy Strokin
Hello, let's say I have a query like this: NAME:Smith I need to restrict the result, so I do this: NAME:Smith AND AGE:30 I can also do this using the fq parameter: q=NAME:Smith&fq=AGE:30 The results of the second and third queries should be the same, right? But why should I use fq then? In which ca
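As a rough illustration of the two request forms compared above, the sketch below builds both query strings. The helper names are hypothetical. Both forms should match the same documents; the usual practical difference is that a filter query does not contribute to relevance scoring and can be cached and reused independently of the main query.

```python
from urllib.parse import urlencode


def combined_request(main_query, restriction):
    """Put the restriction into the main query with AND."""
    return urlencode({"q": f"{main_query} AND {restriction}"})


def filtered_request(main_query, restriction):
    """Keep the main query as-is and pass the restriction as fq."""
    return urlencode([("q", main_query), ("fq", restriction)])
```

If the same restriction (for example AGE:30) is applied to many different main queries, the fq form lets that restriction be reused as a separate unit.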

Re: 2D Facet

2008-02-12 Thread evgeniy . strokin
Chris, I'm very interested in implementing generic multidimensional faceting. I'm not an expert in Solr, but I'm very good with Java, so I need a little more direction if you don't mind. I promise to share my code, and if you are OK with it you are welcome to use it. So, let's say I have a p

Re: Big number of conditions of the search

2008-01-17 Thread evgeniy . strokin
and make sure the JVM has plenty of memory. But again, this is best done in RDBMS with some count(*) and GROUP BY selects. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Evgeniy Strokin <[EMAIL PROTECTED]> To: Solr User Sent: Thursda

Re: Big number of conditions of the search

2008-01-17 Thread evgeniy . strokin
to manually increase the max number of clauses allowed (in one of the configs) and make sure the JVM has plenty of memory. But again, this is best done in RDBMS with some count(*) and GROUP BY selects. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Messa
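A possible alternative to raising the clause limit is to split the ID list into batches that each stay under the limit and run one query per batch. This is only a sketch, not what the reply above recommends: the field name `ID`, the limit of 1024 and the helper names are all assumptions for illustration.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def id_queries(ids, field="ID", max_clauses=1024):
    """Build one `field:(a OR b OR ...)` query per batch of IDs.

    Each batch holds at most `max_clauses` IDs, so no single query
    exceeds the assumed boolean clause limit.
    """
    return [
        f"{field}:(" + " OR ".join(str(i) for i in batch) + ")"
        for batch in chunked(ids, max_clauses)
    ]
```

The per-batch counts would then have to be summed on the client side, which is one reason the reply suggests doing this kind of statistic in an RDBMS instead.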

Re: Big number of conditions of the search

2008-01-16 Thread evgeniy . strokin
ne in RDBMS with some count(*) and GROUP BY selects. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message ---- From: Evgeniy Strokin <[EMAIL PROTECTED]> To: Solr User Sent: Thursday, January 10, 2008 4:39:44 PM Subject: Big number of conditions of the

Re: Cache size and Heap size

2008-01-16 Thread evgeniy . strokin
server initialization: -Xmx1536m -Xms1536m Which app server / servlet container are you using? Regards, Daniel Alheiros On 16/1/08 15:23, "Evgeniy Strokin" <[EMAIL PROTECTED]> wrote: > Hello,.. > I have relatively large RAM (10Gb) on my server which is running Solr. I > increas

Cache size and Heap size

2008-01-16 Thread Evgeniy Strokin
Hello, I have a relatively large amount of RAM (10Gb) on my server, which is running Solr. I increased the cache settings and started to see OutOfMemory exceptions, especially on facet searches. Does anybody have suggestions on how cache settings relate to memory consumption? What are the optimal settings? How they c

unique ID question

2008-01-14 Thread Evgeniy Strokin
If I make one of my fields the unique ID, it doesn't increase or decrease the performance of searching by this field, right? For example, if I have two fields, both of which I know for sure are unique and of the same type, and I make one of them the Solr unique ID, the general performance should be the sam

Documents with One-to-many

2008-01-11 Thread Evgeniy Strokin
Hello. I have documents with a number of fields, and I also have a number of other documents related to the first ones one-to-many. For example, a person could have several addresses. I want to have all of them in the search result if I look for people. I also want to search people by addres

2D Facet

2008-01-11 Thread Evgeniy Strokin
Hello, is it possible to do this in one query: I have a query which returns 1000 documents with names and addresses. I can run a facet on the state field and see how many addresses I have in each state. But I also need to see how many families live in each state. So as a result I need a matrix of states

Big number of conditions of the search

2008-01-10 Thread Evgeniy Strokin
Hello, I don't know how to formulate this right, so I'll give an example: I have 20 million documents indexed with a unique ID. I have a list of IDs stored somewhere. I need to run a query which will take the documents with IDs from my list and give me some statistics. For example: my documents are addresses

solr with hadoop

2008-01-04 Thread Evgeniy Strokin
I have a huge index base (about 110 million documents, 100 fields each). But the size of the index base is reasonable, about 70 Gb. All I need is to increase performance, since some queries, which match a big number of documents, are running slow. So I was wondering whether there are any benefits to use hadoop for t

Re: Cache use

2007-12-04 Thread evgeniy . strokin
es are sub-second. Dennis Kubes Evgeniy Strokin wrote: > Hello, > we have a 110M-record index under Solr. Some queries take a while, but we > need sub-second results. I guess the only solution is cache (something > else?)... > We use standard LRUCache. In docs it says (as far

Re: Cache use

2007-12-04 Thread evgeniy . strokin
mber 4, 2007 2:33:24 PM Subject: Re: Cache use On 4-Dec-07, at 8:43 AM, Evgeniy Strokin wrote: > Hello, > we have a 110M-record index under Solr. Some queries take a while, > but we need sub-second results. I guess the only solution is cache > (something else?)... > We u

Cache use

2007-12-04 Thread Evgeniy Strokin
Hello, we have a 110M-record index under Solr. Some queries take a while, but we need sub-second results. I guess the only solution is cache (something else?)... We use the standard LRUCache. The docs say (as far as I understood) that it loads a view of the index into memory and next time works with

How much disc space Solr consumes?

2007-11-29 Thread Evgeniy Strokin
Hello, if the index size is 100Gb and I want to run the optimize command, how much more space do I need for this? Also, if I run snapshooter, does it take more space during the snapshot than the actual snapshot occupies? Thank you, Gene

Re: Document update based on ID

2007-11-26 Thread evgeniy . strokin
Subject: Re: Document update based on ID Evgeniy Strokin wrote: > Hello, > I have a document indexed with Solr. Originally it had only a few fields. I > want to add some more fields to the index later, based on ID, but I don't want > to submit the original fields again. I use Solr 1.2,

Document update based on ID

2007-11-21 Thread Evgeniy Strokin
Hello, I have a document indexed with Solr. Originally it had only a few fields. I want to add some more fields to the index later, based on ID, but I don't want to submit the original fields again. I use Solr 1.2, but I think there is no such functionality yet. But I saw a feature here https://iss