DIH import out of memory problem (batchSize and autoCommit not working)

2009-09-22 Thread Steve Sun
Hi, I spent a whole day trying to make batchSize work for JdbcDataSource with org.postgresql.Driver, but got frustrated. At last I took a look into DIH's source code and found that there's actually a bug in there. When JDBC driver is placed in solr-home/lib (as instructed by DIHQuickStart page

Re: DIH import out of memory problem (batchSize and autoCommit not working)

2009-09-22 Thread Shalin Shekhar Mangar
On Tue, Sep 22, 2009 at 2:29 PM, Steve Sun st...@anobii.com wrote: Hi, I spent a whole day trying to make batchSize work for JdbcDataSource with org.postgresql.Driver, but got frustrated. At last I took a look into DIH's source code and found that there's actually a bug in there. When JDBC

Re: DIH import out of memory problem (batchSize and autoCommit not working)

2009-09-22 Thread Steve Sun
2009/9/22 Shalin Shekhar Mangar shalinman...@gmail.com On Tue, Sep 22, 2009 at 2:29 PM, Steve Sun st...@anobii.com wrote: Hi, I spent a whole day trying to make batchSize work for JdbcDataSource with org.postgresql.Driver, but got frustrated. At last I took a look into DIH's source

Apache Hadoop Get Together: Next week Tuesday, newthinking store Berlin Germany

2009-09-22 Thread Isabel Drost
This is a friendly reminder that the next Apache Hadoop Get Together takes place next week on Tuesday, 29th of September* at newthinking store (Tucholskystr. 48, Berlin): http://upcoming.yahoo.com/event/4314020/ * Thorsten Schuett, Solving Puzzles with MapReduce. * Thilo Götz, Text

Query performance

2009-09-22 Thread Gargate, Siddharth
Hi all, Does the following query have any performance impact over the second query? +title:lucene +(title:lucene -name:sid) +(title:lucene -name:sid)

Re: DIH import out of memory problem (batchSize and autoCommit not working)

2009-09-22 Thread Shalin Shekhar Mangar
On Tue, Sep 22, 2009 at 3:00 PM, Steve Sun st...@anobii.com wrote: Done. http://issues.apache.org/jira/browse/SOLR-1450 This is fixed in trunk now. Thanks Steve! -- Regards, Shalin Shekhar Mangar.

solr caching problem

2009-09-22 Thread satyasundar jena
I configured the filter cache in solrconfig.xml as follows: <filterCache class="solr.FastLRUCache" size="16384" initialSize="4096" autowarmCount="4096"/> <useFilterForSortedQuery>true</useFilterForSortedQuery> as per http://wiki.apache.org/solr/SolrCaching#head-b6a7d51521d55fa0c89f2b576b2659f297f9 And

Re: what is too large for an indexed field

2009-09-22 Thread Erick Erickson
You might also want to get a copy of Luke and examine your index to see what's actually in there. Could you be being misled by, say, punctuation? Erick On Mon, Sep 21, 2009 at 4:28 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Mon, Sep 21, 2009 at 4:22 PM, Park, Michael

Function query result as a filter query

2009-09-22 Thread Pete Smith
Hi, Is it possible to constrain a resultset using a filter query to only return the top 100 documents for a particular field? Say I have a field called 'hits' that has the total number of hits for that item. I want to return only the documents that have the top 100 highest hits. I want

Re: Function query result as a filter query

2009-09-22 Thread Yonik Seeley
It's probably not exactly what you're looking for, but you can do ranges over functions in Solr 1.4 http://www.lucidimagination.com/blog/2009/07/06/ranges-over-functions-in-solr-14/ -Yonik http://www.lucidimagination.com On Tue, Sep 22, 2009 at 10:26 AM, Pete Smith pete.sm...@lovefilm.com
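As a rough illustration of the "ranges over functions" approach mentioned above, the sketch below builds a function-range filter with the `{!frange}` syntax from Solr 1.4. The field name `hits` comes from the question; the threshold of 1000 and the helper name `frange_filter` are assumptions made for this example. Note that this keeps documents whose `hits` value is at or above a fixed bound, which is not the same as selecting the top 100.

```python
from urllib.parse import urlencode


def frange_filter(function, lower=None, upper=None):
    """Return a function-range filter string such as '{!frange l=1000}hits'."""
    if lower is None and upper is None:
        raise ValueError("at least one of lower/upper is required")
    bounds = []
    if lower is not None:
        bounds.append(f"l={lower}")  # l = lower bound of the range
    if upper is not None:
        bounds.append(f"u={upper}")  # u = upper bound of the range
    return "{!frange " + " ".join(bounds) + "}" + function


# Hypothetical threshold: keep only documents with hits >= 1000.
fq = frange_filter("hits", lower=1000)
query_string = urlencode([("q", "*:*"), ("fq", fq)])
print(fq)
print(query_string)
```

The resulting string would be passed as an `fq` parameter; the right threshold depends on the data and has to be chosen by the caller.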

Re: solr caching problem

2009-09-22 Thread Yonik Seeley
Solr's caches should be transparent - they should only speed up queries, not change the result of queries. -Yonik http://www.lucidimagination.com On Tue, Sep 22, 2009 at 9:45 AM, satyasundar jena tosatyaj...@gmail.com wrote: I configured the filter cache in solrconfig.xml as follows:

Re: solr caching problem

2009-09-22 Thread satyasundar jena
1) Then do you mean that if we delete a particular doc, it is going to be deleted from the cache also? 2) In Solr, is the cache storing the entire document in memory, or only references to documents? And how can this caching be tested, after all? I'll be thankful for an elaboration.

Oracle incomplete DataImport results

2009-09-22 Thread Daniel Bradley
I appear to be getting only a small number of items imported into Solr when doing a full-import against an oracle data-provider. The query I'm running is something approximately similar to: SELECT ID, dbms_lob.substr(Text, 4000, 1) Text, Date, LastModified, Type, Created, Available, Parent, Title

RE: solr caching problem

2009-09-22 Thread Fuad Efendi
1) Then do you mean that if we delete a particular doc, it is going to be deleted from the cache also? When you delete a document and then COMMIT your changes, new caches will be warmed up (and prepopulated with some key-value pairs from the old instances), etc.: <!-- documentCache caches Lucene

Code sync between Lucene and Solr, crossing Apache project boundaries, etc.

2009-09-22 Thread Mark Bennett
To do any serious Solr debugging (or filter development) you also need the Solr source code tree. And you'd like them to be in sync, so that the Lucene code you see is exactly the same as what was used for the Solr version you're working with. I did find this link on sync'ing the two source

Re: Oracle incomplete DataImport results

2009-09-22 Thread Shalin Shekhar Mangar
On Tue, Sep 22, 2009 at 10:53 PM, Daniel Bradley daniel.brad...@adfero.co.uk wrote: I appear to be getting only a small number of items imported into Solr when doing a full-import against an oracle data-provider. The query I'm running is something approximately similar to: SELECT ID,

Re: Code sync between Lucene and Solr, crossing Apache project boundaries, etc.

2009-09-22 Thread Shalin Shekhar Mangar
On Tue, Sep 22, 2009 at 11:24 PM, Mark Bennett mbenn...@ideaeng.com wrote: To do any serious Solr debugging (or filter development) you also need the Solr source code tree. And you'd like them to be in sync, so that the Lucene code you see is exactly the same as what was used for the Solr

No-op query for :q parameter?

2009-09-22 Thread Mat Brown
Hi all, If I have a set of filter queries that I'd like to apply but nothing that I particularly would like to put into the :q parameter (since I'd like all of the scopes to be cached), is there any problem with just passing [* TO *] for the :q param? Any performance implications? Thanks! Mat

Re: No-op query for :q parameter?

2009-09-22 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 12:19 AM, Mat Brown m...@patch.com wrote: If I have a set of filter queries that I'd like to apply but nothing that I particularly would like to put into the :q parameter (since I'd like all of the scopes to be cached), is there any problem with just passing [* TO *]

Re: No-op query for :q parameter?

2009-09-22 Thread Mat Brown
Thanks, Shalin. The *:* sounds good - so that'll definitely have no effect on query performance? What I meant was, I'd like all of the queries that I'm using to restrict search results to be cached (as filter queries are) - which is why I don't have anything I'd particularly like to put into the

Re: No-op query for :q parameter?

2009-09-22 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 12:33 AM, Mat Brown m...@patch.com wrote: Thanks, Shalin. The *:* sounds good - so that'll definitely have no effect on query performance? All query results are added to the query result cache so any cost is one-time only (until a commit happens or eviction happens).
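The pattern discussed in this thread (a match-all `*:*` main query, with all restrictions expressed as filter queries) can be sketched as follows. The base URL and the filter strings `type:book` and `available:true` are placeholders assumed for illustration, and `build_select_url` is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode


def build_select_url(base_url, filters, rows=10):
    """Build a select URL that matches all documents and narrows them with fq."""
    params = [("q", "*:*"), ("rows", str(rows))]
    # Each filter goes into its own fq parameter rather than being
    # combined into one string, so each can be cached on its own.
    params.extend(("fq", f) for f in filters)
    return base_url + "?" + urlencode(params)


# Placeholder host and filters, assumed for this example only.
url = build_select_url(
    "http://localhost:8983/solr/select",
    ["type:book", "available:true"],
)
print(url)
```

Keeping the filters as separate `fq` parameters matches the use case described in the thread, where individual filters recur across many different combinations.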

Re: No-op query for :q parameter?

2009-09-22 Thread Mat Brown
Hey Shalin, Thanks for the help. The particular attraction of filter queries is that they are cached separately, and our application takes advantage of that fact, since we often employ several filters in one search - while the combinations of filters are numerous, the individual filters comprise

RE: No-op query for :q parameter?

2009-09-22 Thread Fuad Efendi
is there any problem with just passing [* TO *] for the :q param? Any performance implications? Only if you are using faceting on a field with high cardinality (such as tokenized, multivalued) Additional parameters: how many docs do you retrieve in a single query? 100, 1, ... - lazy

Parallel requests to Tomcat

2009-09-22 Thread Michael
Hi, I have a Solr+Tomcat installation on an 8 CPU Linux box, and I just tried sending parallel requests to it and measuring response time. I would expect that it could handle up to 8 parallel requests without significant slowdown of any individual request. Instead, I found that Tomcat is
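A measurement like the one described above could be sketched roughly as below. This is only an assumed harness, not the poster's actual test: `time_parallel` is a hypothetical helper, and `fake_fetch` is a stand-in that sleeps instead of calling a real server, so that the script runs anywhere.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_parallel(fetch, n_requests):
    """Call fetch() n_requests times concurrently; return each call's latency in seconds."""
    def timed_call(_):
        start = time.perf_counter()
        fetch()
        return time.perf_counter() - start

    # One worker thread per request, so all requests are in flight together.
    with ThreadPoolExecutor(max_workers=n_requests) as pool:
        return list(pool.map(timed_call, range(n_requests)))


def fake_fetch():
    # Stand-in for one HTTP request; swap in a real call to the server
    # under test (for example via urllib.request.urlopen).
    time.sleep(0.05)


for n in (1, 2, 4, 8):
    latencies = time_parallel(fake_fetch, n)
    print(n, "parallel requests, slowest:", round(max(latencies), 3), "s")
```

If individual latencies grow roughly in proportion to the number of concurrent requests, something on the server side is probably serializing the work.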

Re: Batching requests using SolrCell with SolrJ

2009-09-22 Thread Grant Ingersoll
On Sep 19, 2009, at 1:22 PM, Jay Hill wrote: When working with SolrJ I have typically batched a Collection of SolrInputDocument objects before sending them to the Solr server. I'm working with the latest nightly build and using the ExtractingRequestHandler to index documents, and everything

A little discovery about the solr classpath and jetty

2009-09-22 Thread Benson Margulies
On (at least) two occasions, I've opened JIRAs due to my getting tangled up with eclipse, jetty, and solr/lib. Well, it occurs to me that a recent idea might be of general use to others in this regard. This fragment is offered for illustration. The idea here is that you can configure the jetty

Re: Parallel requests to Tomcat

2009-09-22 Thread Yonik Seeley
What version of Solr are you using? Solr 1.3 and Lucene 2.4 defaulted to an index reader implementation that had to synchronize, so IO-heavy search operations can't proceed in parallel. You shouldn't see this with 1.4. -Yonik http://www.lucidimagination.com On Tue, Sep 22, 2009 at 4:03

returning stored fields

2009-09-22 Thread Eric Lease Morgan
Is there any way to configure in solrconfig.xml (or anywhere else) what fields to return by default? I am indexing sets of full text books. My fields include metadata (author, title, publisher, etc.) as well as the full text of the book. Since I want to enable highlighting against the full

Re: returning stored fields

2009-09-22 Thread Mark A. Matienzo
Hi Eric, On Tue, Sep 22, 2009 at 8:41 PM, Eric Lease Morgan eric_mor...@infomotions.com wrote: Is there any way to configure in solrconfig.xml (or anywhere else) what fields to return by default? Yes - in one of the requestHandler sections of solrconfig.xml, you can specify defaults for

Re: returning stored fields [resolved]

2009-09-22 Thread Eric Lease Morgan
On Sep 22, 2009, at 8:51 PM, Mark A. Matienzo wrote: Is there any way to configure in solrconfig.xml (or anywhere else) what fields to return by default? Yes - in one of the requestHandler sections of solrconfig.xml, you can specify defaults for specific query parameters. For example, you

Re: Code sync between Lucene and Solr, crossing Apache project boundaries, etc.

2009-09-22 Thread Grant Ingersoll
On Sep 22, 2009, at 1:54 PM, Mark Bennett wrote: To do any serious Solr debugging (or filter development) you also need the Solr source code tree. And you'd like them to be in sync, so that the Lucene code you see is exactly the same as what was used for the Solr version you're working

Re: solr caching problem

2009-09-22 Thread satya
First of all, thanks a lot for the clarification. Is there any way to see how this cache works internally, what objects are being stored, and how much memory it's consuming, so that we can get a clear picture in mind? And how can we test performance with the cache? On Tue, Sep 22, 2009 at

How to configure Solr 1.3 on Websphere 6.1

2009-09-22 Thread adnanqureshi
Hi all, Solr 1.3, Websphere 6.1. I have been looking for documentation on how to configure Solr on Websphere, but no luck yet. Can someone suggest a document that gives an overview of how to get started with Solr on Websphere and how to integrate it with websites (Java/JSP)? I