Re: IndexWrite in Lucene/Solr 3.5 is slower?

2012-06-14 Thread pravesh
BTW, have you also changed the MergePolicy & MergeScheduler settings? From Lucene 3.x/3.5 onwards, new MergePolicy & MergeScheduler implementations are available, like TieredMergePolicy & ConcurrentMergeScheduler. Regards Pravesh -- View this message in context: http://lucene.472066

RE: Starts with Query

2012-06-14 Thread Afroz Ahmad
If you are not searching for a specific digit and want to match all documents that start with any digit, you could, as part of the indexing process, add another field, say startsWithDigit, and set it to true if the title begins with a digit. All you need to do at query time then is query for sta
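
A minimal Python sketch of the index-time flag described above. The field name `startsWithDigit` comes from the message; the helper names and the dict-shaped document are hypothetical, standing in for whatever the indexing pipeline actually uses. Skipping leading whitespace and matching only ASCII digits are assumptions.

```python
def starts_with_digit(title):
    """Return True if the first non-blank character of title is an ASCII digit."""
    stripped = title.lstrip()  # assumption: leading whitespace is ignored
    return bool(stripped) and stripped[0] in "0123456789"


def add_starts_with_digit_flag(doc):
    """Return a copy of doc with a boolean startsWithDigit field added.

    `doc` is a hypothetical dict holding the fields about to be indexed.
    """
    flagged = dict(doc)
    flagged["startsWithDigit"] = starts_with_digit(doc.get("title", ""))
    return flagged
```

Computing the flag once per document at index time keeps the query-time side to a single exact match on a boolean field.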

IndexWrite in Lucene/Solr 3.5 is slower?

2012-06-14 Thread Ramprakash Ramamoorthy
We are upgrading our search infrastructure from Lucene 2.3.1 to Lucene 3.5. I am in the process of load testing, and I found that Lucene 2.3.1 could index 32,000 docs per second, whereas Lucene 3.5 could index only around 17,000 docs per second. Indeed, both of them use the standard analyzer a

Re: Starts with Query

2012-06-14 Thread nutchsolruser
Thanks, Jack, for the valuable response. Actually, I am trying to match *any* numeric pattern at the start of each document. I don't know the documents in the index; I just want documents whose title starts with any digit. -- View this message in context: http://lucene.472066.n3.nabble.com/Starts-with-Query-tp3989627

Re: FilterCache - maximum size of document set

2012-06-14 Thread Pawel Rog
It may be true that the filter cache maxSize is set too high. We looked at evictions and hit rate earlier. Maybe you are right that evictions are not always unwanted. Some time ago we ran tests. There is not much difference in hit rate when the filter cache maxSize is set to

Re: PageRanking with DIH

2012-06-14 Thread Chris Hostetter
: I have computed pagerank offline for document set dump. I ideally : want to use pagerank and solr relevancy score together in formula to : sort search solr result. I have already looked at : http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_increase_the_score_for_specific_documents : and

Re: Regarding number of documents

2012-06-14 Thread Swetha Shenoy
Thanks, all, for your input. We found the problem: certain entries were missing from the index, but not from the MySQL search results, because we had some customized transformers in the data config that skipped entries when a particular field was missing. On Thu, Jun 14, 2

Re: defaultSearchField not working after upgrade to solr3.6

2012-06-14 Thread Jack Krupansky
Hmmm... how could I have gotten so confused?!?! Actually, I recognized my mistake yesterday (after reading the code some more for David's Jira) but hadn't gotten around to correcting myself. In any case, the original problematic scenario may have been simply copying 3.5 request handler/params

Re: How to boost a field with another field's value?

2012-06-14 Thread Jack Krupansky
See "Function Query": http://wiki.apache.org/solr/FunctionQuery If you are using the dismax or edismax query parser you can use the "bf" request parameter. e.g., q=foo&bf="ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3" -- Jack Krupansky -Original Message- From: smita S
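
A small Python sketch of how the `bf` parameter above could be sent as a URL-encoded query string. The function name `build_boost_params` is hypothetical, and the quotes around the `bf` value in the message are treated here as delimiters rather than part of the value, which is an assumption. `defType=dismax` is included because `bf` applies to the dismax/edismax parsers.

```python
from urllib.parse import parse_qs, urlencode

# Boost function copied from the message (without the surrounding quotes).
BOOST_FUNCTION = "ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3"


def build_boost_params(user_query, boost_function=BOOST_FUNCTION, parser="dismax"):
    """Return a URL-encoded query string carrying q, defType and bf."""
    return urlencode({"q": user_query, "defType": parser, "bf": boost_function})


query_string = build_boost_params("foo")
# Round-trip through parse_qs to confirm the value survives encoding.
decoded = parse_qs(query_string)
print(query_string)
```

Encoding matters here because the function string contains spaces, `^`, commas and parentheses, which are easy to mangle when pasted into a URL by hand.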

How to boost a field with another field's value?

2012-06-14 Thread smita
I have 2 fields in my schema - e.g. long field "field1" and long field "field2". I'd like my boost query to be such that field1 is boosted by the value of field2 for each document. What should the query time boost for this look like? I was able to do this using Index time boosting with the Data

Re: defaultSearchField and param df are messed up in 3.6.x

2012-06-14 Thread Chris Hostetter
: So if defaultSearchField has been removed (deprecated) from schema.xml then why : are there still calls to "org.apache.solr.schema.IndexSchema.getDefaultSearchFieldName()"? Because even though the syntax is deprecated/discouraged in schema.xml, we don't want things to break for existing users

Re: defaultSearchField not working after upgrade to solr3.6

2012-06-14 Thread Chris Hostetter
: Correct. In 3.6 it is simply ignored. In 4.x it currently does work. That's not true. The example configs in Solr 3.6 no longer mention defaultSearchField, but Solr 3.6 will still respect a declaration if it exists in your schema.xml -- I just verified this by running Solr 3.6 using the Sol

Re: solrj library requirements: slf4j-jdk14-1.5.5.jar

2012-06-14 Thread Sami Siren
What is the version of solrj you are trying to get working? If you download version 3.6 of solr there's a directory dist/solrj-lib in the binary release artifact that includes the required dependencies. I would start with those. -- Sami Siren On Wed, Jun 6, 2012 at 5:34 PM, Welty, Richard wrot

Re: Regarding number of documents

2012-06-14 Thread Erick Erickson
Here's a quick thing to check. Delete your index and do a fresh import. Then go to the admin/statistics. Check the "numDocs" and "maxDocs" entries. If they're different, it means that some of your documents have been deleted. Deleted you say? What's that about? Well, if more than one record has th

Re: FilterCache - maximum size of document set

2012-06-14 Thread Erick Erickson
Hmmm, your maxSize is pretty high, it may just be that you've set this much higher than is wise. The maxSize setting governs the number of entries. I'd start with a much lower number here, and monitor the solr/admin page for both hit ratio and evictions. Well, and size too. 16,000 entries puts a ce
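
As a rough illustration of why the number of entries matters, here is a back-of-the-envelope Python sketch. It assumes the worst case in which every cached filter is stored as a bitset with one bit per document in the index; smaller result sets may be stored more compactly, so this is a ceiling, not a prediction. The `max_doc` value of 10 million is hypothetical, since the thread does not state the index size.

```python
def filter_cache_ceiling_bytes(max_size, max_doc):
    """Worst-case filterCache memory: max_size bitsets of max_doc bits each."""
    bytes_per_entry = (max_doc + 7) // 8  # one bit per document, rounded up
    return max_size * bytes_per_entry


# 16,000 entries (from the message) against a hypothetical 10M-document index.
ceiling = filter_cache_ceiling_bytes(max_size=16_000, max_doc=10_000_000)
print(f"{ceiling / 1e9:.1f} GB")
```

Under these assumptions the ceiling works out to 20 GB, which is why a lower maxSize combined with monitoring of hit ratio and evictions is the safer starting point.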

phrase query and string/keyword tokenizer

2012-06-14 Thread Cat Bieber
I have documents that are word definitions (basically an online dictionary) that can have alternate titles. For example the document entitled "Read-only memory" might have an alternate title of "ROM". In search results, I want to boost documents with an alternate title that is a case-insensitiv

Re: Regarding number of documents

2012-06-14 Thread Swetha Shenoy
I am running a full-import. DIH reported that 1125 documents were added after indexing. This number did not change even after I added the new entries. How do I check the ID for an entry and query it against Solr? On Wed, Jun 13, 2012 at 10:33 PM, Gora Mohanty wrote: > On 14 June 2012 04:51, Swe

Re: LockObtainFailedException after trying to create cores on second SolrCloud instance

2012-06-14 Thread Daniel Brügge
Aha, OK. That was new to me. Will check this. Thanks. On Thu, Jun 14, 2012 at 3:52 PM, Yury Kats wrote: > On 6/14/2012 2:05 AM, Daniel Brügge wrote: > > Will check later to use different data dirs for the core on > > each instance. > > But because each Solr sits in its own openvz instance (virt

Re: DIH idle in transaction forever

2012-06-14 Thread Jasper Floor
Actually, the readOnly=true makes things worse. What it does (among other things) is: c.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED); which leads to: Caused by: org.postgresql.util.PSQLException: Cannot change transaction isolation level in the middle of a transacti

RE: DIH idle in transaction forever

2012-06-14 Thread Dyer, James
Try readOnly="true" in the dataSource configuration. This causes several defaults to get set in the JDBC connection, and often will solve problems like this. (see http://wiki.apache.org/solr/DataImportHandler#Configuring_JdbcDataSource) Also, try a batch size of 0 to let your jdbc driver pick

Re: Starts with Query

2012-06-14 Thread Jack Krupansky
Are you trying to query for any numeric term at the start of a title or a specific numeric term at the start of a title? Unless you are using a query parser that supports Lucene's SpanFirstQuery or SpanPositionRangeQuery, you have two choices: 1. Explicitly (or implicitly via a custom update

Re: Starts with Query

2012-06-14 Thread Ahmet Arslan
> I want to find documents whose title > is starting with digit, what will be > solr query for this. I have tried many queries but could not > able to > configure proper query for this. > Note : title is a field in my index. Something like this? q=title:(1* 2* 3* 4* ... 9*)&q.op=OR
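
The query suggested above can be generated rather than typed out. This Python sketch builds one prefix clause per digit and URL-encodes the result; the function names are hypothetical, and including `0*` alongside `1*` through `9*` is an assumption, since the message abbreviates the list. Note that a prefix clause matches indexed terms, so on a tokenized field it can match a digit-initial word anywhere in the title, not only the first one.

```python
from urllib.parse import parse_qs, urlencode


def digit_prefix_query(field, digits="0123456789"):
    """Build e.g. 'title:(0* 1* ... 9*)' with one prefix clause per digit."""
    clauses = " ".join(f"{d}*" for d in digits)
    return f"{field}:({clauses})"


def build_digit_prefix_params(field):
    """Return a URL-encoded query string with q and q.op=OR."""
    return urlencode({"q": digit_prefix_query(field), "q.op": "OR"})


params = build_digit_prefix_params("title")
# Round-trip through parse_qs to confirm the query survives encoding.
decoded = parse_qs(params)
print(decoded["q"][0])
```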

Re: LockObtainFailedException after trying to create cores on second SolrCloud instance

2012-06-14 Thread Yury Kats
On 6/14/2012 2:05 AM, Daniel Brügge wrote: > Will check later to use different data dirs for the core on > each instance. > But because each Solr sits in its own openvz instance (virtual > server respectively) they should be totally separated. At least > from my point of understanding virtualizati

DIH idle in transaction forever

2012-06-14 Thread Jasper Floor
Hi all, It seems that DIH always holds two connections open to the database. One of them is almost always 'idle in transaction'. It may sometimes seem to do a little work but then it goes idle again. datasource definition: We have a datasource defined in the jndi:

Starts with Query

2012-06-14 Thread nutchsolruser
I want to find documents whose title starts with a digit. What would the Solr query for this be? I have tried many queries but have not been able to come up with a proper query for this. Note: title is a field in my index. -- View this message in context: http://lucene.472066.n3.nabble.com/Starts-with-Quer

Re: LockObtainFailedException after trying to create cores on second SolrCloud instance

2012-06-14 Thread Daniel Brügge
OK, I think I have found it. When starting the 4 Solr instances via start.jar, I always provided the data directory property via *-Dsolr.data.dir=/home/myuser/data*. After removing this, it worked fine. What is weird is that all 4 instances are totally separated, so that instance-2 should never con