Request to be added to the ContributorsGroup

2013-07-12 Thread Kumar Limbu
Hi, My username is KumarLImbu and I would like to be added to the Contributors Group. Could somebody please help me? Best Regards, Kumar

Re: Usage of CloudSolrServer?

2013-07-12 Thread Furkan KAMACI
CloudSolrServer uses LBHttpSolrServer by default. CloudSolrServer connects to Zookeeper and passes the live nodes to LBHttpSolrServer. LBHttpSolrServer connects to each node in round-robin fashion. By the way, do you mean leader instead of master? 2013/7/12 sathish_ix skandhasw...@inautix.co.in Hi, I am
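
The round-robin idea in this reply can be sketched roughly as below. This is a small Python illustration of the rotation only, not SolrJ code and not how LBHttpSolrServer is actually implemented; the node URLs and the `round_robin` helper are made up for the example.

```python
from itertools import cycle

def round_robin(live_nodes):
    """Return an iterator that yields node URLs in turn, wrapping around."""
    return cycle(live_nodes)

# Hypothetical node URLs; in SolrCloud the live node list would come from ZooKeeper.
live_nodes = [
    "http://host1:8983/solr",
    "http://host2:8983/solr",
    "http://host3:8983/solr",
]
picker = round_robin(live_nodes)
first_four = [next(picker) for _ in range(4)]  # the fourth pick wraps back to host1
```

A real load balancer would also need to skip nodes that stop responding; that part is left out here.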

Re: Leader Election, when?

2013-07-12 Thread Furkan KAMACI
If you plan to have 2 shards and you start up the first node, it will be the leader of the first shard. When you start up the second node, it will be the leader of the second shard. If you start up a third node, it will be a replica of the first shard. If you start up a fourth node, it will be a replica of
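
The startup order described in this reply can be modelled with a short Python sketch. It is a simplified illustration under the assumption that nodes are assigned to shards strictly in the order they start; `assign_nodes` is a hypothetical helper, and the example stops at three nodes because that is as far as the excerpt goes.

```python
def assign_nodes(num_shards, num_nodes):
    """Assign nodes to shards in startup order (simplified model).

    The first num_shards nodes become shard leaders; later nodes become
    replicas, cycling through the shards in the same order.
    """
    assignments = []
    for i in range(num_nodes):
        shard = i % num_shards + 1
        role = "leader" if i < num_shards else "replica"
        assignments.append((i + 1, shard, role))
    return assignments

# (node number, shard number, role) for 2 shards and 3 nodes
plan = assign_nodes(num_shards=2, num_nodes=3)
```

This only describes the initial layout; leadership can move to another node when servers restart.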

Re: solr 4.3.0 cloud in Tomcat, link many collections to Zookeeper

2013-07-12 Thread Furkan KAMACI
If you have one collection you just need to define hostnames of Zookeeper ensembles and run that command once. 2013/7/11 Zhang, Lisheng lisheng.zh...@broadvision.com Hi, We are testing solr 4.3.0 in Tomcat (considering upgrading solr 3.6.1 to 4.3.0), in WIKI page for solrCloud in Tomcat:

Re: Performance of cross join vs block join

2013-07-12 Thread mihaela olteanu
Hi Mikhail, I used the term block join incorrectly. When I said block join I was referring to a join performed on a single core, versus a cross join performed on multiple cores. But I saw your benchmark (from cache) and it seems that block join has better performance. Is this

Solr 4.3 Shard distributed request check probably incorrect?

2013-07-12 Thread Johann Höchtl
Hi, we are using Solr 4.3 with regular sharding without ZooKeeper. I see the following errors inside our logs: 14995742 [qtp427093680-2249] INFO org.apache.solr.core.SolrCore - [DE1] webapp=/solr path=/select

About Suggestions

2013-07-12 Thread Lochschmied, Alexander
Hi Solr people! We need to suggest part numbers in alphabetical order, adding up to four characters to the already entered part number prefix. That works quite well with the terms component acting on a multivalued field with keyword tokenizer and edge nGram filter. I am mentioning part numbers to

Re: Performance of cross join vs block join

2013-07-12 Thread Mikhail Khludnev
On Fri, Jul 12, 2013 at 12:19 PM, mihaela olteanu mihaela...@yahoo.com wrote: Hi Mikhail, I have used wrong the term block join. When I said block join I was referring to a join performed on a single core versus cross join which was performed on multiple cores. But I saw your benchmark (from

Search with punctuations

2013-07-12 Thread kobe.free.wo...@gmail.com
Hi, Scenario: Users performing a search forget to type a punctuation mark (apostrophe). For example, when a user wants to search for a value like INT'L, they just key in INTL (with no punctuation). In this scenario, I wish to return both values, INTL and INT'L, that are currently indexed on SOLR

Re: How to set a condition on the number of docs found

2013-07-12 Thread Furkan KAMACI
Do you want to modify the Solr source code? Did you check this line in XMLWriter.java: writeAttr("numFound", Long.toString(numFound)); 2013/7/12 Matt Lieber mlie...@impetus.com Hello there, I would like to be able to know whether I got over a certain threshold of doc results. I.e. Test

Re: Solr Live Nodes not updating immediately

2013-07-12 Thread Ranjith Venkatesan
Hi, tickTime in ZooKeeper was high. When I reduced it to 2000ms, the Solr node status gets updated in 20s, which resolved my issue. Thanks for helping me. I have one more question. 1. Is it advisable to reduce the tickTime further? 2. Or what's the most appropriate tickTime which gives maximum
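
One likely reason a high tickTime delays live-node updates is that ZooKeeper, by default, bounds the session timeout to between 2 and 20 ticks, so a client's requested timeout is raised when tickTime is large. The Python sketch below illustrates that clamping; the 15000 ms request and the 30000 ms tickTime are assumed values for illustration, not figures from this thread.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms):
    """Clamp a requested session timeout to ZooKeeper's default bounds.

    By default the server allows session timeouts between 2 * tickTime
    and 20 * tickTime (unless min/maxSessionTimeout are set explicitly).
    """
    min_timeout = 2 * tick_time_ms
    max_timeout = 20 * tick_time_ms
    return max(min_timeout, min(requested_ms, max_timeout))

# Assumed client request of 15000 ms against two tickTime settings.
with_low_tick = negotiated_session_timeout(15000, 2000)    # request fits the bounds
with_high_tick = negotiated_session_timeout(15000, 30000)  # raised to 2 * tickTime
```

With the larger tickTime a dead node's session would take much longer to expire, which would match the delayed status updates reported here.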

Custom processing in Solr Request Handler plugin and its debugging ?

2013-07-12 Thread Tony Mullins
Hi, I have defined my new Solr RequestHandler plugin like this in SolrConfig.xml: <requestHandler name="/myendpoint" class="com.abc.MyRequestPlugin"> </requestHandler> And it's working fine. Now I want to do some custom processing from this plugin by making a search query to regular '/select'

SolrCloud group.query error shard X did not set sort field values or how i can set fillFields=true on IndexSearcher.search

2013-07-12 Thread Evgeny Salnikov
Hi! To repeat the problem, do the following 1. Start a node1 of SolrCloud (4.3.1 default configs) (java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -jar start.jar) 2. Import to collection1 - shard1 some data 3. Try group.query e.g.

How to optimize a search?

2013-07-12 Thread padcoe
Hello folks, I'm doing a search for a specific phrase (Rocket Banana) in a specific field, and the document with the result Rocket Banana (Single) never comes first, although this is the result that should appear in first position. I've tried many ways to perform this search: title:Rocket Banana

Re: How to optimize a search?

2013-07-12 Thread Erick Erickson
_Why_ should Rocket Banana (Single) come first? Essentially you have some ordering in mind and unless you can express it clearly you'll _never_ get ideal ranking. Really. But your particular issue can probably be solved by adding a clause like OR rocket banana^5 And I suspect you haven't given

Patch review request: SOLR-5001 (adding book links to the website)

2013-07-12 Thread Alexandre Rafalovitch
Hello, As per earlier email thread, I have created a patch for Solr website to incorporate links to my new book. It would be nice if somebody with commit rights for the (markdown) website could look at it before the book's Solr version (4.3.1) stops being the latest :-) I promise to help with

Does Solrj Batch Processing Querying May Confuse?

2013-07-12 Thread Furkan KAMACI
I've crawled some webpages and indexed them in Solr. I've queried the data in Solr via Solrj. url is my unique field and I've defined my query like this: ModifiableSolrParams params = new ModifiableSolrParams(); params.set("q", "lang:tr"); params.set("fl", "url"); params.set("sort", "url desc"); I've run my
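
For readers querying over HTTP rather than through Solrj, the same three parameters can be encoded into a query string. This is a small Python sketch, assuming the values are q=lang:tr, fl=url and sort=url desc as in the message.

```python
from urllib.parse import urlencode

# The same parameters the Solrj snippet sets, in the same order.
params = {"q": "lang:tr", "fl": "url", "sort": "url desc"}

# urlencode percent-encodes ":" and turns the space into "+".
query_string = urlencode(params)
```

The resulting string would be appended after "?" to the select handler's URL.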

Re: Problem using Term Component in solr

2013-07-12 Thread Erick Erickson
bq: Note:Term Component works only on string dataType field. :( Not true. Term Component will work on any indexed field. It'll bring back the _tokens_ that have been indexed though, which are often individual words so your examples medical physics would be two separate tokens so it may be

Re: Solr caching clarifications

2013-07-12 Thread Erick Erickson
Inline On Thu, Jul 11, 2013 at 8:36 AM, Manuel Le Normand manuel.lenorm...@gmail.com wrote: Hello, As a result of frequent java OOM exceptions, I am trying to investigate the solr jvm memory heap usage further. Please correct me if I am mistaken, this is my understanding of usages for the heap

RE: What happens in indexing request in solr cloud if Zookeepers are all dead?

2013-07-12 Thread Zhang, Lisheng
Thanks very much for your clear explanation! -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Thursday, July 11, 2013 1:55 PM To: solr-user@lucene.apache.org Subject: Re: What happens in indexing request in solr cloud if Zookeepers are all dead? Sorry, no

Re: How to boost relevance based on distance and age..

2013-07-12 Thread Erick Erickson
the first thing I'd try would be FunctionQueries, see: http://wiki.apache.org/solr/FunctionQuery. Be a little careful. You have disjoint conditions, i.e. one or the other should be used so you'll have two function queries, basically expressing if (age 20 years) if (age = 20 years) The one that

RE: solr 4.3.0 cloud in Tomcat, link many collections to Zookeeper

2013-07-12 Thread Zhang, Lisheng
Sorry, I might not have asked clearly. Our issue is that we have a few thousand collections (can be much more), so running that command is rather tedious. Is there a simpler way (all collections share the same schema/config)? Thanks very much for the help, Lisheng -Original Message- From:

Re: Request to be added to the ContributorsGroup

2013-07-12 Thread Erick Erickson
Done, at least to the Solr contributor's group, if you want Lucene, let me know. Added exactly as KumarLImbu, don't know (1) whether both the L and I should be capitalized or (2) whether the rights-checking cares. Thanks! Erick On Fri, Jul 12, 2013 at 2:51 AM, Kumar Limbu kumarli...@gmail.com wrote:

Re: Leader Election, when?

2013-07-12 Thread Erick Erickson
This is probably not all that important to worry about. The additional duties of a leader are pretty minimal. And the leaders will shift around anyway as you restart servers etc. Really feels like a premature optimization. Best Erick On Thu, Jul 11, 2013 at 3:53 PM, aabreur

Re: Solr Live Nodes not updating immediately

2013-07-12 Thread Shawn Heisey
On 7/11/2013 11:11 PM, Ranjith Venkatesan wrote: tickTime in zookeeper was high. When i reduced it to 2000ms solr node status gets updated in 20s. Hence resolved my issue. Thanks for helping me. I have one more question. 1. Is it advisable to reduce the tickTime further. 2. Or whats the

Re: solr 4.3.0 cloud in Tomcat, link many collections to Zookeeper

2013-07-12 Thread Shawn Heisey
On 7/12/2013 7:29 AM, Zhang, Lisheng wrote: Sorry, I might not have asked clearly. Our issue is that we have a few thousand collections (can be much more), so running that command is rather tedious. Is there a simpler way (all collections share the same schema/config)? When you create each

Re: How to set a condition on the number of docs found

2013-07-12 Thread William Bell
Hmmm. One way is: http://localhost:8983/solr/core/select/?q=*%3A*&facet=true&facet.field=id&facet.offset=10&rows=0&facet.limit=1 http://hgsolr2devmstr:8983/solr/providersearch/select/?q=*%3A*&facet=true&facet.field=city&facet.offset=10&rows=0&facet.limit=1 If you have a result you have results > 10.
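
The facet-offset trick in this reply can be generated for any threshold. The Python sketch below builds such a URL; `threshold_check_url` is a hypothetical helper, and the base URL and field name are simply those of the first example URL in the message.

```python
from urllib.parse import urlencode

def threshold_check_url(base_url, field, threshold):
    """Build a request that returns at most one facet value past `threshold`.

    rows=0 suppresses the documents themselves; facet.offset skips the first
    `threshold` facet values, so any returned value means there are more.
    """
    params = [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.offset", threshold),
        ("rows", 0),
        ("facet.limit", 1),
    ]
    return base_url + "?" + urlencode(params)

url = threshold_check_url("http://localhost:8983/solr/core/select/", "id", 10)
```

Note that this counts distinct values of the faceted field, so it matches the number of documents only when the field is unique per document, as an id field would be.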

Re: How to set a condition on the number of docs found

2013-07-12 Thread Jack Krupansky
Test where? I mean, numFound is right there at the top of the query results, right? Unfortunately there is no function query value source equivalent to numFound. There is numdocs, but that is the total documents in the index. There is also docfreq(term), which could be used in a function

Re: Norms

2013-07-12 Thread William Bell
Thanks. Yeah I don't really want the queryNorm on On Wed, Jul 10, 2013 at 2:39 AM, Daniel Collins danwcoll...@gmail.com wrote: I don't know the full answer to your question, but here's what I can offer. Solr offers 2 types of normalisation, FieldNorm and QueryNorm. FieldNorm is as the

Re: Is it possible to find a leader from a list of cores in solr via java code

2013-07-12 Thread vicky desai
Hi, As per the suggestions above I shifted my focus to using CloudSolrServer. In terms of sending updates to the leaders and reducing network traffic it works great. But I faced one problem with CloudSolrServer: it opens too many connections, as many as five thousand. My

Re: What does too many merges...stalling in indexwriter log mean?

2013-07-12 Thread Tom Burton-West
Thanks Shawn, Do you have any feeling for what gets traded off if we increase the maxMergeCount? This is completely new for us because we are experimenting with indexing pages instead of whole documents. Since our average document is about 370 pages, this means that we have increased the number

Multiple queries or Filtering Queries in Solr

2013-07-12 Thread dcode
My problem is I have n fields (say around 10) in Solr that are searchable, they all are indexed and stored. I would like to run a query first on my whole index of say 5000 docs which will hit around an average of 500 docs. Next I would like to query using a different set of keywords on these 500

Re: How to set a condition over stats result

2013-07-12 Thread Jack Krupansky
sum(x, y, z) = x + y + z (sums those specific fields values for the current document) sum(x, y) = x + y (sum of those two specific field values for the current document) sum(x) = field(x) = x (the specific field value for the current document) The sum function in function queries is not an
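
The per-document behaviour listed in this reply can be mimicked in a few lines of Python. This is only an illustration of the arithmetic; `function_sum` and the sample documents are made up and have nothing to do with Solr's own implementation.

```python
def function_sum(doc, *fields):
    """Add up the named field values of a single document."""
    return sum(doc[name] for name in fields)

# Two hypothetical documents with numeric fields x, y and z.
docs = [
    {"x": 1, "y": 2, "z": 3},
    {"x": 10, "y": 20, "z": 30},
]

# One value per document, never a total across documents.
per_doc = [function_sum(d, "x", "y", "z") for d in docs]
single_field = function_sum(docs[0], "x")  # same as the field value itself
```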

Re: What does too many merges...stalling in indexwriter log mean?

2013-07-12 Thread Shawn Heisey
On 7/12/2013 9:23 AM, Tom Burton-West wrote: Do you have any feeling for what gets traded off if we increase the maxMergeCount? This is completely new for us because we are experimenting with indexing pages instead of whole documents. Since our average document is about 370 pages, this

RE: solr 4.3.0 cloud in Tomcat, link many collections to Zookeeper

2013-07-12 Thread Zhang, Lisheng
Thanks very much for all the helps! -Original Message- From: Shawn Heisey [mailto:s...@elyograg.org] Sent: Friday, July 12, 2013 7:31 AM To: solr-user@lucene.apache.org Subject: Re: solr 4.3.0 cloud in Tomcat, link many collections to Zookeeper On 7/12/2013 7:29 AM, Zhang, Lisheng

Re: Patch review request: SOLR-5001 (adding book links to the website)

2013-07-12 Thread Steve Rowe
Hi Alexandre, I'll work on this today. Steve On Jul 12, 2013, at 8:26 AM, Alexandre Rafalovitch arafa...@gmail.com wrote: Hello, As per earlier email thread, I have created a patch for Solr website to incorporate links to my new book. It would be nice if somebody with commit rights for

Re: Performance of cross join vs block join

2013-07-12 Thread Roman Chyla
Hi Mikhail, I have commented on your blog, but it seems I have done something wrong, as the comment is not there. Would it be possible to share the test setup (script)? I have found out that the crucial thing with joins is the number of 'joins' [hits returned] and it seems that the experiments I have

Re: Problem using Term Component in solr

2013-07-12 Thread Parul Gupta(Knimbus)
Hi, OK, I will not use bold text in my queries. I guess my question is not clear to you. What I am doing is this: I have a live source, say 'A', and a stored database, say 'B'. A and B both have title fields in them. Consider A as non-persistent Solr and B as persistent Solr. I have

RE: expunging deletes

2013-07-12 Thread Petersen, Robert
OK Thanks Shawn, I went with this because 10 wasn't working for us and it looks like my index is staying under 20 GB now with numDocs : 16897524 and maxDoc : 19048053 <mergePolicy class="org.apache.lucene.index.TieredMergePolicy"> <int name="maxMergeAtOnce">5</int> <int

Re: How to set a condition on the number of docs found

2013-07-12 Thread Matt Lieber
Thanks William, I'll do that. Matt On 7/12/13 7:38 AM, William Bell billnb...@gmail.com wrote: Hmmm. One way is: http://localhost:8983/solr/core/select/?q=*%3A*&facet=true&facet.field=id&facet.offset=10&rows=0&facet.limit=1 http://hgsolr2devmstr:8983/solr/provi

Save Solr index in database

2013-07-12 Thread sagarmj76
Hi, I wanted to understand if it is possible to store/save Solr indexes to the database instead of the filesystem. I checked out some articles where Lucene can do it. Hence I assume Solr can too, but it's not clear to me how to configure Solr to save the indexes in the database instead of in the /index

Re: Performance of cross join vs block join

2013-07-12 Thread Mikhail Khludnev
Hello Roman, Thanks for your interest. I briefly looked at your approach, and I'm really interested in your numbers. Here is the trivial code, I'd prefer to rely on your testing framework, and can provide you a version of Solr 4.2 with SOLR-3076 applied. Do you need it?

Re: Save Solr index in database

2013-07-12 Thread Alexandre Rafalovitch
And why would you want to do that? Seems rather wrong direction to march in. I am assuming relational database. There is a commercial solution that integrates Solr into Cassandra, if I understood it correctly: http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-solr

Re: Save Solr index in database

2013-07-12 Thread Shawn Heisey
On 7/12/2013 12:30 PM, sagarmj76 wrote: Hi, I wanted to understand if it is possible to store/save Solr indexes to the database instead of the filesystem. I checked out some articles where Lucene can do it. Hence I assume Solr can too, but it's not clear to me how to configure Solr to save the

Re: Save Solr index in database

2013-07-12 Thread Sagar Jadhav
The reason for going that route is that our application is clustered, and if the indexing information is on the filesystem, I am not sure whether that would be replicated. At the same time, since it's a product, it needs to be packaged with the product, and for proprietary reasons we are not

Re: Save Solr index in database

2013-07-12 Thread Gora Mohanty
On 13 July 2013 00:19, Shawn Heisey s...@elyograg.org wrote: On 7/12/2013 12:30 PM, sagarmj76 wrote: Hi, I wanted to understand if it is possible to store/save Solr indexes to the database instead of the filesystem. I checked out some articles where Lucene can do it. Hence I assume Solr can

Re: Save Solr index in database

2013-07-12 Thread Shawn Heisey
On 7/12/2013 12:51 PM, Sagar Jadhav wrote: The reason for going that route is that our application is clustered, and if the indexing information is on the filesystem, I am not sure whether that would be replicated. At the same time, since it's a product, it needs to be packaged with the product

Re: Save Solr index in database

2013-07-12 Thread Sagar Jadhav
I think that makes a lot of sense as I was reading the Solr Cloud technique. Thanks a lot Shawn for the validation. Thanks a lot everyone for helping me out to go in the right direction. I really appreciate all the inputs. I will now go back and get the exception for getting access to the

Re: Save Solr index in database

2013-07-12 Thread Upayavira
If they ask, tell them that Solr *is* a database. Databases store their stuff on a file system, so your data is gonna end up there in the end. Putting Solr indexes inside a database is like storing mysql tables in Oracle. Upayavira On Fri, Jul 12, 2013, at 08:18 PM, Sagar Jadhav wrote: I think

Re: Problem using Term Component in solr

2013-07-12 Thread Erick Erickson
Is the vocabulary known? That is, do you know the abbreviations that will be used? If so, you could consider synonyms, in which case you'd go to tokenized titles and use phrase queries to get your matches... Regexes often don't scale extremely well, although the 4.x FST implementations are much

solr autodetectparser tikaconfig dataimporter error

2013-07-12 Thread Andreas Owen
I am using Solr 3.5, tika-app-1.4 and tagcloud 1.2.1. When I try to import a file via XML I get this error; it doesn't matter what file format I try to index (txt, cfm, pdf), all give the same error: SEVERE: Exception while processing: rec document : SolrInputDocument[{id=id(1.0)={myTest.txt},

add to ContributorsGroup

2013-07-12 Thread Ken Geis
Hi. Could you add me (KenGeis) to the Solr Wiki ContributorsGroup? I'd like to fix some typos. Thanks, Ken Geis

Re: add to ContributorsGroup

2013-07-12 Thread Erick Erickson
Done, Thanks for helping! Erick On Fri, Jul 12, 2013 at 4:30 PM, Ken Geis kg...@speakeasy.net wrote: Hi. Could you add me (KenGeis) to the Solr Wiki ContributorsGroup? I'd like to fix some typos. Thanks, Ken Geis

add to ContributorsGroup - Instructions for setting up SolrCloud on jboss

2013-07-12 Thread Ali, Saqib
Hello, Can you please add me to the ContributorsGroup? I would like to add instructions for setting up SolrCloud using Jboss. thanks.

Re: add to ContributorsGroup - Instructions for setting up SolrCloud on jboss

2013-07-12 Thread Ali, Saqib
username: saqib On Fri, Jul 12, 2013 at 2:35 PM, Ali, Saqib docbook@gmail.com wrote: Hello, Can you please add me to the ContributorsGroup? I would like to add instructions for setting up SolrCloud using Jboss. thanks.

Re: Norms

2013-07-12 Thread Lance Norskog
Norms stay in the index even if you delete all of the data. If you just changed the schema, emptied the index, and tested again, you've still got norms in there. You can examine the index with Luke to verify this. On 07/09/2013 08:57 PM, William Bell wrote: I have a field that has

java.lang.OutOfMemoryError: Requested array size exceeds VM limit

2013-07-12 Thread Ali, Saqib
I am getting a java.lang.OutOfMemoryError: Requested array size exceeds VM limit on certain queries. Please advise: 19:25:02,632 INFO [org.apache.solr.core.SolrCore] (http-oktst1509.company.tld/12.5.105.96:8180-9) [collection1] webapp=/solr path=/select

zero-valued retrieval scores

2013-07-12 Thread Joe Zhang
When I search for a keyword (such as apple), most of the docs carry 0.0 as score. Here is an example from explain: <str name="http://www.bloomberg.com/slideshow/2013-07-12/world-at-work-india.html"> 0.0 = (MATCH) fieldWeight(content:appl in 51), product of: 1.0 = tf(termFreq(content:appl)=1)

Re: zero-valued retrieval scores

2013-07-12 Thread Jack Krupansky
Did you put a boost of 0.0 on the documents, as opposed to the default of 1.0? x * 0.0 = 0.0 -- Jack Krupansky -Original Message- From: Joe Zhang Sent: Friday, July 12, 2013 10:31 PM To: solr-user@lucene.apache.org Subject: zero-valued retrieval scores when I search a keyword (such
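
The "x * 0.0 = 0.0" point can be worked through with a small Python sketch. It follows the shape of the explain output, where fieldWeight is a product of tf, idf and fieldNorm, and assumes a classic TF-IDF length norm of boost / sqrt(number of terms) while ignoring the lossy single-byte norm encoding; the idf and term-count values are hypothetical.

```python
import math

def field_norm(boost, num_terms):
    """Index-time boost folded into a length normalisation factor."""
    return boost * (1.0 / math.sqrt(num_terms))

def field_weight(tf, idf, norm):
    """fieldWeight as reported by explain: a plain product of its factors."""
    return tf * idf * norm

# Hypothetical values: tf = 1.0, idf = 2.5, a 100-term field.
score_default = field_weight(1.0, 2.5, field_norm(1.0, 100))  # default boost of 1.0
score_zero = field_weight(1.0, 2.5, field_norm(0.0, 100))     # boost of 0.0
```

Because the boost is a factor of the norm, a boost of 0.0 wipes out the whole product regardless of how well the term matches.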

Re: zero-valued retrieval scores

2013-07-12 Thread Joe Zhang
Yes, you are right, the boost on these documents is 0. I didn't provide them, though. I suppose the boost scores come from Nutch (yes, my solr indexes crawled web docs). What could be wrong? Again, what exactly is the formula for fieldNorm? On Fri, Jul 12, 2013 at 8:46 PM, Jack Krupansky

Re: zero-valued retrieval scores

2013-07-12 Thread Jack Krupansky
For the calculation of norm, see note number 6: http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html You would need to talk to the Nutch guys to see why THEY are setting document boost to 0.0. -- Jack Krupansky -Original Message- From:

Re: zero-valued retrieval scores

2013-07-12 Thread Joe Zhang
Thanks, Jack! On Fri, Jul 12, 2013 at 9:37 PM, Jack Krupansky j...@basetechnology.com wrote: For the calculation of norm, see note number 6: http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/