Solr Replication

2013-03-14 Thread vicky desai
Hi, I am using a Solr 4 setup. For backup purposes, once a day I start one additional Tomcat server whose cores have empty data folders and which acts as a slave server. However, it does not replicate data from the master unless there is a commit on the master. Is there a possibility to pull

New-Question On Search data who does not have x field

2013-03-14 Thread anurag.jain
My previous question was: I have posted 250 documents to Solr; some of them have a category field and some don't. For example: { id:321, name:anurag, category:30 }, { id:3, name:john }. Now I want to search for the docs that do not have that field. What should the query look like? I

Re: Solr Replication

2013-03-14 Thread Ahmet Arslan
Hi Vicky, Maybe <str name="replicateAfter">startup</str>? For backups, http://master_host:port/solr/replication?command=backup would be more suitable. Or <str name="backupAfter">startup</str> --- On Thu, 3/14/13, vicky desai vicky.de...@germinait.com wrote: From: vicky desai
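As an editorial illustration of the URL pattern in the reply above, here is a minimal Python sketch that assembles ReplicationHandler request URLs. The `backup` and `details` commands are the ones named in this listing; the helper name `replication_url` and port 8983 are assumptions made only for the example, and nothing is actually sent to a server.

```python
from urllib.parse import urlencode


def replication_url(base_url, command, **params):
    """Build a URL of the form <base>/replication?command=<command>.

    base_url is the Solr (or core) root, e.g. http://master_host:8983/solr.
    Extra keyword arguments are appended as additional query parameters.
    """
    query = urlencode({"command": command, **params})
    return f"{base_url.rstrip('/')}/replication?{query}"


if __name__ == "__main__":
    # Placeholder host and port: substitute the real master address.
    base = "http://master_host:8983/solr"
    print(replication_url(base, "backup"))
    print(replication_url(base, "details"))
```

The resulting string can be requested with any HTTP client, for example curl.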

Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Chantal Ackermann
Hi all, this is not a question. I just wanted to announce that I've written a blog post on how to set up Maven for packaging and automatic testing of a SOLR index configuration. http://blog.it-agenten.com/2013/03/integration-testing-your-solr-index-with-maven/ Feedback or comments

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread David Philip
Informative. Useful. Thanks. On Thu, Mar 14, 2013 at 1:59 PM, Chantal Ackermann c.ackerm...@it-agenten.com wrote: Hi all, this is not a question. I just wanted to announce that I've written a blog post on how to set up Maven for packaging and automatic testing of a SOLR index

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Paul Libbrecht
Nice. Chantal, can you indicate there or here what kind of speed for integration tests you've reached with this, from a bare source to a successfully tested application? (e.g. with 100 documents) Thanks in advance, Paul On 14 March 2013, at 09:29, Chantal Ackermann wrote: Hi all, this

OutOfMemoryError

2013-03-14 Thread Arkadi Colson
Hi I'm getting this error after a few hours of filling solr with documents. Tomcat is running with -Xms1024m -Xmx4096m. Total memory of host is 12GB. Softcommits are done every second and hard commits every minute. Any idea why this is happening and how to avoid this? *top* PID USER

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Chantal Ackermann
Hi Paul, I'm sorry I cannot provide you with any numbers. I also doubt it would be wise to post any as I think the speed depends highly on what you are doing in your integration tests. Say you have several request handlers that you want to test (on different cores), and some more complex use

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Paul Libbrecht
Chantal, the goal is different: get a general feeling how practical it is to integrate this in the routine. If you are able, on your contemporary machine which I assume is not a supercomputer of some special sort, to run this whole process somewhat useful for you in about 2 minutes then I'll

Re: OutOfMemoryError

2013-03-14 Thread Arkadi Colson
When I shut down Tomcat, free -m and top keep telling me the same values. Almost no free memory... Any idea? On 03/14/2013 10:35 AM, Arkadi Colson wrote: Hi I'm getting this error after a few hours of filling solr with documents. Tomcat is running with -Xms1024m -Xmx4096m. Total memory of

Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Rafał Radecki
Hi All. I am monitoring two Solr 4.1 instances in a master-slave setup. On both nodes I check the URL /solr/replication?command=details and parse it to get: - on master: if replication is enabled - field replicationEnabled - on slave: if replication is enabled - field replicationEnabled - on

Re: New-Question On Search data who does not have x field

2013-03-14 Thread Jack Krupansky
Writing "OR -" is simply the same as "-", so the query would match documents containing category 20 and then remove all documents that had any category (including 20) specified, giving you nothing. Try: http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:"20" OR (*:* -category:[* TO *])
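The filter suggested above combines a value match with a "field is missing" clause. The Python sketch below builds that filter and URL-encodes it, because a raw query containing spaces, quotes and brackets is easy to mangle when pasted into a URL. The function names are hypothetical; the `/search` path, the `category` field and the parameters come from the message.

```python
from urllib.parse import urlencode


def missing_field_clause(field):
    """Match documents that have no value at all for the given field.

    A purely negative clause matches nothing on its own, so it is
    anchored on *:* (all documents) before subtracting.
    """
    return f"(*:* -{field}:[* TO *])"


def value_or_missing_fq(field, value):
    """Match documents whose field equals value OR that lack the field."""
    return f'{field}:"{value}" OR {missing_field_clause(field)}'


def select_url(base_url, fq, start=0):
    """Assemble a properly URL-encoded request with q=*:* and one fq."""
    params = {"q": "*:*", "wt": "json", "start": start, "fq": fq}
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Only documents without a category field:
    print(select_url("http://localhost:8983/search", missing_field_clause("category")))
    # Category 20, or no category at all:
    print(select_url("http://localhost:8983/search", value_or_missing_fq("category", 20)))
```

For the original question (documents that lack the field entirely), the first call is the relevant one.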

Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Rafał Radecki
In the output of: /solr/replication?command=details there is indexVersion mentioned many times: <response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">3</int> </lst> <lst name="details"> <str name="indexSize">22.59 KB</str> <str name="indexPath">/usr/share/solr/data/index/</str> <arr name="commits"> <lst>
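Since indexVersion shows up in several places in that response, a monitoring script needs to know which occurrence it is reading. The Python sketch below walks the XML and reports every indexVersion together with the path of enclosing `name` attributes. The sample document is a made-up, shortened illustration modeled on the fragment quoted above, not real output, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only: shortened and invented for this sketch.
SAMPLE = """<response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">3</int></lst>
  <lst name="details">
    <str name="indexSize">22.59 KB</str>
    <long name="indexVersion">1363272681951</long>
    <arr name="commits">
      <lst><long name="indexVersion">1363272681951</long></lst>
    </arr>
  </lst>
</response>"""


def index_versions(xml_text):
    """Return (path, value) pairs for every element named indexVersion.

    The path is built from the name attributes of the enclosing elements,
    so occurrences at different nesting levels can be told apart.
    """
    results = []

    def walk(elem, path):
        name = elem.get("name")
        here = path + [name] if name else path
        if name == "indexVersion":
            results.append(("/".join(here), int(elem.text)))
        for child in elem:
            walk(child, here)

    walk(ET.fromstring(xml_text), [])
    return results


if __name__ == "__main__":
    for path, version in index_versions(SAMPLE):
        print(path, version)
```

Comparing the values found under the same path on master and slave is one way to check whether the two are in sync.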

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-14 Thread Luis Cappa Banda
Hello! Thanks a lot, Erick! I've attached some stack traces during a normal 'engine' running. Cheers, - Luis Cappa 2013/3/13 Erick Erickson erickerick...@gmail.com Stack traces.. First, jps -l that will give you the process IDs of your running Java processes. Then: jstack <pid> from

Re: Poll: Largest SolrCloud out there?

2013-03-14 Thread Christian von Wendt-Jensen
Does it only count if you are using SolrCloud? We are using a traditional Master/Slave setup with Solr 4.1: 1 Master per 14 days: Documents: ~15mio Index size: ~150GB (stored fields) #of masters: +30 Performance: SUCKS big time until the caches catch up. Unfortunately that takes quite some

Advice: solrCloud + DIH

2013-03-14 Thread roySolr
Hello, I need some advice with my SolrCloud cluster and the DIH. I have a cluster with 3 cloud servers. Every server has a Solr instance and a ZooKeeper instance. I start it with the -Dzkhost parameter. It works great; I send updates with curl (XML) like this: curl http://ip:SOLRport/solr/update

Re: Poll: Largest SolrCloud out there?

2013-03-14 Thread Otis Gospodnetic
Christian, SSDs will warm up muuuch faster. Your other questions require more info / discussion. Otis Solr & ElasticSearch Support http://sematext.com/ On Mar 14, 2013 8:47 AM, Christian von Wendt-Jensen christian.vonwendt-jen...@infopaq.com wrote: Does it only count if you are using

Re: OutOfMemoryError

2013-03-14 Thread Toke Eskildsen
On Thu, 2013-03-14 at 13:10 +0100, Arkadi Colson wrote: When I shutdown tomcat free -m and top keeps telling me the same values. Almost no free memory... Any idea? Are you reading top & free right? It is standard behaviour for most modern operating systems to have very little free memory. As

Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Mark Miller
On Mar 14, 2013, at 8:10 AM, Rafał Radecki radecki.ra...@gmail.com wrote: Is this a bug? Yes, 4.1 had some replication issues just as you seem to describe here. It all should be fixed in 4.2 which is available now and is a simple upgrade. - Mark

Re: Advice: solrCloud + DIH

2013-03-14 Thread Mark Miller
On Mar 14, 2013, at 9:22 AM, roySolr royrutten1...@gmail.com wrote: Hello, When i run this it goes with 3 doc/s(Really slow). When i run solr alone(not solrcloud) it goes 600 docs/sec. What's the best way to do a full re-index with solrcloud? Does solrcloud support DIH? Thanks

Re: OutOfMemoryError

2013-03-14 Thread Arkadi Colson
On 03/14/2013 03:11 PM, Toke Eskildsen wrote: On Thu, 2013-03-14 at 13:10 +0100, Arkadi Colson wrote: When I shutdown tomcat free -m and top keeps telling me the same values. Almost no free memory... Any idea? Are you reading top & free right? It is standard behaviour for most modern

Replication

2013-03-14 Thread Arkadi Colson
On what basis does Solr replicate the whole shard again from scratch? From time to time, after a restart of Tomcat, Solr copies the whole shard over to the replica instead of transferring only the changes. BR, Arkadi

Question about email search

2013-03-14 Thread Jorge Luis Betancourt Gonzalez
I'm using Solr 3.6.2 to index data crawled with Nutch. In my schema I have one field with all the content extracted from the page, which could possibly include email addresses. This is the configuration of my schema: <fieldType name="text" class="solr.TextField"

Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread richardg
I believe this is the same issue as described, I'm running 4.2 and as you can see my slave is a couple versions ahead of the master (all three slaves show the same behavior). This was never the case until I upgraded from 4.0 to 4.2. Master: 1363272681951 93 1,022.31 MB Slave: 1363273274085 95

Re: Question about email search

2013-03-14 Thread Ahmet Arslan
Hi, Since you have a word delimiter filter in your analysis chain, I am not sure if e-mail addresses are recognised. You can check that on the Solr admin UI, analysis page. If e-mail addresses are kept as one token, I would use a leading wildcard query: q=*@gmail.com There was a similar question recently:

Strange error in Solr 4.2

2013-03-14 Thread Uwe Klosa
Hi We have been using Solr 4.0 for a while now and wanted to upgrade to 4.2. But our application stopped working. When we tried 4.1 it was working as expected. Here is a description of the situation. We deploy a Solr web application under java 7 on a Glassfish 3.1.2.2 server. We added some

Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Mark Miller
What calls are you using to get the versions? Or is it the admin UI? Also can you add any details about your setup - if this is a problem, we need to duplicate it in one of our unit tests. Also, is it affecting proper replication in any way that you can tell. - Mark On Mar 14, 2013, at 11:12

Re: Strange error in Solr 4.2

2013-03-14 Thread Mark Miller
Perhaps as a result of https://issues.apache.org/jira/browse/SOLR-4451 ? Just a guess. The root cause looks to be: Caused by: java.io.IOException: Keystore was tampered with, or password was incorrect - Mark On Mar 14, 2013, at 11:24 AM, Uwe Klosa uwe.kl...@gmail.com wrote: Hi We have

need general advice on how others version and manage core deployments over time

2013-03-14 Thread geeky2
hello everyone, i know this is a general topic - but would really appreciate info from others that are doing this now. - how are others managing this so that users are impacted the least - how are others handling the scenario where users don't want to migrate forward. thx mark --

Re: Strange error in Solr 4.2

2013-03-14 Thread Uwe Klosa
Thanks, but nobody has tampered with keystores. I have tested the application on different machines. Always the same exception is thrown. Do we have to set some system property to fix this? /Uwe On 14 March 2013 16:36, Mark Miller markrmil...@gmail.com wrote: Perhaps as a result of

Handling a closed IndexWriter in SOLR 4.0

2013-03-14 Thread Danzig, Scott
Hey all, We're using a Solr 4 core to handle our article data. When someone in our CMS publishes an article, we have a listener that indexes it straight to solr. We use the previously instantiated HttpSolrServer, build the solr document, add it with server.add(doc) .. then do a

Re: Strange error in Solr 4.2

2013-03-14 Thread Uwe Klosa
I found the answer myself. Thanks for the pointer. Cheers Uwe On 14 March 2013 16:48, Uwe Klosa uwe.kl...@gmail.com wrote: Thanks, but nobody has tampered with keystores. I have tested the application on different machines. Always the same exception is thrown. Do we have to set some system

Re: Replication

2013-03-14 Thread Timothy Potter
Hi Arkadi, If the update delta between the shard leader and replica is > 100 docs, then Solr punts and replicates the entire index. Last I heard, the 100 was hard-coded in 4.0 so is not configurable. This makes sense because the replica shouldn't be out-of-sync with the leader unless it has been
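The rule described in the reply above can be pictured as a simple threshold check. The Python sketch below only restates that rule as given in the message; it is not Solr's code, and the function name, the strategy labels and the constant name are all hypothetical.

```python
# Threshold quoted in the message above, said there to be hard-coded in 4.0.
FULL_REPLICATION_THRESHOLD = 100


def recovery_strategy(missed_updates, threshold=FULL_REPLICATION_THRESHOLD):
    """Illustrate the described rule: small gaps are caught up incrementally,
    larger gaps trigger a copy of the whole index.

    missed_updates is the number of documents the replica is behind the leader.
    """
    if missed_updates < 0:
        raise ValueError("missed_updates must be non-negative")
    if missed_updates > threshold:
        return "full-replication"
    return "replay-missed-updates"


if __name__ == "__main__":
    for gap in (0, 100, 101, 5000):
        print(gap, recovery_strategy(gap))
```

Under this reading, a replica that missed many updates while Tomcat was down would be expected to fetch the whole shard after the restart.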

Out of Memory doing a query Solr 4.2

2013-03-14 Thread raulgrande83
Hi, After doing a query to Solr to get the uniqueIds (strings of 20 characters) of 700 documents in a collection, I'm getting an out of memory error using Solr 4.2. I tried to increase the JVM memory by 1G (from 3G to 4G); however, this didn't change anything. This was working on 3.5. I've moved

ids request to shard with star query are slow

2013-03-14 Thread srinir
ids request to shard with star query are slow I have a distributed Solr environment and I am investigating all the requests where the shard took a significant amount of time. One common pattern I saw was that all the ids requests with q=*:* and ids=<some id> took around 2-3 sec. I picked some shard request

Re: Strange error in Solr 4.2

2013-03-14 Thread Stefan Matheis
On Thursday, March 14, 2013 at 4:57 PM, Uwe Klosa wrote: I found the answer myself. Thanks for the pointer. Would you mind sharing your answer, Uwe?

Re: Out of Memory doing a query Solr 4.2

2013-03-14 Thread Robert Muir
On Thu, Mar 14, 2013 at 12:07 PM, raulgrande83 raulgrand...@hotmail.com wrote: JVM: IBM J9 VM(1.6.0.2.4) I don't recommend using this JVM.

Re: Strange error in Solr 4.2

2013-03-14 Thread Shawn Heisey
On 3/14/2013 9:24 AM, Uwe Klosa wrote: This exception occurs in this part new ConcurrentUpdateSolrServer("http://solr.diva-portal.org:8080/search", 5, 50) Side comment, unrelated to your question: If you're already aware that ConcurrentUpdateSolrServer has no built-in error handling and

Re: Strange error in Solr 4.2

2013-03-14 Thread Mark Miller
On Mar 14, 2013, at 1:27 PM, Shawn Heisey s...@elyograg.org wrote: I have been told that it is possible to override the handleError method to fix this I'd say mitigate more than fix. I think the real fix requires some dev work. - Mark

Re: OutOfMemoryError

2013-03-14 Thread Shawn Heisey
On 3/14/2013 3:35 AM, Arkadi Colson wrote: Hi I'm getting this error after a few hours of filling solr with documents. Tomcat is running with -Xms1024m -Xmx4096m. Total memory of host is 12GB. Softcommits are done every second and hard commits every minute. Any idea why this is happening and

Meaning of Current in Solr Cloud Statistics

2013-03-14 Thread Michael Della Bitta
Hi everyone, Is there an official definition of the Current flag under Core Home Statistics? What would it mean if a shard leader is not Current? Thanks, Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271

Solr 4.2 mechanism proxy request error

2013-03-14 Thread yriveiro
Hi, I think that in solr 4.2 the new feature to proxy a request if the collection is not in the requested node has a bug. If I do a query with the parameter rows=0 and the node doesn't have the collection. If the parameter is rows=4 or superior then the search works as expected the curl

Re: Solr 4.2 mechanism proxy request error

2013-03-14 Thread Mark Miller
I'll add a test with rows = 0 and see how easy it is to replicate. Looks to me like you should file a JIRA issue in any case. - Mark On Mar 14, 2013, at 2:04 PM, yriveiro yago.rive...@gmail.com wrote: Hi, I think that in solr 4.2 the new feature to proxy a request if the collection is

Re: Solr 4.2 mechanism proxy request error

2013-03-14 Thread yriveiro
The log of the UI null:org.apache.solr.common.SolrException: Error trying to proxy request for url: http://192.168.20.47:8983/solr/ST-3A856BBCA3_12/select I will open the issue in Jira. Thanks - Best regards -- View this message in context:

Re: Version conflict during data import from another Solr instance into clean Solr

2013-03-14 Thread Chris Hostetter
: It looks strange to me that if there is no document yet (foundVersion < 0) : then the only case when document will be imported is when input version is : negative. Guess I need to test specific cases using SolrJ or smth. to be sure. you're assuming that if foundVersion < 0 that means no document

Re: Question about email search

2013-03-14 Thread Jorge Luis Betancourt Gonzalez
Sorry for the duplicated mail :-(, any advice on a configuration for searching emails in a field that does not have only email addresses, so the email addresses are contained in larger textual messages? - Original message - From: Ahmet Arslan iori...@yahoo.com To:

Searching across multiple collections (cores)

2013-03-14 Thread kfdroid
I've been looking all over for a clear answer to this question and can't seem to find one. It seems like a very basic concept to me though so maybe I'm using the wrong terminology. I want to be able to search across multiple collections (as it is now called in SolrCloud world, previously called

Re: Searching across multiple collections (cores)

2013-03-14 Thread Mark Miller
Yes, with SolrCloud, it's just the collection param (as long as the schemas are compatible for this): http://wiki.apache.org/solr/SolrCloud#Distributed_Requests - Mark On Mar 14, 2013, at 2:55 PM, kfdroid kfdr...@gmail.com wrote: I've been looking all over for a clear answer to this question

Re: Meaning of Current in Solr Cloud Statistics

2013-03-14 Thread Stefan Matheis
Hey Michael I was a bit confused because you mentioned SolrCloud in the subject. We're talking about http://host:port/solr/#/collection1 (f.e.) right? And there, the left-upper Box Statistics ? If so, the Output comes from /solr/collection1/admin/luke (

Re: Meaning of Current in Solr Cloud Statistics

2013-03-14 Thread Michael Della Bitta
Stefan, Thanks a lot! Makes sense. So I don't have to worry about my leader thinking it's out of date, then. Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271 www.appinions.com Where Influence Isn’t a Game

Re: Meaning of Current in Solr Cloud Statistics

2013-03-14 Thread Stefan Matheis
Perhaps the wording of Current is a bit too generic in that context? I'd like to change that description if that clarifies things .. but not sure which one is a better fit? On Thursday, March 14, 2013 at 8:26 PM, Michael Della Bitta wrote: Stefan, Thanks a lot! Makes sense. So I don't

Re: Meaning of Current in Solr Cloud Statistics

2013-03-14 Thread Mark Miller
Something like 'Reader is Current' might be better. Personally, I don't even know if it's worth showing. - Mark On Mar 14, 2013, at 3:40 PM, Stefan Matheis matheis.ste...@gmail.com wrote: Perhaps the wording of Current is a bit too generic in that context? I'd like to change that

Solr indexing binary files

2013-03-14 Thread Luis
Hi, I am new with Solr and I am extracting metadata from binary files through URLs stored in my database. I would like to know what fields are available for indexing from PDFs (the ones that would be initiated as in column=””). For example how would I extract something like file size, format or

Re: Solr indexing binary files

2013-03-14 Thread Jack Krupansky
Take a look at Solr Cell: http://wiki.apache.org/solr/ExtractingRequestHandler Include a dynamicField with a * pattern and you will see the wide variety of metadata that is available for PDF and other rich document formats. -- Jack Krupansky -Original Message- From: Luis Sent:

Re: Question about email search

2013-03-14 Thread Alexandre Rafalovitch
Sure. copyField it into a new indexed non-stored field with the following type definition: <fieldType name="address_email" class="solr.TextField"> <analyzer> <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/> <filter class="solr.TypeTokenFilterFactory" types="filter_email.txt"

Re: Handling a closed IndexWriter in SOLR 4.0

2013-03-14 Thread Otis Gospodnetic
Hi Scott, Not sure why IW would be closed, but: * consider not (hard) committing after each doc, but just periodically, every N minutes * soft committing instead * using 4.2 Otis -- Solr & ElasticSearch Support http://sematext.com/ On Thu, Mar 14, 2013 at 11:55 AM, Danzig, Scott

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Lance Norskog
Wow! That's great. And it's a lot of work, especially getting it all keyboard-complete. Thank you. On 03/14/2013 01:29 AM, Chantal Ackermann wrote: Hi all, this is not a question. I just wanted to announce that I've written a blog post on how to set up Maven for packaging and automatic

Re: Advice: solrCloud + DIH

2013-03-14 Thread rulinma
3 docs/s is low. In my tests with 4 nodes I get more than 1000 docs/s at 4 KB/doc with SolrCloud. Every leader has a replica. I am tuning to reach 3000 docs/s. 3 docs/s is too slow. Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Advice-solrCloud-DIH-tp4047339p4047559.html Sent

Re: Embedded Solr

2013-03-14 Thread rulinma
Here is something for you to test embedded Solr: import java.io.File; import java.io.IOException; import java.net.MalformedURLException; import java.util.ArrayList; import java.util.Collection; import org.apache.lucene.analysis.Analyzer; import org.apache.lucene.analysis.core.SimpleAnalyzer; import

Re: discovery-based core enumeration with embedded solr

2013-03-14 Thread Erick Erickson
H, could you raise a JIRA and assign it to me? Please be sure and emphasize that it's embedded because I'm pretty sure this is fine for the regular case. But I have to admit that the embedded case completely slipped under the radar. Even better if you could make a test case, but that might

Re: Can we manipulate termfreq to count as 1 for multiple matches?

2013-03-14 Thread Felipe Lahti
Hi! Take a look at http://wiki.apache.org/solr/SchemaXml#Common_field_options parameter *omitTermFreqAndPositions*, or you can use a custom similarity class that overrides the term freq and returns one for only that field. http://wiki.apache.org/solr/SchemaXml#Similarity <fieldType name="text_dfr"

SOLR Num Docs vs NumFound

2013-03-14 Thread Nathan Findley
On my solr 4 setup a query returns a higher NumFound value during a *:* query than the Num Docs value reported on the statistics page of collection1. Why is that? My data is split across 3 data import handlers where each handler has the same type of data but the ids are guaranteed to be

Re: Solr Replication

2013-03-14 Thread vicky desai
Hi, I have a multi-core setup and there are continuous updates going on in each core. Hence I would rather not use a backup, as it would either cause downtime or, if there is write activity during the backup, my backup will be corrupted. Can you please suggest a cleaner way to handle this --