Re: Problem with new solr.xml format and core swaps
Well, at least it's _some_ progress ;).

Agreed, the segments hanging around is still something of a mystery, although if I really stretch I could relate them, maybe. I believe there's clean-up logic when a core starts up to nuke cruft in the index directory. If the cruft was created after a core swap on the core where Solr couldn't write the core.properties file, then when the core started back up, is it possible that it was looking in the wrong directory to clean stuff up?

This is a total and complete guess though, as I don't know that bit of code. If the undeleted segment files were in a directory related to the core whose core.properties file wasn't persisted, that would lend some credence to the idea though.

FWIW,
Erick

On Tue, Apr 7, 2015 at 12:18 PM, Shawn Heisey wrote:
> On 4/7/2015 10:54 AM, Erick Erickson wrote:
>> I'm pretty clueless why you would be seeing this, and slammed with
>> other stuff so I can't dig into this right now.
>>
>> What do the "core.properties" files look like when you see this? They
>> should be re-written when you swap cores. Hmmm, I wonder if there's
>> some condition where the files are already open and the persistence
>> fails? If so we should be logging that error, I have no proof either
>> way whether we are or not though.
>>
>> Guessing that your log files in the problem case weren't all that
>> helpful, but let's have a look at them if this occurs again?
>
> I hadn't had a chance to review the logs, but when I did just now, I
> found this:
>
> ERROR - 2015-04-07 11:56:15.568;
> org.apache.solr.core.CorePropertiesLocator; Couldn't persist core
> properties to /index/solr4/cores/sparkinc_0/core.properties:
> java.io.FileNotFoundException:
> /index/solr4/cores/sparkinc_0/core.properties (Permission denied)
>
> That's fairly clear. I guess my permissions were wrong. My best guess
> as to why -- things owned by root from when I created the
> core.properties files. Solr does not run as root. I didn't think to
> actually look at the permissions before I ran a script that I maintain
> which fixes all the ownership on my various directories involved in my
> full search installation.
>
> I don't think this explains the not-deleted segment files problem.
> Those segment files were written by solr running as the regular user, so
> there couldn't have been a permission problem.
>
> Thanks,
> Shawn
Re: Problem with new solr.xml format and core swaps
On 4/7/2015 10:54 AM, Erick Erickson wrote:
> I'm pretty clueless why you would be seeing this, and slammed with
> other stuff so I can't dig into this right now.
>
> What do the "core.properties" files look like when you see this? They
> should be re-written when you swap cores. Hmmm, I wonder if there's
> some condition where the files are already open and the persistence
> fails? If so we should be logging that error, I have no proof either
> way whether we are or not though.
>
> Guessing that your log files in the problem case weren't all that
> helpful, but let's have a look at them if this occurs again?

I hadn't had a chance to review the logs, but when I did just now, I found this:

ERROR - 2015-04-07 11:56:15.568; org.apache.solr.core.CorePropertiesLocator; Couldn't persist core properties to /index/solr4/cores/sparkinc_0/core.properties: java.io.FileNotFoundException: /index/solr4/cores/sparkinc_0/core.properties (Permission denied)

That's fairly clear. I guess my permissions were wrong. My best guess as to why -- things owned by root from when I created the core.properties files. Solr does not run as root. I didn't think to actually look at the permissions before I ran a script that I maintain which fixes all the ownership on my various directories involved in my full search installation.

I don't think this explains the not-deleted segment files problem. Those segment files were written by solr running as the regular user, so there couldn't have been a permission problem.

Thanks,
Shawn
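For anyone hitting the same "Permission denied" error, here is a minimal sketch of an ownership check that could be run before the next swap. It is not part of Solr or of Shawn's maintenance script; the function name is hypothetical, and it assumes a POSIX system and the one-directory-per-core layout shown in this thread.

```python
import os
from pathlib import Path


def find_foreign_owned(cores_root, expected_uid):
    """List core directories and core.properties files not owned by expected_uid.

    Assumption: each core lives in <cores_root>/<core_dir>/core.properties,
    as in the /index/solr4/cores layout described in this thread.
    """
    suspects = []
    for props in sorted(Path(cores_root).glob("*/core.properties")):
        # Check the directory as well as the file: either one being owned
        # by another account (root, for example) is worth a closer look.
        for path in (props.parent, props):
            if path.stat().st_uid != expected_uid:
                suspects.append(path)
    return suspects
```

Run as the account Solr runs under, something like find_foreign_owned("/index/solr4/cores", os.getuid()) should return an empty list when the ownership is consistent. Ownership is only a proxy for writability, so group and mode bits may still need a look.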
Re: Problem with new solr.xml format and core swaps
Shawn:

I'm pretty clueless why you would be seeing this, and slammed with other stuff so I can't dig into this right now.

What do the "core.properties" files look like when you see this? They should be re-written when you swap cores. Hmmm, I wonder if there's some condition where the files are already open and the persistence fails? If so we should be logging that error, I have no proof either way whether we are or not though.

Guessing that your log files in the problem case weren't all that helpful, but let's have a look at them if this occurs again?

Sorry I can't be more help,
Erick

On Mon, Apr 6, 2015 at 8:38 PM, Shawn Heisey wrote:
> On 4/6/2015 6:40 PM, Erick Erickson wrote:
>> What version are you migrating _from_? 4.9.0? There were some
>> persistence issues at one point, but AFAIK they were fixed by 4.9, I
>> can check if you're on an earlier version...
>
> Effectively there is no previous version. Whenever I upgrade, I delete
> all the data directories and completely reindex. When I converted from
> the old solr.xml to core discovery, the server was already on 4.9.1.
>
> Thanks,
> Shawn
Re: Problem with new solr.xml format and core swaps
On 4/6/2015 6:40 PM, Erick Erickson wrote:
> What version are you migrating _from_? 4.9.0? There were some
> persistence issues at one point, but AFAIK they were fixed by 4.9, I
> can check if you're on an earlier version...

Effectively there is no previous version. Whenever I upgrade, I delete all the data directories and completely reindex. When I converted from the old solr.xml to core discovery, the server was already on 4.9.1.

Thanks,
Shawn
Re: Problem with new solr.xml format and core swaps
Shawn:

What version are you migrating _from_? 4.9.0? There were some persistence issues at one point, but AFAIK they were fixed by 4.9, I can check if you're on an earlier version...

Erick

On Sun, Apr 5, 2015 at 2:05 PM, Shawn Heisey wrote:
> I'm having two problems with Solr 4.9.1. I can't upgrade yet, because
> we are using a third-party plugin component that is not yet explicitly
> qualified for anything newer than 4.9.0. The point release upgrade
> seemed like a safe bet, because I know that we don't do API changes in
> point releases. These are transient problems, and do not seem to be
> affecting the index at this time.
>
> Some background info:
>
> Ubuntu 14, Java 8u40 from the webupd8 PPA, Solr 4.9.1. It is *NOT*
> SolrCloud.
>
> Full rebuilds on my index involve building a new index in cores that I
> have designated "build" cores, then swapping those cores with "live"
> cores. This always worked flawlessly before I updated to Solr 4.9.1 and
> migrated the config to use core discovery.
>
> root@idxb4:~# cat /index/solr4/cores/sparkinc_0/core.properties
> name=sparkinclive
> dataDir=../../data/sparkinc_0
>
> root@idxb4:~# cat /index/solr4/cores/sparkinc_1/core.properties
> name=sparkincbuild
> dataDir=../../data/sparkinc_1
>
> The first problem: Sometimes, in a completely unpredictable manner, the
> new solr.xml format seems to behave like using the old format with
> persistent=false.
>
> When I restarted Solr yesterday, that action swapped the live cores with
> the build cores and I lost half my index because it swapped back to the
> previous build cores. Just now when I tried a restart, everything
> worked flawlessly and the cores did not swap.
>
> The second problem: Sometimes old index segments do not get deleted,
> even though they are not part of the index.
>
> Another part of the full rebuild process involves clearing the build
> cores before beginning the full import. The code does a deleteByQuery
> with *:* and then optimizes the core. Sometimes this action fails to
> delete the old segment files, but when I checked the core Overview in
> the admin UI, numDocs only reflected the newly indexed docs and
> deletedDocs was 0.
>
> It was actually while trying to fix/debug this second problem that I
> discovered the first problem. Once the rebuild finished, I wanted to
> see what would happen if I restarted Solr while one of my cores had 32GB
> of segment files that were not part of the index ... but that's when the
> indexes swapped. At that point, I deleted all the dataDirs on both
> machines (it's a distributed index), restarted Solr again, and began a
> full rebuild. Everything seems to be fine now.
>
> Are either of these problems anything that anyone has seen? I don't
> recall seeing anything come across the list before. Are there existing
> issues in Jira? Is there any information that I can provide which would
> help in narrowing down the problem?
>
> Thanks,
> Shawn
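For readers unfamiliar with the rebuild cycle quoted above (clear the build core, optimize, reindex, then swap build and live), the HTTP requests involved can be sketched roughly as follows. This is not Shawn's actual code, which is not shown in the thread. The base URL, the function names, and the choice of XML update commands are assumptions for illustration; the sketch only builds the requests and does not send them.

```python
from urllib.parse import urlencode

# Assumption: a local Solr on the default port and context path.
SOLR_BASE = "http://localhost:8983/solr"


def clear_core_requests(core):
    """Return (url, xml_body) pairs that delete every document, then optimize.

    Each body would be POSTed to the core's update handler with a
    text/xml content type.
    """
    url = f"{SOLR_BASE}/{core}/update"
    return [
        (url, "<delete><query>*:*</query></delete>"),
        (url, "<optimize/>"),
    ]


def swap_url(core, other):
    """Return the CoreAdmin SWAP URL that exchanges the two named cores."""
    query = urlencode({"action": "SWAP", "core": core, "other": other})
    return f"{SOLR_BASE}/admin/cores?{query}"
```

With the core names from this thread, swap_url("sparkincbuild", "sparkinclive") gives the request that exchanges the build and live cores; it is after this step that both core.properties files are expected to be rewritten.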
Problem with new solr.xml format and core swaps
I'm having two problems with Solr 4.9.1. I can't upgrade yet, because we are using a third-party plugin component that is not yet explicitly qualified for anything newer than 4.9.0. The point release upgrade seemed like a safe bet, because I know that we don't do API changes in point releases. These are transient problems, and do not seem to be affecting the index at this time.

Some background info:

Ubuntu 14, Java 8u40 from the webupd8 PPA, Solr 4.9.1. It is *NOT* SolrCloud.

Full rebuilds on my index involve building a new index in cores that I have designated "build" cores, then swapping those cores with "live" cores. This always worked flawlessly before I updated to Solr 4.9.1 and migrated the config to use core discovery.

root@idxb4:~# cat /index/solr4/cores/sparkinc_0/core.properties
name=sparkinclive
dataDir=../../data/sparkinc_0

root@idxb4:~# cat /index/solr4/cores/sparkinc_1/core.properties
name=sparkincbuild
dataDir=../../data/sparkinc_1

The first problem: Sometimes, in a completely unpredictable manner, the new solr.xml format seems to behave like using the old format with persistent=false.

When I restarted Solr yesterday, that action swapped the live cores with the build cores and I lost half my index because it swapped back to the previous build cores. Just now when I tried a restart, everything worked flawlessly and the cores did not swap.

The second problem: Sometimes old index segments do not get deleted, even though they are not part of the index.

Another part of the full rebuild process involves clearing the build cores before beginning the full import. The code does a deleteByQuery with *:* and then optimizes the core. Sometimes this action fails to delete the old segment files, but when I checked the core Overview in the admin UI, numDocs only reflected the newly indexed docs and deletedDocs was 0.

It was actually while trying to fix/debug this second problem that I discovered the first problem. Once the rebuild finished, I wanted to see what would happen if I restarted Solr while one of my cores had 32GB of segment files that were not part of the index ... but that's when the indexes swapped. At that point, I deleted all the dataDirs on both machines (it's a distributed index), restarted Solr again, and began a full rebuild. Everything seems to be fine now.

Are either of these problems anything that anyone has seen? I don't recall seeing anything come across the list before. Are there existing issues in Jira? Is there any information that I can provide which would help in narrowing down the problem?

Thanks,
Shawn
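One way to tell whether a swap was actually persisted is to record which core name each core.properties file holds right after the swap and compare that against what is on disk before the next restart. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, and the parser handles plain key=value lines like the ones shown in this message, not the full Java properties syntax (escapes, continuation lines, ":" separators).

```python
from pathlib import Path


def read_core_properties(path):
    """Parse simple key=value lines; blank lines and comments are skipped."""
    props = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def core_layout(cores_root):
    """Map each core directory name to the core name persisted inside it."""
    layout = {}
    for props_file in sorted(Path(cores_root).glob("*/core.properties")):
        props = read_core_properties(props_file)
        layout[props_file.parent.name] = props.get("name")
    return layout
```

Comparing core_layout("/index/solr4/cores") taken just after a swap with the same call made just before a restart would show whether the files on disk still carry the pre-swap names, which is the symptom described as the first problem.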