Re: Problem with new solr.xml format and core swaps

2015-04-08 Thread Erick Erickson
Well, at least it's _some_ progress ;).

Agreed, the segments hanging around is still something of a mystery
although if I really stretch I could relate them, maybe.

I believe there's clean-up logic when a core starts up to nuke cruft
in the index directory. If the cruft was created after a core swap on
the core where Solr couldn't write the core.properties file, then when
the core started back up, is it possible that it was looking in the
wrong directory to clean stuff up? This is a total and complete guess,
though, as I don't know that bit of code.

If the undeleted segment files were in a directory related to the core
whose core.properties file wasn't persisted, that would lend some
credence to the idea though.
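
One way to check this is to list which data directory each
core.properties file actually points at. The following is a minimal
Python sketch, not anything shipped with Solr. The function names are
made up for illustration, and it assumes that a relative dataDir is
resolved against the directory containing core.properties and that the
core name defaults to that directory's name when no name property is
present.

```python
import os
from pathlib import Path


def read_properties(path):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line[0] in "#!" or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def core_data_dirs(cores_root):
    """Return {core directory name: (core name, normalized dataDir path)}."""
    result = {}
    for prop_file in sorted(Path(cores_root).glob("*/core.properties")):
        instance_dir = prop_file.parent
        props = read_properties(prop_file)
        # Assumption: the core name falls back to the directory name.
        name = props.get("name", instance_dir.name)
        # Assumption: a relative dataDir is taken relative to the core's
        # own directory, and "data" is used when dataDir is not set.
        joined = instance_dir / props.get("dataDir", "data")
        # normpath collapses ".." without following symlinks.
        result[instance_dir.name] = (name, Path(os.path.normpath(joined)))
    return result
```

Calling core_data_dirs("/index/solr4/cores") on the layout described in
this thread would show which dataDir belongs to sparkinc_0 and
sparkinc_1, which can then be compared with the directory holding the
leftover segment files.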

FWIW,
Erick

On Tue, Apr 7, 2015 at 12:18 PM, Shawn Heisey wrote:
> On 4/7/2015 10:54 AM, Erick Erickson wrote:
>> I'm pretty clueless why you would be seeing this, and slammed with
>> other stuff so I can't dig into this right now.
>>
>> What do the "core.properties" files look like when you see this? They
>> should be re-written when you swap cores. Hmmm, I wonder if there's
>> some condition where the files are already open and the persistence
>> fails? If so we should be logging that error. I have no proof either
>> way whether we are or not, though.
>>
>> Guessing that your log files in the problem case weren't all that
>> helpful, but let's have a look at them if this occurs again?
>
> I hadn't had a chance to review the logs, but when I did just now, I
> found this:
>
> ERROR - 2015-04-07 11:56:15.568;
> org.apache.solr.core.CorePropertiesLocator; Couldn't persist core
> properties to /index/solr4/cores/sparkinc_0/core.properties:
> java.io.FileNotFoundException:
> /index/solr4/cores/sparkinc_0/core.properties (Permission denied)
>
> That's fairly clear.  I guess my permissions were wrong.  My best guess
> as to why -- things owned by root from when I created the
> core.properties files.  Solr does not run as root.  I didn't think to
> actually look at the permissions before I ran a script that I maintain
> which fixes all the ownership on my various directories involved in my
> full search installation.
>
> I don't think this explains the not-deleted segment files problem.
> Those segment files were written by Solr running as the regular user, so
> there couldn't have been a permission problem.
>
> Thanks,
> Shawn
>


Re: Problem with new solr.xml format and core swaps

2015-04-07 Thread Shawn Heisey
On 4/7/2015 10:54 AM, Erick Erickson wrote:
> I'm pretty clueless why you would be seeing this, and slammed with
> other stuff so I can't dig into this right now.
>
> What do the "core.properties" files look like when you see this? They
> should be re-written when you swap cores. Hmmm, I wonder if there's
> some condition where the files are already open and the persistence
> fails? If so we should be logging that error. I have no proof either
> way whether we are or not, though.
>
> Guessing that your log files in the problem case weren't all that
> helpful, but let's have a look at them if this occurs again?

I hadn't had a chance to review the logs, but when I did just now, I
found this:

ERROR - 2015-04-07 11:56:15.568;
org.apache.solr.core.CorePropertiesLocator; Couldn't persist core
properties to /index/solr4/cores/sparkinc_0/core.properties:
java.io.FileNotFoundException:
/index/solr4/cores/sparkinc_0/core.properties (Permission denied)

That's fairly clear.  I guess my permissions were wrong.  My best guess
as to why -- things owned by root from when I created the
core.properties files.  Solr does not run as root.  I didn't think to
actually look at the permissions before I ran a script that I maintain
which fixes all the ownership on my various directories involved in my
full search installation.

I don't think this explains the not-deleted segment files problem.
Those segment files were written by Solr running as the regular user, so
there couldn't have been a permission problem.
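
A quick way to catch this kind of ownership problem before a swap is to
test whether the Solr user can write each core.properties file. The
following is a minimal Python sketch with a hypothetical function name.
It assumes it is run as the same user that runs Solr, since the answer
depends on who is asking.

```python
import os
from pathlib import Path


def unwritable_core_properties(cores_root):
    """List core.properties files the current user could not rewrite."""
    problems = []
    for prop_file in sorted(Path(cores_root).glob("*/core.properties")):
        # Check both the file and its directory, since either one could
        # still be owned by another user such as root.
        file_ok = os.access(prop_file, os.W_OK)
        dir_ok = os.access(prop_file.parent, os.W_OK)
        if not (file_ok and dir_ok):
            problems.append(prop_file)
    return problems
```

An empty list from unwritable_core_properties("/index/solr4/cores")
would suggest that a "Permission denied" error like the one above is
not expected for those files. Running it as root says nothing useful,
because root can write almost anything.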

Thanks,
Shawn



Re: Problem with new solr.xml format and core swaps

2015-04-07 Thread Erick Erickson
Shawn:

I'm pretty clueless why you would be seeing this, and slammed with
other stuff so I can't dig into this right now.

What do the "core.properties" files look like when you see this? They
should be re-written when you swap cores. Hmmm, I wonder if there's
some condition where the files are already open and the persistence
fails? If so we should be logging that error. I have no proof either
way whether we are or not, though.

Guessing that your log files in the problem case weren't all that
helpful, but let's have a look at them if this occurs again?

Sorry I can't be more help.
Erick

On Mon, Apr 6, 2015 at 8:38 PM, Shawn Heisey wrote:
> On 4/6/2015 6:40 PM, Erick Erickson wrote:
>> What version are you migrating _from_? 4.9.0? There were some
>> persistence issues at one point, but AFAIK they were fixed by 4.9, I
>> can check if you're on an earlier version...
>
> Effectively there is no previous version.  Whenever I upgrade, I delete
> all the data directories and completely reindex.  When I converted from
> the old solr.xml to core discovery, the server was already on 4.9.1.
>
> Thanks,
> Shawn
>


Re: Problem with new solr.xml format and core swaps

2015-04-06 Thread Shawn Heisey
On 4/6/2015 6:40 PM, Erick Erickson wrote:
> What version are you migrating _from_? 4.9.0? There were some
> persistence issues at one point, but AFAIK they were fixed by 4.9, I
> can check if you're on an earlier version...

Effectively there is no previous version.  Whenever I upgrade, I delete
all the data directories and completely reindex.  When I converted from
the old solr.xml to core discovery, the server was already on 4.9.1.

Thanks,
Shawn



Re: Problem with new solr.xml format and core swaps

2015-04-06 Thread Erick Erickson
Shawn:

What version are you migrating _from_? 4.9.0? There were some
persistence issues at one point, but AFAIK they were fixed by 4.9, I
can check if you're on an earlier version...

Erick

On Sun, Apr 5, 2015 at 2:05 PM, Shawn Heisey wrote:
> I'm having two problems with Solr 4.9.1.  I can't upgrade yet, because
> we are using a third-party plugin component that is not yet explicitly
> qualified for anything newer than 4.9.0.  The point release upgrade
> seemed like a safe bet, because I know that we don't do API changes in
> point releases.  These are transient problems, and do not seem to be
> affecting the index at this time.
>
> Some background info:
>
> Ubuntu 14, Java 8u40 from the webupd8 PPA, Solr 4.9.1.  It is *NOT*
> SolrCloud.
>
> Full rebuilds on my index involve building a new index in cores that I
> have designated "build" cores, then swapping those cores with "live"
> cores.  This always worked flawlessly before I updated to Solr 4.9.1 and
> migrated the config to use core discovery.
>
> root@idxb4:~# cat /index/solr4/cores/sparkinc_0/core.properties
> name=sparkinclive
> dataDir=../../data/sparkinc_0
>
> root@idxb4:~# cat /index/solr4/cores/sparkinc_1/core.properties
> name=sparkincbuild
> dataDir=../../data/sparkinc_1
>
> The first problem:  Sometimes, in a completely unpredictable manner, the
> new solr.xml format seems to behave like using the old format with
> persistent=false.
>
> When I restarted Solr yesterday, that action swapped the live cores with
> the build cores and I lost half my index because it swapped back to the
> previous build cores.  Just now when I tried a restart, everything
> worked flawlessly and the cores did not swap.
>
> The second problem:  Sometimes old index segments do not get deleted,
> even though they are not part of the index.
>
> Another part of the full rebuild process involves clearing the build
> cores before beginning the full import.  The code does a deleteByQuery
> with *:* and then optimizes the core.  Sometimes this action fails to
> delete the old segment files, but when I checked the core Overview in
> the admin UI, numDocs only reflected the newly indexed docs and
> deletedDocs was 0.
>
> It was actually while trying to fix/debug this second problem that I
> discovered the first problem.  Once the rebuild finished, I wanted to
> see what would happen if I restarted Solr while one of my cores had 32GB
> of segment files that were not part of the index ... but that's when the
> indexes swapped.  At that point, I deleted all the dataDirs on both
> machines (it's a distributed index), restarted Solr again, and began a
> full rebuild.  Everything seems to be fine now.
>
> Are either of these problems anything that anyone has seen?  I don't
> recall seeing anything come across the list before.  Are there existing
> issues in Jira?  Is there any information that I can provide which would
> help in narrowing down the problem?
>
> Thanks,
> Shawn
>


Problem with new solr.xml format and core swaps

2015-04-05 Thread Shawn Heisey
I'm having two problems with Solr 4.9.1.  I can't upgrade yet, because
we are using a third-party plugin component that is not yet explicitly
qualified for anything newer than 4.9.0.  The point release upgrade
seemed like a safe bet, because I know that we don't do API changes in
point releases.  These are transient problems, and do not seem to be
affecting the index at this time.

Some background info:

Ubuntu 14, Java 8u40 from the webupd8 PPA, Solr 4.9.1.  It is *NOT*
SolrCloud.

Full rebuilds on my index involve building a new index in cores that I
have designated "build" cores, then swapping those cores with "live"
cores.  This always worked flawlessly before I updated to Solr 4.9.1 and
migrated the config to use core discovery.

root@idxb4:~# cat /index/solr4/cores/sparkinc_0/core.properties
name=sparkinclive
dataDir=../../data/sparkinc_0

root@idxb4:~# cat /index/solr4/cores/sparkinc_1/core.properties
name=sparkincbuild
dataDir=../../data/sparkinc_1

The first problem:  Sometimes, in a completely unpredictable manner, the
new solr.xml format seems to behave like using the old format with
persistent=false.

When I restarted Solr yesterday, that action swapped the live cores with
the build cores and I lost half my index because it swapped back to the
previous build cores.  Just now when I tried a restart, everything
worked flawlessly and the cores did not swap.
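
To tell whether a swap actually reached disk, the name property in the
two core.properties files can be compared before and after the swap.
The sketch below is only an illustration in Python with hypothetical
helper names. It assumes that a persisted swap exchanges the name
values between the two files, so that a restart rediscovers the cores
in their swapped roles.

```python
from pathlib import Path


def core_names(cores_root):
    """Return {core directory name: value of 'name' in core.properties}."""
    names = {}
    for prop_file in sorted(Path(cores_root).glob("*/core.properties")):
        for line in prop_file.read_text().splitlines():
            key, sep, value = line.partition("=")
            if sep and key.strip() == "name":
                names[prop_file.parent.name] = value.strip()
    return names


def swap_status(before, after, dir_a, dir_b):
    """Classify the on-disk result of swapping the cores in dir_a and dir_b."""
    if after[dir_a] == before[dir_b] and after[dir_b] == before[dir_a]:
        return "persisted"
    if after[dir_a] == before[dir_a] and after[dir_b] == before[dir_b]:
        # Nothing changed on disk, so a restart would undo the swap.
        return "not persisted"
    # Only one file changed, for example when one write failed.
    return "partial"
```

Taking one core_names snapshot before the swap and one after, then
passing both to swap_status with "sparkinc_0" and "sparkinc_1", would
flag a swap that exists only in memory before a restart reverts it.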

The second problem:  Sometimes old index segments do not get deleted,
even though they are not part of the index.

Another part of the full rebuild process involves clearing the build
cores before beginning the full import.  The code does a deleteByQuery
with *:* and then optimizes the core.  Sometimes this action fails to
delete the old segment files, but when I checked the core Overview in
the admin UI, numDocs only reflected the newly indexed docs and
deletedDocs was 0.
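
To see how much disk each segment is holding, the files in the index
directory can be grouped by segment name. This is a rough Python sketch
under stated assumptions: it takes per-segment files to be those whose
names start with "_" plus a base-36 counter (for example "_a.fdt" or
"_a_Lucene41_0.doc"), and the function name is hypothetical. It does
not read the segments_N file, so it cannot say which segments are still
part of the index; that has to be determined separately.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: a segment file name is "_" + base-36 counter, followed by
# either "." (extension) or "_" (per-format or generation suffix).
SEGMENT_FILE = re.compile(r"^(_[0-9a-z]+)[._]")


def bytes_per_segment(index_dir):
    """Return {segment name: total size in bytes of its files}."""
    sizes = defaultdict(int)
    for entry in Path(index_dir).iterdir():
        match = SEGMENT_FILE.match(entry.name)
        # Files such as segments_N or write.lock do not match and are skipped.
        if match and entry.is_file():
            sizes[match.group(1)] += entry.stat().st_size
    return dict(sizes)
```

Comparing the result for a build core's index directory before and
after the deleteByQuery plus optimize step would show whether old
segment names are still taking up space.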

It was actually while trying to fix/debug this second problem that I
discovered the first problem.  Once the rebuild finished, I wanted to
see what would happen if I restarted Solr while one of my cores had 32GB
of segment files that were not part of the index ... but that's when the
indexes swapped.  At that point, I deleted all the dataDirs on both
machines (it's a distributed index), restarted Solr again, and began a
full rebuild.  Everything seems to be fine now.

Are either of these problems anything that anyone has seen?  I don't
recall seeing anything come across the list before.  Are there existing
issues in Jira?  Is there any information that I can provide which would
help in narrowing down the problem?

Thanks,
Shawn