Hi Jim

 

Thanks for looking into it for me.

 

I did some more testing. I created a base Solr 7.7.1 database using the
'out of the box' schema.xml and solrconfig.xml and added this item manually
using the Solr Admin tool (Documents/XML):

 

<doc>
<field name="id">ABCD-N1</field>
<field name="title_t">A test</field>
</doc>

 

I then updated it using:

 

<doc>
<field name="id">ABCD-N1</field>
<field name="title_t">A test updated</field>
</doc>

 

The record was updated correctly and the old copy was deleted.
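
For anyone wanting to repeat this test outside the Admin tool, the same
update can be sent to the core's /update handler. The following is only a
minimal sketch: the host and port (localhost:8983) are assumptions, the core
name "uleaf" is taken from the log lines further down, and the helper name
build_add_request is made up for illustration. When posting to /update
directly, the <doc> element has to be wrapped in <add>.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: Solr listens on localhost:8983. The core name "uleaf" comes
# from the log lines in this message.
SOLR_UPDATE_URL = "http://localhost:8983/solr/uleaf/update?commit=true"


def build_add_request(doc_id, title, url=SOLR_UPDATE_URL):
    """Build (but do not send) an XML <add> request for one document."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", name="id").text = doc_id
    ET.SubElement(doc, "field", name="title_t").text = title
    body = ET.tostring(add, encoding="utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "text/xml"}, method="POST"
    )


req = build_add_request("ABCD-N1", "A test updated")
print(req.data.decode("utf-8"))
# To actually send it (requires a running Solr instance):
#     urllib.request.urlopen(req)
```

Re-sending the request with a different title and the same id should replace
the existing document rather than add a second one.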

 

I then 'migrated' it to Solr 8.7.0 and updated the record in the same manner
(using Documents/XML) with this:

 

<doc>
<field name="id">ABCD-N1</field>
<field name="title_t">A test updated again</field>
</doc>

 

It created a new record without deleting the old record:

 

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "q":"*:*",
      "_":"1610703647168"}},
  "response":{"numFound":2,"start":0,"numFoundExact":true,"docs":[
      {
        "id":"ABCD-N1",
        "title_t":"A test updated",
        "_version_":1688944583266795520},
      {
        "id":"ABCD-N1",
        "title_t":"A test updated again",
        "_version_":1688950299184594944}]
  }}

 

It is almost as if the delete of the record in the segment created under
7.7.1 is not recognised.
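
A quick way to spot this kind of duplication in a query response is to count
how often each id occurs. This is just a sketch: the function name
duplicate_ids is made up, and it only sees the documents on the page that was
returned (the rows parameter), not the whole index.

```python
import json
from collections import Counter


def duplicate_ids(response_text):
    """Return {id: count} for ids occurring more than once in a Solr JSON response."""
    docs = json.loads(response_text)["response"]["docs"]
    counts = Counter(doc["id"] for doc in docs)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


# Trimmed copy of the response shown above.
SAMPLE = """
{"response": {"numFound": 2, "start": 0, "docs": [
  {"id": "ABCD-N1", "title_t": "A test updated",
   "_version_": 1688944583266795520},
  {"id": "ABCD-N1", "title_t": "A test updated again",
   "_version_": 1688950299184594944}
]}}
"""

print(duplicate_ids(SAMPLE))  # {'ABCD-N1': 2}
```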

 

When I updated the record again using

 

<doc>
<field name="id">ABCD-N1</field>
<field name="title_t">A test updated again and again</field>
</doc>

 

It updated the newly created record and deleted that record's previous
version, but the copy created under 7.7.1 was still there:

 

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "q":"*:*",
      "_":"1610703647168"}},
  "response":{"numFound":2,"start":0,"numFoundExact":true,"docs":[
      {
        "id":"ABCD-N1",
        "title_t":"A test updated",
        "_version_":1688944583266795520},
      {
        "id":"ABCD-N1",
        "title_t":"A test updated again and again",
        "_version_":1688950897568120832}]
  }}

 

I did further testing by turning on Lucene TRACE logging on my database. The
first update generated:

 

2021-01-15 09:38:30.138 INFO  (qtp1458091526-18) [   x:uleaf]
o.a.s.u.LoggingInfoStream [BD][qtp1458091526-18]: now apply del packet
(org.apache.solr.update.SolrIndexWriter@15e9adf2) to 10 segments,
mergeGen 0

2021-01-15 09:38:30.138 INFO  (qtp1458091526-18) [   x:uleaf]
o.a.s.u.LoggingInfoStream [BD][qtp1458091526-18]: applyTermDeletes took 0.44
msec for 10 segments and 1 del terms; 0 new deletions

 

The second update, on the other hand, generated:

 

2021-01-15 09:44:21.543 INFO  (qtp1458091526-17) [   x:uleaf]
o.a.s.u.LoggingInfoStream [BD][qtp1458091526-17]: now apply del packet
(org.apache.solr.update.SolrIndexWriter@15e9adf2) to 11 segments,
mergeGen 0

2021-01-15 09:44:21.544 INFO  (qtp1458091526-17) [   x:uleaf]
o.a.s.u.LoggingInfoStream [BD][qtp1458091526-17]: applyTermDeletes took 0.29
msec for 11 segments and 1 del terms; 1 new deletions

 

 

The first update reports 0 new deletions and the second reports 1, so it
does not seem to find the document to delete in the old segment.
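
When comparing many updates, the applyTermDeletes lines can be pulled out of
the log with a small script. This is a sketch that assumes the message format
is exactly as in the two excerpts above; the function name term_delete_stats
is made up. The pattern allows any whitespace between words because the lines
are wrapped in this email.

```python
import re

# Matches the applyTermDeletes message as it appears in the excerpts above.
PATTERN = re.compile(
    r"applyTermDeletes\s+took\s+([\d.]+)\s+msec\s+for\s+(\d+)\s+segments"
    r"\s+and\s+(\d+)\s+del\s+terms;\s+(\d+)\s+new\s+deletions"
)


def term_delete_stats(log_text):
    """Return one dict per applyTermDeletes message found in log_text."""
    return [
        {
            "msec": float(m.group(1)),
            "segments": int(m.group(2)),
            "del_terms": int(m.group(3)),
            "new_deletions": int(m.group(4)),
        }
        for m in PATTERN.finditer(log_text)
    ]


SAMPLE = """\
o.a.s.u.LoggingInfoStream [BD][qtp1458091526-18]: applyTermDeletes took 0.44
msec for 10 segments and 1 del terms; 0 new deletions
o.a.s.u.LoggingInfoStream [BD][qtp1458091526-17]: applyTermDeletes took 0.29
msec for 11 segments and 1 del terms; 1 new deletions
"""

for stats in term_delete_stats(SAMPLE):
    print(stats)
```

An update that replaces an existing document would be expected to show at
least one new deletion, so entries with new_deletions equal to 0 are the ones
worth a closer look.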

 

Could this be a bug in Solr?

 

Many thanks

 

Matthew

 

Matthew Flowerday | Consultant | ULEAF

Unisys | 01908 774830 | matthew.flower...@unisys.com

Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes | MK17
8LX

 

 <http://www.unisys.com/> 

 

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY
MATERIAL and is for use only by the intended recipient. If you received this
in error, please contact the sender and delete the e-mail and its
attachments from all devices.

 <http://www.linkedin.com/company/unisys>    <http://twitter.com/unisyscorp>
<http://www.youtube.com/theunisyschannel>
<http://www.facebook.com/unisyscorp>  <https://vimeo.com/unisys>
<http://blogs.unisys.com/> 

 

From: Dyer, Jim <james.d...@ingramcontent.com> 
Sent: 13 January 2021 18:21
To: solr-user@lucene.apache.org
Subject: RE: Query over migrating a solr database from 7.7.1 to 8.7.0

 


I think if you already have _root_ in schema.xml you should look elsewhere.
As I recall, merely adding this one line to schema.xml took care of our
problem.

 

From: Flowerday, Matthew J <matthew.flower...@gb.unisys.com>
Sent: Tuesday, January 12, 2021 3:23 AM
To: solr-user@lucene.apache.org
Subject: RE: Query over migrating a solr database from 7.7.1 to 8.7.0

 

Hi Jim

 

Thanks for getting back to me.

 

I checked the schema.xml that we are using and it has the line you
mentioned:

 

    <field name="_root_" type="string" indexed="true" stored="false"
docValues="false" />

 

This is the only reference to _root_ in the schema.xml (apart from one
within a comment). Does your schema.xml have further references to _root_
that I might need? I also checked the solrconfig.xml file for any references
to _root_ and there are none.

 

Many Thanks

 

Matthew

 


 

From: Dyer, Jim <james.d...@ingramcontent.com>
Sent: 11 January 2021 22:58
To: solr-user@lucene.apache.org
Subject: RE: Query over migrating a solr database from 7.7.1 to 8.7.0

 


When we upgraded from 7.x to 8.x, I ran into an issue similar to yours:
when updating an existing document in the index, the document would be
duplicated instead of replaced as expected.  The solution was to add a
"_root_" field to schema.xml like this:

 

<field name="_root_" type="string" indexed="true" stored="false"
docValues="false" />

 

It appeared that when a feature was added for nested documents, this field
somehow became mandatory in order for updates to work properly, at least in
some cases.
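
A schema can be checked for this field (and for the uniqueKey) with a few
lines of Python. This is only a sketch: the function name schema_summary is
made up, and the sample schema below is a cut-down illustration rather than
anyone's real schema.xml.

```python
import xml.etree.ElementTree as ET


def schema_summary(schema_xml):
    """Report the uniqueKey and whether a _root_ field is declared."""
    root = ET.fromstring(schema_xml)
    unique_key = root.findtext(".//uniqueKey")
    root_field = root.find(".//field[@name='_root_']")
    return {
        "uniqueKey": unique_key.strip() if unique_key else None,
        "has_root_field": root_field is not None,
        "root_field_attrs": dict(root_field.attrib) if root_field is not None else {},
    }


# Illustrative, cut-down schema; a real schema.xml has many more entries.
SAMPLE_SCHEMA = """
<schema name="example" version="1.6">
  <field name="id" type="string" indexed="true" stored="true"
         required="true" multiValued="false" />
  <field name="_root_" type="string" indexed="true" stored="false"
         docValues="false" />
  <uniqueKey>id</uniqueKey>
</schema>
"""

print(schema_summary(SAMPLE_SCHEMA))
```

To check a file on disk, pass the text of the core's conf/schema.xml to the
same function.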

 

From: Flowerday, Matthew J <matthew.flower...@gb.unisys.com>
Sent: Saturday, January 9, 2021 4:44 AM
To: solr-user@lucene.apache.org
Subject: RE: Query over migrating a solr database from 7.7.1 to 8.7.0

 

Hi There

 

As a test I stopped Solr and ran the IndexUpgrader tool on the database to
see if this might fix the issue. It completed OK but unfortunately the issue
still occurs - a new version of the record on solr is created rather than
updating the original record.

 

It looks to me as if the record created under 7.7.1 is somehow not being
'marked as deleted' in the way that records created under 8.7.0 are. Is
there a way for these records to be marked as deleted when they are updated?

 

Many Thanks

 

Matthew

 

 


 

From: Flowerday, Matthew J <matthew.flower...@gb.unisys.com>
Sent: 07 January 2021 12:25
To: solr-user@lucene.apache.org
Subject: Query over migrating a solr database from 7.7.1 to 8.7.0

 

Hi There

 

I have recently upgraded a Solr database from 7.7.1 to 8.7.0 without wiping
the database and re-indexing (as this would take too long to run on site).

 

On my local Windows machine I have a single-server Solr 7.7.1 installation.

 

I upgraded in the following manner

 

*       Installed Solr 8.7.0 for Windows on my machine in a different folder
*       Copied the core-related folder (holding conf, data, lib,
core.properties) from 7.7.1 to the new 8.7.0 folder
*       Started Solr 8.7.0
*       Checked that queries work through the Solr Admin tool and our
application
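
The copy step in the list above could be scripted roughly as follows. This
is a sketch only: the function name copy_core and the example paths are
hypothetical, and it assumes both Solr instances are stopped while the folder
is copied.

```python
import shutil
from pathlib import Path


def copy_core(old_cores_dir, new_cores_dir, core_name):
    """Copy one core folder (conf, data, lib, core.properties) to a new install."""
    src = Path(old_cores_dir) / core_name
    dst = Path(new_cores_dir) / core_name
    if not (src / "core.properties").is_file():
        raise FileNotFoundError(f"no core.properties under {src}")
    # copytree raises FileExistsError if dst already exists, so an existing
    # core in the new installation is never overwritten.
    shutil.copytree(src, dst)
    return dst


# Example call with hypothetical Windows paths:
#     copy_core(r"C:\solr-7.7.1\server\solr", r"C:\solr-8.7.0\server\solr", "uleaf")
```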

 

This all worked fine until I tried to update a record which had been created
under 7.7.1. Instead of marking the old record as deleted, it effectively
created a new copy of the record containing the change and left the old copy
still visible. When I updated the record again, it correctly updated the new
8.7.0 version without leaving the old copy behind. If I created a new record
and then updated it, the Solr record would be updated correctly. The issue
only seemed to affect the records created under 7.7.1.

 

An example of the duplication is as follows (the first record is the version
created under 7.7.1 and the second record is the 8.7.0 version after
carrying out an update):

 

{
  "responseHeader":{
    "status":0,
    "QTime":4,
    "params":{
      "q":"id:9901020319M01-N26",
      "_":"1610016003669"}},
  "response":{"numFound":2,"start":0,"numFoundExact":true,"docs":[
      {
        "id":"9901020319M01-N26",
        "groupId":"9901020319M01",
        "urn":"N26",
        "specification":"nominal",
        "owningGroupId":"9901020319M01",
        "description":"N26, Yates, Mike, Alan, Richard, MALE",
        "group_t":"9901020319M01",
        "nominalUrn_t":"N26",
        "dateTimeCreated_dtr":"2020-12-30T12:00:53Z",
        "dateTimeCreated_dt":"2020-12-30T12:00:53Z",
        "title_t":"Captain",
        "surname_t":"Yates",
        "qualifier_t":"Voyager",
        "forename1_t":"Mike",
        "forename2_t":"Alan",
        "forename3_t":"Richard",
        "sex_t":"MALE",
        "orderedType_t":"Nominal",
        "_version_":1687507566832123904},
      {
        "id":"9901020319M01-N26",
        "groupId":"9901020319M01",
        "urn":"N26",
        "specification":"nominal",
        "owningGroupId":"9901020319M01",
        "description":"N26, Yates, Mike, Alan, Richard, MALE",
        "group_t":"9901020319M01",
        "nominalUrn_t":"N26",
        "dateTimeCreated_dtr":"2020-12-30T12:00:53Z",
        "dateTimeCreated_dt":"2020-12-30T12:00:53Z",
        "title_t":"Captain",
        "surname_t":"Yates",
        "qualifier_t":"Voyager enterprise defiant yorktown xx yy",
        "forename1_t":"Mike",
        "forename2_t":"Alan",
        "forename3_t":"Richard",
        "sex_t":"MALE",
        "orderedType_t":"Nominal",
        "_version_":1688224966566215680}]
  }}

 

I checked the schema.xml file and it does have a uniqueKey set up:

 

              <field name="id" type="string" indexed="true" stored="true"
required="true" multiValued="false" />

                  

<uniqueKey>id</uniqueKey>

 

I was wondering if this behaviour is expected and if there is a way to make
sure that records created under a previous version are updated correctly (so
that the old data is deleted when updated).

 

Also, am I upgrading Solr correctly? It could be that the way I have
upgraded is causing this issue (I tried hunting through the Solr
documentation online but struggled to find Windows upgrade notes, and I
worked out the above steps by trial and error).

 

Many thanks

 

Matthew

 


 
