There may also be a way to drop a bunch of fields on intake by crafting a
custom update request processor chain in solrconfig.xml.
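A minimal sketch of what such a chain might look like, assuming the stock
IgnoreFieldUpdateProcessorFactory is used. The chain name
"drop-unknown-fields" is made up for illustration, and whether the updates
sent by the REINDEXCOLLECTION daemon actually pass through the target's
default chain is an assumption to verify on a test collection first. With no
parameters, this processor removes any incoming field that matches neither a
field nor a dynamicField in the schema:

```xml
<!-- solrconfig.xml of the target collection (hypothetical chain name) -->
<updateRequestProcessorChain name="drop-unknown-fields" default="true">
  <!-- No parameters: silently drop fields that are not in the schema -->
  <processor class="solr.IgnoreFieldUpdateProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- Must stay last: this is what actually writes the document -->
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

To drop only specific fields rather than everything unknown, the processor
also accepts selectors such as
<str name="fieldName">OEM_ENGINE_MARKETING</str> or a fieldRegex.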
Another option is to temporarily declare them with stored=false,
indexed=false in the target schema.
As long as nothing actually ends up in Lucene segments, you can change
the schema definition later.
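A sketch of that second option, assuming the target uses a classic schema
file and the stock solr.StrField class. The type name "ignored" mirrors the
one in Solr's sample configsets; docValues is switched off explicitly as a
precaution, since a field with docValues would still write data to the index:

```xml
<!-- Target schema: accept the field but keep nothing from it -->
<fieldType name="ignored" class="solr.StrField"
           indexed="false" stored="false" docValues="false" multiValued="true"/>

<!-- One declaration per removed field, e.g. the one from the quoted error -->
<field name="OEM_ENGINE_MARKETING" type="ignored"/>

<!-- Or, more bluntly, swallow every field the schema does not know -->
<dynamicField name="*" type="ignored"/>
```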
Regards,
Alex
On Mon., Mar. 22, 2021, 12:25 p.m. Karl Stoney,
<[email protected]> wrote:
> Interestingly enough, the next issue we hit is that an `fl` listing 750
> fields makes the request too big, so we switched to POST with
> application/x-www-form-urlencoded for REINDEXCOLLECTION. That request is
> accepted, but the reindex then fails silently (no logs in Solr, it just
> doesn't work).
> I can only assume this is because behind the scenes the daemon is using
> GET and hitting the same URL limit (but the error is being swallowed).
>
> Significantly increasing the maximum HTTP header length in Jetty resolved
> the issue, so this feels like a bit of a bug?
>
>
> On 22/03/2021, 15:37, "Karl Stoney" <[email protected]>
> wrote:
>
> So for context, we have about 900 fields on collection1 and have removed
> some 250 fields from the schema, and we want to reindex into collection2.
> We're trying to have a process where we can easily remove fields and
> reindex without too much coding overhead. Therefore, we were simply using
> the default `fl=*` from the first collection, on the assumption that
> when we try to save the document to collection2, the additional fields
> would just be ignored. This wasn't the case.
>
> To work around this, we now read the schema.xml for collection2 and
> build up an `fl` to pass to REINDEXCOLLECTION that only includes the
> fields in collection2's schema. This works.
>
>
> On 22/03/2021, 13:20, "David Hastings" <[email protected]>
> wrote:
>
> >Surely this field should simply just be ignored?
>
> why would solr ignore this field if you're trying to index to it? can't
> you change your indexer to remove these fields as well? solr will try to
> do what it's told, and if it's told to do something bad it will simply
> fail, you don't want it to ignore errors or bad indexing
>
> On Mon, Mar 22, 2021 at 9:15 AM David Smiley <[email protected]>
> wrote:
>
> >
> >
> > https://solr.apache.org/guide/8_8/collection-management.html#reindexcollection
> >
> > See the "fl" param
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> >
> > http://www.linkedin.com/in/davidwsmiley
> >
> >
> > On Mon, Mar 22, 2021 at 9:01 AM Karl Stoney
> > <[email protected]> wrote:
> >
> > > Hi,
> > > Sorry for all the questions recently…
> > >
> > > So as per https://solr.apache.org/guide/8_0/reindexing.html we’re trying
> > > to remove a load of fields. Subsequently we’ve created a new collection
> > > with the new schema and we’re attempting to reindex from old to new.
> > > There’s about 216 fields in total being removed…
> > >
> > > The REINDEX fails though, because the field has been removed:
> > >
> > > 2:52:31.508 [DaemonStream-at-uk-003-12889-thread-1-processing-n:solr-0.search-solr.svc.cluster.local:80_solr x:at-uk-002_shard1_replica_n1 c:at-uk-002 s:shard1 r:core_node2] ERROR org.apache.solr.client.solrj.io.stream.DaemonStream - Fatal Error in DaemonStream: at-uk-003
> > > org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr-0.search-solr.svc.cluster.local/solr/at-uk-003: ERROR: [doc=PVc175f0f12f8f43789a7c1863e10229cd] unknown field 'OEM_ENGINE_MARKETING'
> > >
> > > Surely this field should simply just be ignored?
> > >
> > > I can’t see any way of using REINDEXCOLLECTION to reindex data from one
> > > collection to another where we have removed fields.
> > >
> > > Any input would be appreciated.
> > > Unless expressly stated otherwise in this email, this e-mail is sent on
> > > behalf of Auto Trader Limited Registered Office: 1 Tony Wilson Place,
> > > Manchester, Lancashire, M15 4FN (Registered in England No. 03909628). Auto
> > > Trader Limited is part of the Auto Trader Group Plc group. This email and
> > > any files transmitted with it are confidential and may be legally
> > > privileged, and intended solely for the use of the individual or entity to
> > > whom they are addressed. If you have received this email in error please
> > > notify the sender. This email message has been swept for the presence of
> > > computer viruses.
> > >
> >
>