That's really good and helpful info, thank you. Perfect.

Best wishes,

Edd

On Mon, 28 Sep 2020, 5:53 pm Shawn Heisey, <apa...@elyograg.org> wrote:

> On 9/28/2020 8:56 AM, Edward Turner wrote:
> > By removing the copyfields, we've found that our index sizes have reduced
> > by ~40% in some cases, which is great! We're just curious now as to
> > exactly how this can be ...
>
> That's not surprising.
>
> > My question is, given the following two schemas, if we index some data to
> > the "description" field, will the index for schema1 be twice as large as
> > the index of schema2? (I guess this relates to how, internally, Solr
> > stores field + index data)
> >
> > Old way -- schema1:
> > =======
> > <field name="description" type="text_general" indexed="true"
> > multiValued="false"/>
> > <field name="default_field" type="text_general" indexed="true"
> > multiValued="false" />
> > <copyField source="description" dest="default_field" />
> >
> > New way -- schema2:
> > =======
> > <field name="description" type="text_general" indexed="true"
> > multiValued="false"/>
>
> If the only field in the indexed documents is "description", the index
> built with schema2 will be half the size of the index built with
> schema1.  Both fields referenced by "copyField" are the same type and
> have the same settings, so they would contain exactly the same data at
> the Lucene level.
>
> Having the same type for a source and destination field is normally only
> useful if multiple sources are copied to a destination, which requires
> multiValued="true" on the destination -- NOT the case in your example.
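>
> As a rough sketch of that multi-source case (the "title" field is
> hypothetical and not part of your schema; the attribute values are
> assumptions), the destination collects several sources and is
> therefore multiValued:
>
> <!-- hypothetical example: two sources copied into one catch-all field -->
> <field name="title" type="text_general" indexed="true"
> multiValued="false"/>
> <field name="description" type="text_general" indexed="true"
> multiValued="false"/>
> <field name="default_field" type="text_general" indexed="true"
> stored="false" multiValued="true"/>
> <copyField source="title" dest="default_field"/>
> <copyField source="description" dest="default_field"/>
>
> Setting stored="false" on the destination is a common choice here,
> because the original values are already available from the source
> fields.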
>
> There is one other use case for a copyField -- using the same data
> differently, with different type values.  For example you might have one
> type for faceting and one for searching.
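>
> A minimal sketch of that pattern, assuming a hypothetical "category"
> field and a "string" fieldType like the one defined in the default
> configsets (neither is from your schema):
>
> <!-- hypothetical example: analyzed field for searching -->
> <field name="category" type="text_general" indexed="true"
> multiValued="false"/>
> <!-- hypothetical example: unanalyzed copy for faceting -->
> <field name="category_facet" type="string" indexed="true"
> stored="false" docValues="true" multiValued="false"/>
> <copyField source="category" dest="category_facet"/>
>
> Here the two fields hold different data at the Lucene level (analyzed
> tokens versus the exact original value), so the copy is not simply a
> duplicate.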
>
> Thanks,
> Shawn
>
