Thank you all for your feedback.  It is very helpful.

@Walter, out of the 1000 fields in Solr's schema, only 5 are set as
"required" fields, and the Solr doc that I create and send to Solr for
indexing contains only those fields that have data to be indexed.  So some
docs will have 10 fields, some 50, etc.
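
For anyone finding this thread later, here is a minimal sketch of that idea in
plain Java, without the SolrJ dependency so it runs on its own.  The class
name, the "CatchAll" field name and the sample values are only illustrative
assumptions.  It keeps the populated columns of a DB row and collects their
values into one multi-valued catch-all entry:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CatchAllDocBuilder {
    // Hypothetical catch-all field name, borrowed from the copyField example in this thread.
    static final String CATCH_ALL = "CatchAll";

    /**
     * Keeps only the populated columns of a DB row and gathers their values
     * into a single multi-valued catch-all entry.
     */
    static Map<String, Object> build(Map<String, String> row) {
        Map<String, Object> doc = new LinkedHashMap<>();
        List<String> catchAll = new ArrayList<>();
        for (Map.Entry<String, String> e : row.entrySet()) {
            String value = e.getValue();
            if (value == null || value.isEmpty()) {
                continue; // no data for this field, so it is never sent
            }
            doc.put(e.getKey(), value);
            catchAll.add(value);
        }
        if (!catchAll.isEmpty()) {
            doc.put(CATCH_ALL, catchAll); // one field, many values
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("id", "42");
        row.put("title", "Solr in Action");
        row.put("notes", null); // unpopulated column, skipped
        System.out.println(build(row));
    }
}
```

With SolrJ, each entry of the returned map would be passed to
SolrInputDocument.addField(name, value); the List<String> value is what makes
the catch-all field multi-valued, as I found earlier in this thread.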

Steven

On Thu, Sep 17, 2020 at 1:55 PM Erick Erickson <erickerick...@gmail.com>
wrote:

> The script can actually be written in any number of scripting languages
> (Python, Groovy, JavaScript, etc.), but Alexandre’s comments about JavaScript
> are well taken.
>
> It all depends on whether you ever want to search the fields individually.
> If you do, you need to have them in your index as well as the copyField.
>
> > On Sep 17, 2020, at 1:37 PM, Walter Underwood <wun...@wunderwood.org>
> wrote:
> >
> > If you want to ignore a field being sent to Solr, you can set
> indexed=false and
> > stored=false for that field in schema.xml. It will take up room in
> schema.xml but
> > zero room on disk.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> >
> >> On Sep 17, 2020, at 10:23 AM, Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
> >>
> >> Solr has a whole pipeline that you can run during document ingesting
> before
> >> the actual indexing happens. It is called Update Request Processor (URP)
> >> and is defined in solrconfig.xml or in an override file. Obviously,
> since
> >> you are indexing from SolrJ client, you have even more flexibility, but
> it
> >> is good to know about anyway.
> >>
> >> You can read all about it at:
> >> https://lucene.apache.org/solr/guide/8_6/update-request-processors.html
> and
> >> see the extensive list of processors you can leverage. The specific
> >> mentioned one is this one:
> >>
> https://lucene.apache.org/solr/8_6_0//solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html
> >>
> >> Just a word of warning that the Stateless Script URP is usually written in
> >> JavaScript, which is getting a bit complicated as the underlying JVM is
> >> upgraded (the Nashorn JavaScript engine was deprecated in JDK 11 and
> >> removed in JDK 15). So if one of the simpler URPs, or a chain of them,
> >> will do the job, that may be a better path to take.
> >>
> >> Regards,
> >>  Alex.
> >>
> >>
> >> On Thu, 17 Sep 2020 at 13:13, Steven White <swhite4...@gmail.com>
> wrote:
> >>
> >>> Thanks Erick.  Where can I learn more about the "stateless script update
> >>> processor factory"?  I don't know what you mean by this.
> >>>
> >>> Steven
> >>>
> >>> On Thu, Sep 17, 2020 at 1:08 PM Erick Erickson <
> erickerick...@gmail.com>
> >>> wrote:
> >>>
> >>>> 1000 fields is fine, you'll waste some cycles on bookkeeping, but I
> >>> really
> >>>> doubt you'll notice. That said, are these fields used for searching?
> >>>> Because you do have control over what goes into the index if you can
> put
> >>> a
> >>>> "stateless script update processor factory" in your update chain.
> There
> >>> you
> >>>> can do whatever you want, including combine all the fields into one
> and
> >>>> delete the original fields. There's no point in having your index
> >>> cluttered
> >>>> with unused fields, OTOH, it may not be worth the effort just to
> satisfy
> >>> my
> >>>> sense of aesthetics 😉
> >>>>
> >>>> On Thu, Sep 17, 2020, 12:59 Steven White <swhite4...@gmail.com>
> wrote:
> >>>>
> >>>>> Hi Erick,
> >>>>>
> >>>>> Yes, this is coming from a DB.  Unfortunately I have no control over the
> >>>>> list of fields.  Out of the 1000 fields that there may be, no document
> >>>>> that gets indexed into Solr will use more than about 50, and since I'm
> >>>>> copying the values of those fields to the catch-all field and the
> >>>>> catch-all field is my default search field, I don't expect any problem
> >>>>> from having 1000 fields in Solr's schema, or should I?
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>> Steven
> >>>>>
> >>>>>
> >>>>> On Thu, Sep 17, 2020 at 8:23 AM Erick Erickson <
> >>> erickerick...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> “there will be over 1000 of them [fields]”
> >>>>>>
> >>>>>> This is often a red flag in my experience. Solr will handle that
> many
> >>>>>> fields, I’ve seen many more. But this is often a result of
> >>>>>> “database thinking”, i.e. your mental model of all this data comes
> >>>>>> from a DB perspective rather than a search perspective.
> >>>>>>
> >>>>>> It’s unwieldy to have that many fields. Obviously I don’t know the
> >>>>>> particulars of your app, and maybe that’s the best design. But
> >>>>>> particularly if many of the fields are sparsely populated, i.e. only a
> >>>>>> small percentage of the documents in your corpus have any value for
> >>>>>> that field, taking a step back and looking at the design might save
> >>>>>> you some grief down the line.
> >>>>>>
> >>>>>> For instance, I’ve seen designs where instead of
> >>>>>> field1:some_value
> >>>>>> field2:other_value….
> >>>>>>
> >>>>>> you use a single field with _tokens_ like:
> >>>>>> field:field1_some_value
> >>>>>> field:field2_other_value
> >>>>>>
> >>>>>> That reduces the complexity and can improve performance.
> >>>>>>
> >>>>>> Anyway, just a thought you might want to consider.
> >>>>>>
> >>>>>> Best,
> >>>>>> Erick
> >>>>>>
> >>>>>>> On Sep 16, 2020, at 9:31 PM, Steven White <swhite4...@gmail.com>
> >>>>> wrote:
> >>>>>>>
> >>>>>>> Hi everyone,
> >>>>>>>
> >>>>>>> I figured it out.  It is as simple as creating a List<String> and
> >>>>>>> passing it as the value to SolrInputDocument.addField().
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Steven
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Sep 16, 2020 at 9:13 PM Steven White <swhite4...@gmail.com
> >>>>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi everyone,
> >>>>>>>>
> >>>>>>>> I want to avoid creating a <copyField dest="CatchAll"
> >>>>>>>> source="OneFieldOfMany"/> in my schema (there will be over 1000 of
> >>>>> them
> >>>>>> and
> >>>>>>>> maybe more so managing it will be a pain).  Instead, I want to use
> >>>>> SolrJ
> >>>>>>>> API to do what <copyField/> does.  Any example of how I can do
> >>> this?
> >>>>> If
> >>>>>>>> there is an example online, that would be great.
> >>>>>>>>
> >>>>>>>> Thanks in advance.
> >>>>>>>>
> >>>>>>>> Steven
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >
>
>
