Okay, thanks for the tip. I am pretty wary of streaming logs into my main set of documents + tons of $stat_updated_at fields + resetting stats on ~every document every day + whatever else we feel like trending. It just feels like a lot of churn.
I will lean towards the !join on stats-$DATE probably.

On Tue, Sep 1, 2020 at 11:32 AM Erick Erickson <erickerick...@gmail.com> wrote:
>
> I wouldn’t use ExternalFileField if your use-case is served by in-place
> updates. See
>
> https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html#in-place-updates
>
> EFFs were put in in order to have _some_ capability to change individual
> fields in a doc
> long before in-place updates were around and long before SolrCloud. Using EFF
> in any
> kind of sharded system will cause you significant heartburn in terms of
> keeping the
> file up to date on all replicas.
>
> Best,
> Erick
>
> > On Sep 1, 2020, at 11:21 AM, matthew sporleder <msporle...@gmail.com> wrote:
> >
> > We are researching the canonical use case for external fields --
> > traffic-based rankings
> >
> > What are the practical limits on the size of the external field file?
> > A k=v text file seems like it might fall over if it grows into the GB
> > range?
> >
> > Our other thought is to use rolling cores where we stream in web logs
> > and use !join queries.
> >
> > Does anyone have practical experience with this that they might want to
> > share?
> >
> > Thanks,
> > Matt
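
For anyone reading this thread later, the !join against a per-day stats core mentioned above might look roughly like the sketch below. It only builds the request parameters. The field names (`doc_id`, `id`, `hits`), the threshold, and the helper name are assumptions for illustration, not something taken from our schema; only the `{!join from=... to=... fromIndex=...}` local-params syntax is Solr's own.

```python
from urllib.parse import urlencode


def build_join_params(stats_core, min_hits, user_query="*:*"):
    """Build Solr query params that keep only main-core docs whose
    matching row in the stats core has at least `min_hits` hits."""
    # Hypothetical schema: stats docs carry `doc_id` (pointing at the
    # main core's `id`) and a numeric `hits` counter.
    join_filter = "{!join from=doc_id to=id fromIndex=%s}hits:[%d TO *]" % (
        stats_core,
        min_hits,
    )
    return {"q": user_query, "fq": join_filter}


# Example: filter against the rolling core for one day.
params = build_join_params("stats-2020-09-01", 100)
print(urlencode(params))
```

Note that a join written this way only filters; it does not by itself feed the stats value into ranking. A cross-core join also needs the stats core to be reachable on the same node as the core being queried.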