[
https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576762#comment-16576762
]
Christine Poerschke commented on SOLR-12590:
--------------------------------------------
bq. ... Do you have the bandwidth to test this assertion? ...
Hmm, ok, so I've explored reaching the {{// delegate to the class loader
(looking into $INSTANCE_DIR/lib jars)}} code path in
[ZkSolrResourceLoader|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.4.0/solr/core/src/java/org/apache/solr/cloud/ZkSolrResourceLoader.java#L122]
for large learning-to-rank models; here are some notes from that:
* We have a
[ManagedModelStore|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.4.0/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedModelStore.java]
using {{"/schema/model-store"}} as the REST endpoint. In principle, if the
model store itself weren't there (in ZooKeeper) then looking for it
locally might be an option; having said that:
** If there is a (small) model store then perhaps one would wish to keep it,
and any additional (large) model store should be separate.
** {{SolrResourceLoader}} has a {{managedResourceRegistry}} but it's not
immediately obvious from a quick look whether {{ZkSolrResourceLoader}} (or
something else) has an equivalent that would look locally when a managed
resource is not there in ZooKeeper.
* Models use features and we have a
[ManagedFeatureStore|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.4.0/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedFeatureStore.java]
using {{"/schema/feature-store"}} as the REST endpoint.
** If there was a concept of a (small/regular) model store in ZooKeeper and an
(additional/larger) model store locally, then similarly an additional large
feature store locally might be logical.
** In such a hypothetical scenario, could models in the large model store use
features from the small feature store, and vice versa? What if both places have
models with the same name?
** Current code detail: features are conceptually organised into [feature
stores|https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html#feature-stores]
akin to namespaces, but in terms of implementation they are all persisted in
the same place, i.e. {{_schema_feature-store.json}}, matching the
{{"/schema/feature-store"}} upload REST endpoint.
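
To make the same-name question above a bit more concrete, here is a minimal sketch of the hypothetical two-store lookup. It is not Solr code: {{TwoTierStore}}, {{find}} and {{shadowed_names}} are made-up names, plain dicts stand in for the ZooKeeper and local stores, and the "ZooKeeper first, local second" order is only an assumption mirroring the resource loader's fall-through.

```python
from typing import Dict, List, Optional, Tuple


class TwoTierStore:
    """Hypothetical lookup: small store (stand-in for ZooKeeper) first, then a local store."""

    def __init__(self, zk_store: Dict[str, str], local_store: Dict[str, str]) -> None:
        self.zk_store = zk_store        # stand-in for the (small) store in ZooKeeper
        self.local_store = local_store  # stand-in for an (additional/larger) local store

    def find(self, name: str) -> Optional[Tuple[str, str]]:
        """Return (origin, value) for the first store that has the name, else None."""
        if name in self.zk_store:
            return ("zookeeper", self.zk_store[name])
        if name in self.local_store:
            return ("local", self.local_store[name])
        return None

    def shadowed_names(self) -> List[str]:
        """Names present in both stores; the local copy is never returned by find()."""
        return sorted(set(self.zk_store) & set(self.local_store))


store = TwoTierStore(
    zk_store={"smallModel": "zk-small", "sharedName": "zk-copy"},
    local_store={"largeModel": "local-large", "sharedName": "local-copy"},
)
print(store.find("sharedName"))
print(store.shadowed_names())
```

With a plain fall-through like this, a duplicate name silently resolves to the first store, so any real design would need to decide whether that is acceptable or whether duplicates should be rejected at upload time.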
So from this exploration I think the wrapper model concept introduced in
SOLR-11250 is currently the only way to support large models (without changing
ZooKeeper's max file size limit).
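
For illustration, a sketch of what a wrapper model definition might look like. The class name {{org.apache.solr.ltr.model.DefaultWrapperModel}} and the {{resource}} parameter are as I recall them from the learning-to-rank documentation, so please verify them against your Solr version; {{myWrapperModelName}}, {{myModel.json}} and the helper {{wrapper_model_config}} are placeholders made up for this example.

```python
import json


def wrapper_model_config(name: str, resource: str) -> dict:
    """Build a small wrapper model definition that points at a separately stored model file."""
    return {
        # Assumed class name; check it against the Solr LTR documentation for your version.
        "class": "org.apache.solr.ltr.model.DefaultWrapperModel",
        "name": name,
        # Assumed parameter: the resource name of the actual (large) model definition.
        "params": {"resource": resource},
    }


config = wrapper_model_config("myWrapperModelName", "myModel.json")
payload = json.dumps(config, indent=2)
print(payload)
print(len(payload.encode("utf-8")), "bytes")
```

The point of the indirection is that only this small definition would be uploaded to {{"/schema/model-store"}} and persisted in ZooKeeper, while the large model it refers to lives elsewhere.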
> Improve Solr resource loader coverage in the ref guide
> ------------------------------------------------------
>
> Key: SOLR-12590
> URL: https://issues.apache.org/jira/browse/SOLR-12590
> Project: Solr
> Issue Type: Task
> Security Level: Public(Default Security Level. Issues are Public)
> Components: documentation
> Reporter: Steve Rowe
> Assignee: Steve Rowe
> Priority: Major
> Attachments: SOLR-12590.patch
>
>
> In SolrCloud, storing large resources (e.g. binary machine learned models) on
> the local filesystem should be a viable alternative to increasing ZooKeeper's
> max file size limit (1MB), but there are undocumented complications.