Github user dsmiley commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/433#discussion_r208446107
  
    --- Diff: 
solr/core/src/java/org/apache/solr/update/processor/TimeRoutedAliasUpdateProcessor.java
 ---
    @@ -167,66 +173,121 @@ private String getAliasName() {
       public void processAdd(AddUpdateCommand cmd) throws IOException {
         SolrInputDocument solrInputDocument = cmd.getSolrInputDocument();
         final Object routeValue = 
solrInputDocument.getFieldValue(timeRoutedAlias.getRouteField());
    -    final Instant routeTimestamp = parseRouteKey(routeValue);
    -
    +    final Instant docTimestampToRoute = parseRouteKey(routeValue);
         updateParsedCollectionAliases();
    -    String targetCollection;
    -    do { // typically we don't loop; it's only when we need to create a 
collection
    -      targetCollection = 
findTargetCollectionGivenTimestamp(routeTimestamp);
    +    String candidateCollection = 
findCandidateCollectionGivenTimestamp(docTimestampToRoute, 
cmd.getPrintableId());
    +    String targetCollection = 
createCollectionsIfRequired(docTimestampToRoute, candidateCollection, 
cmd.getPrintableId());
    +    if (thisCollection.equals(targetCollection)) {
    +      // pass on through; we've reached the right collection
    +      super.processAdd(cmd);
    +    } else {
    +      // send to the right collection
    +      SolrCmdDistributor.Node targetLeaderNode = 
routeDocToSlice(targetCollection, solrInputDocument);
    +      cmdDistrib.distribAdd(cmd, 
Collections.singletonList(targetLeaderNode), new 
ModifiableSolrParams(outParamsToLeader));
    +    }
    +  }
     
    -      if (targetCollection == null) {
    -        throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
    -            "Doc " + cmd.getPrintableId() + " couldn't be routed with " + 
timeRoutedAlias.getRouteField() + "=" + routeTimestamp);
    +  private String createCollectionsIfRequired(Instant docTimestamp, String 
targetCollection, String printableId) {
    +    try {
    +      CreationType creationType = requiresCreateCollection(docTimestamp, 
timeRoutedAlias.getPreemptiveCreateWindow());
    +      switch (creationType) {
    +        case SYNCHRONOUS:
    +          // This next line blocks until all collections required by the 
current document have been created
    +          return maintain(targetCollection, docTimestamp, printableId);
    +        case ASYNC_PREEMPTIVE:
    +          // Note: creating an executor is slightly expensive, but only 
likely to happen once per hour/day/week
    +          // (depending on time slice size for the TRA). Executor is used 
to ensure we pick up the MDC logging stuff
    +          // from ExecutorUtil. Even though it is possible that multiple 
requests hit this code in the 1-2 sec that
    +          // it takes to create a collection, it's an established 
anti-pattern to feed data with a very large number
    +          // of client connections. This in mind, we only guard against 
spamming the overseer within a batch of
    +          // updates, intentionally tolerating a low level of redundant 
requests in favor of simpler code. Most
    +          // super-sized installations with many update clients will 
likely be multi-tenant and multiple tenants
    +          // probably don't write to the same alias. As such, we have 
deferred any solution the "many clients causing
    +          // collection creation simultaneously" problem until such time 
as someone actually has that problem in a
    +          // real world use case that isn't just an anti-pattern.
    +          synchronized (execLock) {
    +            if (preemptiveCreationExecutor == null) {
    +              preemptiveCreationExecutor = preemptiveCreationExecutor();
    +            }
    +            preemptiveCreationExecutor.execute(() -> {
    +              maintain(targetCollection, docTimestamp, printableId);
    +              preemptiveCreationExecutor = null;
    +            });
    +            preemptiveCreationExecutor.shutdown(); // shutdown immediately 
to ensure no new requests accepted
    --- End diff --
    
    Hmmm; I think we can shut down in finish() if it's non-null.  Wouldn't we 
want to keep this active for hypothetical more pre-emptive creation during the 
lifespan of this URP instance (however extremely unlikely that is I totally 
grant you).  I think it'd be easier to reason about.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to