Github user dsmiley commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/422#discussion_r204418038
--- Diff:
solr/core/src/java/org/apache/solr/update/processor/TimeRoutedAliasUpdateProcessor.java
---
@@ -170,55 +179,35 @@ public void processAdd(AddUpdateCommand cmd) throws
IOException {
final Instant routeTimestamp = parseRouteKey(routeValue);
updateParsedCollectionAliases();
- String targetCollection;
- do { // typically we don't loop; it's only when we need to create a
collection
- targetCollection =
findTargetCollectionGivenTimestamp(routeTimestamp);
-
- if (targetCollection == null) {
- throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
- "Doc " + cmd.getPrintableId() + " couldn't be routed with " +
timeRoutedAlias.getRouteField() + "=" + routeTimestamp);
- }
-
- // Note: the following rule is tempting but not necessary and is not
compatible with
- // only using this URP when the alias distrib phase is NONE;
otherwise a doc may be routed to from a non-recent
- // collection to the most recent only to then go there directly
instead of realizing a new collection is needed.
- // // If it's going to some other collection (not "this") then
break to just send it there
- // if (!thisCollection.equals(targetCollection)) {
- // break;
- // }
- // Also tempting but not compatible: check that we're the leader,
if not then break
-
- // If the doc goes to the most recent collection then do some checks
below, otherwise break the loop.
- final Instant mostRecentCollTimestamp =
parsedCollectionsDesc.get(0).getKey();
- final String mostRecentCollName =
parsedCollectionsDesc.get(0).getValue();
- if (!mostRecentCollName.equals(targetCollection)) {
- break;
- }
-
- // Check the doc isn't too far in the future
- final Instant maxFutureTime =
Instant.now().plusMillis(timeRoutedAlias.getMaxFutureMs());
- if (routeTimestamp.isAfter(maxFutureTime)) {
- throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
- "The document's time routed key of " + routeValue + " is too
far in the future given " +
- TimeRoutedAlias.ROUTER_MAX_FUTURE + "=" +
timeRoutedAlias.getMaxFutureMs());
- }
-
- // Create a new collection?
- final Instant nextCollTimestamp =
timeRoutedAlias.computeNextCollTimestamp(mostRecentCollTimestamp);
- if (routeTimestamp.isBefore(nextCollTimestamp)) {
- break; // thus we don't need another collection
- }
-
- createCollectionAfter(mostRecentCollName); // *should* throw if
fails for some reason but...
- final boolean updated = updateParsedCollectionAliases();
- if (!updated) { // thus we didn't make progress...
- // this is not expected, even in known failure cases, but we check
just in case
- throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
- "We need to create a new time routed collection but for
unknown reasons were unable to do so.");
+ String targetCollection =
findCandidateCollectionGivenTimestamp(routeTimestamp, cmd.getPrintableId());
+ try {
+ String preemptiveCreateWindow =
timeRoutedAlias.getPreemptiveCreateWindow();
+ if (StringUtils.isBlank(preemptiveCreateWindow) ||
// default: no pre-create
+ requiresCreateCollection(routeTimestamp, null)) { // or
no collection exists for doc
+ targetCollection = new Maintainer(routeTimestamp,
cmd.getPrintableId()).maintain(targetCollection); // then
do it synchronously.
+ } else if (requiresCreateCollection(routeTimestamp,
preemptiveCreateWindow )) { // else async...
+ String finalTargetCollection = targetCollection;
+ try {
+
asyncMaintainers.computeIfAbsent(timeRoutedAlias.getAliasName(),(alias) ->
+ singleThreadSingleTaskExecutor()).submit(() ->
+ new Maintainer(routeTimestamp,
cmd.getPrintableId()).maintain(finalTargetCollection));
+ } catch (RejectedExecutionException e) {
+ // Note: There are some esoteric cases where the pre-create
interval is a lot longer than
+ // a single time slice in the alias, and a doc that should cause
several creations is blocked by
+ // one that only creates a single collection, but that requires
some sort of long pause followed by a 2 doc
+ // burst and even then it only causes a synch create if there's
a subsequent pause of more than another
+ // full time slice... These sorts of cases are probably more
work than their value. If someone actually
+ // hits those cases frequently enough to notice, we can think
about adding some further logic. For now,
+ // this is complicated enough as it is.
+ log.trace("Collection creation rejected, probably some other
request has" +
--- End diff --
I'm not yet sure what to think of the scenario above, but note the code
that existed before had an outer loop and that helped with edge cases by
retrying.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]