Bukhtawar opened a new issue, #12023: URL: https://github.com/apache/lucene/issues/12023
### Description

As a part of https://github.com/opensearch-project/OpenSearch/issues/687 we detected that regex queries can spin in tight loops for a long time. Below is the stack trace of a request for a wildcard query that consumed 100% CPU for an hour (although this was addressed in https://issues.apache.org/jira/browse/LUCENE-9981).

OpenSearch has a mechanism to cancel a task on timeout. Another option we were deliberating is to send interrupts to the long-running thread if the request is consuming too much memory, CPU, or time, independent of the query timeout.

The problem is that not all costly executions in Lucene check for interrupts the way `ExitableDirectoryReader#checkAndThrow` does:

```
private void checkAndThrow() {
  if (queryTimeout.shouldExit()) {
    // The configured query timeout has elapsed.
    throw new ExitingReaderException(
        "The request took too long to iterate over point values. Timeout: "
            + queryTimeout.toString()
            + ", PointValues="
            + in);
  } else if (Thread.interrupted()) {
    // The thread was interrupted; note that Thread.interrupted() clears the flag.
    throw new ExitingReaderException(
        "Interrupted while iterating over point values. PointValues=" + in);
  }
}
```

Sharing the stack trace for the request:

```
100.2% (500.8ms out of 500ms) cpu usage by thread 'opensearch[917451917ca5731579187db45dd52853][search][T#5]'
  4/10 snapshots sharing following 36 elements
    app//org.apache.lucene.util.automaton.Operations.determinize(Operations.java:780)
    app//org.apache.lucene.util.automaton.Operations.getCommonSuffixBytesRef(Operations.java:1155)
    app//org.apache.lucene.util.automaton.CompiledAutomaton.<init>(CompiledAutomaton.java:245)
    app//org.apache.lucene.search.AutomatonQuery.<init>(AutomatonQuery.java:110)
    app//org.apache.lucene.search.AutomatonQuery.<init>(AutomatonQuery.java:87)
    app//org.apache.lucene.search.AutomatonQuery.<init>(AutomatonQuery.java:71)
    app//org.apache.lucene.search.WildcardQuery.<init>(WildcardQuery.java:56)
    app//org.opensearch.index.mapper.StringFieldType.wildcardQuery(StringFieldType.java:158)
    app//org.opensearch.index.query.WildcardQueryBuilder.doToQuery(WildcardQueryBuilder.java:259)
    app//org.opensearch.index.query.AbstractQueryBuilder.toQuery(AbstractQueryBuilder.java:116)
    app//org.opensearch.index.query.BoolQueryBuilder.addBooleanClauses(BoolQueryBuilder.java:337)
    app//org.opensearch.index.query.BoolQueryBuilder.doToQuery(BoolQueryBuilder.java:321)
    app//org.opensearch.index.query.AbstractQueryBuilder.toQuery(AbstractQueryBuilder.java:116)
    app//org.opensearch.index.query.QueryShardContext.lambda$toQuery$3(QueryShardContext.java:386)
    app//org.opensearch.index.query.QueryShardContext$$Lambda$5010/0x0000000801d4d840.apply(Unknown Source)
    app//org.opensearch.index.query.QueryShardContext.toQuery(QueryShardContext.java:398)
    app//org.opensearch.index.query.QueryShardContext.toQuery(QueryShardContext.java:385)
    app//org.opensearch.search.SearchService.parseSource(SearchService.java:903)
    app//org.opensearch.search.SearchService.createContext(SearchService.java:740)
    app//org.opensearch.search.SearchService.executeQueryPhase(SearchService.java:442)
    app//org.opensearch.search.SearchService.access$500(SearchService.java:155)
    app//org.opensearch.search.SearchService$2.lambda$onResponse$0(SearchService.java:415)
    app//org.opensearch.search.SearchService$2$$Lambda$4889/0x0000000801ce9840.get(Unknown Source)
    app//org.opensearch.search.SearchService$$Lambda$4891/0x0000000801ce9c40.get(Unknown Source)
    app//org.opensearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:71)
    app//org.opensearch.action.ActionRunnable$$Lambda$3741/0x00000008011fa440.accept(Unknown Source)
    app//org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:86)
    app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
    app//org.opensearch.threadpool.TaskAwareRunnable.doRun(TaskAwareRunnable.java:78)
    app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
    app//org.opensearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:57)
    app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:774)
    app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
    java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    java.base@11.0.16/java.lang.Thread.run(Thread.java:829)
```

Wanted to gather thoughts on exposing query cancellation controls, such as interrupts, in Lucene for some of the expensive queries.
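
To make the idea concrete, here is a minimal sketch of the kind of cooperative interrupt check being discussed, applied to a generic CPU-bound loop. It is not Lucene code: `InterruptibleLoop`, `WorkInterruptedException`, `run`, and the `checkInterval` parameter are hypothetical names used only for illustration, and the loop body is a placeholder for real work such as automaton construction. The only JDK behavior it relies on is that `Thread.interrupted()` returns the current thread's interrupt status and clears it.

```java
final class InterruptibleLoop {

  /** Hypothetical unchecked exception, playing the role ExitingReaderException plays above. */
  static final class WorkInterruptedException extends RuntimeException {
    WorkInterruptedException(String message) {
      super(message);
    }
  }

  /**
   * Runs a placeholder CPU-bound loop of totalSteps iterations and polls the
   * interrupt flag once every checkInterval iterations (including iteration 0),
   * so the cost of the check is amortized over many steps.
   */
  static long run(long totalSteps, int checkInterval) {
    long acc = 0;
    for (long step = 0; step < totalSteps; step++) {
      // Thread.interrupted() reads and clears the flag, like checkAndThrow does.
      if (step % checkInterval == 0 && Thread.interrupted()) {
        throw new WorkInterruptedException("Interrupted after " + step + " steps");
      }
      acc += step ^ (acc >>> 3); // placeholder for the real expensive work
    }
    return acc;
  }

  public static void main(String[] args) throws InterruptedException {
    Thread worker = new Thread(() -> {
      try {
        run(Long.MAX_VALUE, 1 << 16); // effectively unbounded work
      } catch (WorkInterruptedException e) {
        System.out.println("worker stopped: " + e.getMessage());
      }
    });
    worker.start();
    Thread.sleep(100);  // let the worker spin for a moment
    worker.interrupt(); // what a cancellation framework would do
    worker.join();
  }
}
```

Polling only every `checkInterval` iterations is one possible trade-off between cancellation latency and per-iteration overhead; whether something like this is acceptable inside Lucene's hot loops is exactly the question raised here.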