[ https://issues.apache.org/jira/browse/SOLR-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774195#comment-17774195 ]
rajanimaski edited comment on SOLR-17018 at 10/11/23 6:48 PM:
--------------------------------------------------------------

OK, I figured out that the cause is lengthy "q" queries with many query terms. The LTR feature tries to compute the tf-idf of such a query and reports "The request took too long to iterate over terms" in the log file. On the heap, the query parser fills it up while trying to parse the query, causing full heap use followed by an OOM.

{noformat}
org.apache.solr.search.ReRankCollector @ 0x699161e60 | 0.00 MB | 70.81 MB | 1.16%
org.apache.lucene.search.BooleanQuery @ 0x699179800 | 0.00 MB | 0.03 MB | 0.00%
java.lang.ThreadLocal$ThreadLocalMap @ 0x69440b3b0 | 0.00 MB | 0.03 MB | 0.00%
org.apache.lucene.search.BooleanQuery @ 0x6b50f73f0 | 0.00 MB | 0.03 MB | 0.00%
org.apache.lucene.queries.function.FunctionScoreQuery @ 0x699167f18 | 0.00 MB | 0.02 MB | 0.00%
com.shutterstock.solr.parser.expression.Expression @ 0x699161c20 | 0.00 MB | 0.01 MB | 0.00%
org.apache.solr.response.SolrQueryResponse @ 0x699162078 | 0.00 MB | 0.00 MB | 0.00%
java.util.ArrayList @ 0x6991882b0 | 0.00 MB | 0.00 MB | 0.00%
org.apache.solr.handler.component.ResponseBuilder @ 0x699161fa8 | 0.00 MB | 0.00 MB | 0.00%
java.lang.String @ 0x6b50fe8d0 {!edismax qf=description.en pf2=description.en}((bang) OR (blast) OR (blow AND up) OR (breakout) OR (burst) OR (explosion) OR (gale) OR (outbreak) OR (outburst) OR (esplosione)) AND ((dust) OR (filing) OR (powder) OR (polvere)) AND ((blanche) OR (white) OR... | 0.00 MB | 0.00 MB | 0.00%
java.lang.String @ 0x6b50fe3b0 ((bang) OR (blast) OR (blow AND up) OR (breakout) OR (burst) OR (explosion) OR (gale) OR (outbreak) OR (outburst) OR (esplosione)) AND ((dust) OR (filing) OR (powder) OR (polvere)) AND ((blanche) OR (white) OR (white AND woman) OR (bianca)) AND ((confine) ... | 0.00 MB | 0.00 MB | 0.00%
java.util.ArrayList @ 0x699161d78 | 0.00 MB | 0.00 MB | 0.00%
org.apache.solr.search.ExtendedDismaxQParser$ExtendedDismaxConfiguration @ 0x699161ca0 | 0.00 MB | 0.00 MB | 0.00%
{noformat}

Example query and feature definition "{!edismax qf=description.en pf2=description.en}":

{noformat}
q=((bang) OR (blast) OR (blow AND up) OR (breakout) OR (burst) OR (explosion) OR (gale) OR (outbreak) OR (outburst) OR (esplosione)) AND ((dust) OR (filing) OR (powder) OR (polvere)) AND ((blanche) OR (white) OR (white AND woman) OR (bianca)) AND ((confine) OR (cut AND off) OR (insular) OR (insulate) OR (insulated) OR (insulating) OR (insulation) OR (insulator) OR (insulators) OR (island) OR (islander) OR (islands) OR (isle) OR (isolable) OR (isolatable) OR (isolate) OR (isolated) OR (isolates) OR (isolation) OR (isolations) OR (isolete) OR (lonely) OR (remote) OR (seal)
{noformat}

Handling such queries at the client level should prevent this from happening again. I am wondering whether it should be handled on the Solr side as well. Such a query literally takes over the full heap, causes high GC activity (filling up the tenured space) and high CPU usage, and does not stop until the query is complete, hence bringing Solr nodes down.

was (Author: rajanimaski):
OK, I figured out that the cause is lengthy "q" queries with many query terms. The LTR feature tries to compute the tf-idf of such a query and reports "The request took too long to iterate over terms" in the log file. On the heap, the query parser fills it up while trying to parse the query, causing full heap use followed by an OOM.
{noformat}
org.apache.solr.search.ReRankCollector @ 0x699161e60 | 0.00 MB | 70.81 MB | 1.16%
org.apache.lucene.search.BooleanQuery @ 0x699179800 | 0.00 MB | 0.03 MB | 0.00%
java.lang.ThreadLocal$ThreadLocalMap @ 0x69440b3b0 | 0.00 MB | 0.03 MB | 0.00%
org.apache.lucene.search.BooleanQuery @ 0x6b50f73f0 | 0.00 MB | 0.03 MB | 0.00%
org.apache.lucene.queries.function.FunctionScoreQuery @ 0x699167f18 | 0.00 MB | 0.02 MB | 0.00%
com.shutterstock.solr.parser.expression.Expression @ 0x699161c20 | 0.00 MB | 0.01 MB | 0.00%
org.apache.solr.response.SolrQueryResponse @ 0x699162078 | 0.00 MB | 0.00 MB | 0.00%
java.util.ArrayList @ 0x6991882b0 | 0.00 MB | 0.00 MB | 0.00%
org.apache.solr.handler.component.ResponseBuilder @ 0x699161fa8 | 0.00 MB | 0.00 MB | 0.00%
java.lang.String @ 0x6b50fe8d0 {!edismax qf=description.en pf2=description.en}((bang) OR (blast) OR (blow AND up) OR (breakout) OR (burst) OR (explosion) OR (gale) OR (outbreak) OR (outburst) OR (esplosione)) AND ((dust) OR (filing) OR (powder) OR (polvere)) AND ((blanche) OR (white) OR... | 0.00 MB | 0.00 MB | 0.00%
java.lang.String @ 0x6b50fe3b0 ((bang) OR (blast) OR (blow AND up) OR (breakout) OR (burst) OR (explosion) OR (gale) OR (outbreak) OR (outburst) OR (esplosione)) AND ((dust) OR (filing) OR (powder) OR (polvere)) AND ((blanche) OR (white) OR (white AND woman) OR (bianca)) AND ((confine) ... | 0.00 MB | 0.00 MB | 0.00%
java.util.ArrayList @ 0x699161d78 | 0.00 MB | 0.00 MB | 0.00%
org.apache.solr.search.ExtendedDismaxQParser$ExtendedDismaxConfiguration @ 0x699161ca0 | 0.00 MB | 0.00 MB | 0.00%
{noformat}

Example query and feature definition "{!edismax qf=description.en pf2=description.en}":

{noformat}
q=((bang) OR (blast) OR (blow AND up) OR (breakout) OR (burst) OR (explosion) OR (gale) OR (outbreak) OR (outburst) OR (esplosione)) AND ((dust) OR (filing) OR (powder) OR (polvere)) AND ((blanche) OR (white) OR (white AND woman) OR (bianca)) AND ((confine) OR (cut AND off) OR (insular) OR (insulate) OR (insulated) OR (insulating) OR (insulation) OR (insulator) OR (insulators) OR (island) OR (islander) OR (islands) OR (isle) OR (isolable) OR (isolatable) OR (isolate) OR (isolated) OR (isolates) OR (isolation) OR (isolations) OR (isolete) OR (lonely) OR (remote) OR (seal)
{noformat}

Handling such queries at the client level should prevent this from happening again. I am wondering whether it should be handled in Solr, as such a query takes over the full heap, causes high GC activity (filling up the tenured space) and high CPU usage, and does not stop until the query is complete, hence bringing Solr down.

> LTR queries producing out of memory issues
> ------------------------------------------
>
>                 Key: SOLR-17018
>                 URL: https://issues.apache.org/jira/browse/SOLR-17018
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - LTR, ltr
>    Affects Versions: 9.1.1
>            Reporter: rajanimaski
>            Priority: Major
>              Labels: ltr
>         Attachments: image-2023-10-11-14-38-43-351.png
>
>
> LTR queries are producing OOM errors, and the issue looks the same as the one
> described in SOLR-5986. I think this happens because the LTR component does
> not implement `timeAllowed`. During peak traffic hours, when 80% of resources
> are in use, a single expensive LTR query that either has many terms or is
> paginated with start>=500 is not terminated, leading to full heap usage,
> a full GC, and an OOM. I do see queries with the ltr param taking more than
> 15 seconds, which seems to be the reason for the OOM issue; it is during this
> interval that the tenured (old generation) heap also fills up.
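
The client-level handling suggested in the comment could look roughly like the sketch below. It is only an illustration under assumptions: the term limit, the helper names `count_query_terms` and `build_params`, and the model name in the usage note are hypothetical, and the idea of dropping the `rq` (LTR rerank) parameter for oversized queries is one possible policy, not something the reporter describes.

```python
import re

# Hypothetical limit; tune it against your own heap and latency measurements.
MAX_QUERY_TERMS = 20

# Boolean operators that should not be counted as search terms.
_OPERATORS = {"AND", "OR", "NOT"}


def count_query_terms(q: str) -> int:
    """Count whitespace/parenthesis-delimited tokens, ignoring boolean operators."""
    tokens = re.findall(r"[^\s()]+", q)
    return sum(1 for token in tokens if token not in _OPERATORS)


def build_params(q: str, rq: str, max_terms: int = MAX_QUERY_TERMS) -> dict:
    """Build Solr request parameters, attaching the LTR rerank query
    only when the main query stays within the term limit."""
    params = {"q": q}
    if count_query_terms(q) <= max_terms:
        params["rq"] = rq
    return params
```

For example, `build_params(q, "{!ltr model=myModel reRankDocs=100}")` (with `myModel` standing in for a real model name) would send the query without reranking once `q` exceeds the limit. A client could equally reject or truncate such queries; which policy is acceptable depends on the application.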