[jira] [Comment Edited] (SOLR-14201) some SolrCore are not released after being removed
[ https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020732#comment-17020732 ]

Vinh Le edited comment on SOLR-14201 at 1/28/20 8:03 AM:
-

Thanks [~cpoerschke]. To reproduce this issue, just keep creating new collections:
{code:bash}
while true; do ./import.sh; sleep 10; done
{code}
{code:bash}
#!/bin/bash -e
# import.sh
HOST=http://localhost:8983/solr

PREV_COLLECTION=$(http "$HOST/admin/collections?action=LISTALIASES" | jq -r ".aliases.SGFAS")
COLLECTION="next_$(gdate +%H%M%S)"

echo "Create new collection = $COLLECTION"
http POST "$HOST/admin/collections?action=CREATE&name=$COLLECTION&collection.configName=seafas&numShards=1"

echo "Push data to new collection"
cat docs.xml | http POST "$HOST/$COLLECTION/update?commitWithin=1000&overwrite=true&wt=json" "Content-Type: text/xml"

echo "Optimize"
http "$HOST/$COLLECTION/update?optimize=true&maxSegments=1&waitSearcher=false"

echo "Update alias"
http "$HOST/admin/collections?action=CREATEALIAS&collections=$COLLECTION&name=SGFAS"

echo "Delete previous collection = $PREV_COLLECTION"
http "$HOST/admin/collections?action=DELETE&name=$PREV_COLLECTION"
{code}
I also tried to remove all plugins, but the issue still persists. classes.loaded keeps increasing:
{code}
❯ http "http://localhost:8983/solr/admin/metrics" | jq '.metrics."solr.jvm"."classes.loaded"'
8428
❯ http "http://localhost:8983/solr/admin/metrics" | jq '.metrics."solr.jvm"."classes.loaded"'
9323
{code}
!image-2020-01-22-10-39-15-301.png|width=759,height=606!
!image-2020-01-22-10-42-17-511.png!
!image-2020-01-22-12-28-46-241.png!
And VisualVM graphs:
!image-2020-01-22-14-45-52-730.png|width=966,height=677!
I'm not really familiar with Java, but it looks like this is related to finalizers.
> some SolrCore are not released after being removed
> --
>
>                 Key: SOLR-14201
>                 URL: https://issues.apache.org/jira/browse/SOLR-14201
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Christine Poerschke
>            Priority: Major
>         Attachments: image-2020-01-22-10-39-15-301.png, image-2020-01-22-10-42-17-511.png, image-2020-01-22-12-28-46-241.png, image-2020-01-22-14-45-52-730.png
>
> [~vinhlh] reported in SOLR-10506 (affecting 6.5 with fixes in 6.6.6 and 7.0):
> bq. In 7.7.2, some SolrCore still are not released after being removed.
> https://issues.apache.org/jira/secure/attachment/12991357/image-2020-01-20-14-51-26-411.png
> Starting this ticket for a separate investigation and fix. A next investigative step could be to try and reproduce the issue on the latest 8.x release.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1217: SOLR-14223 PublicKeyHandler consumes a lot of entropy during tests
dweiss commented on a change in pull request #1217: SOLR-14223 PublicKeyHandler consumes a lot of entropy during tests
URL: https://github.com/apache/lucene-solr/pull/1217#discussion_r371655634

## File path: solr/test-framework/src/java/org/apache/solr/util/NotSecurePsuedoRandom.java ##

@@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.util;
+
+import java.security.SecureRandom;
+import java.security.SecureRandomSpi;
+import java.util.Random;
+
+/**
+ * A mocked up instance of SecureRandom that just uses {@link Random} under the covers.
+ * This is to prevent blocking issues that arise in platform default
+ * SecureRandom instances due to too many instances / not enough random entropy.
+ * Tests do not need secure SSL.
+ */
+public class NotSecurePsuedoRandom extends SecureRandom {
+  public static final SecureRandom INSTANCE = new NotSecurePsuedoRandom();
+  private static final Random RAND = new Random(42);

Review comment:
This must *not* be static or shared across test instances. A better solution would be to create this off an initial long seed and this seed should be taken from RandomizedContext.current..().random().nextLong().
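A minimal sketch of the shape dweiss is suggesting: instead of a static `Random(42)` shared by every test, each instance gets its own `Random` built from an externally supplied long seed (e.g. the test framework's per-test random). The class and constructor names below are illustrative, not the actual patch, and the `null` provider argument is an assumption that the test code never inspects the provider.

```java
import java.security.SecureRandom;
import java.security.SecureRandomSpi;
import java.util.Random;

/**
 * Hypothetical per-instance variant: a non-secure SecureRandom whose
 * pseudo-random state is seeded once, at construction, from a seed the
 * caller supplies (e.g. randomizedtesting's per-test random). Nothing
 * here is static, so two test instances never share RNG state.
 */
public class SeededPseudoRandom extends SecureRandom {
    public SeededPseudoRandom(long seed) {
        // SecureRandom(SecureRandomSpi, Provider) is the protected
        // SPI constructor; the provider is irrelevant for tests.
        super(new Spi(seed), null);
    }

    private static class Spi extends SecureRandomSpi {
        private final Random rand; // java.util.Random: NOT cryptographically secure

        Spi(long seed) {
            this.rand = new Random(seed);
        }

        @Override
        protected void engineSetSeed(byte[] seed) {
            // Ignored: determinism per instance is the whole point.
        }

        @Override
        protected void engineNextBytes(byte[] bytes) {
            rand.nextBytes(bytes);
        }

        @Override
        protected byte[] engineGenerateSeed(int numBytes) {
            byte[] out = new byte[numBytes];
            rand.nextBytes(out);
            return out;
        }
    }
}
```

Two instances built from the same seed then produce identical byte streams, which keeps test runs reproducible while avoiding both entropy starvation and cross-test sharing.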
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
[jira] [Updated] (SOLR-14201) some SolrCore are not released after being removed
[ https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinh Le updated SOLR-14201:
---
Attachment: image-2020-01-28-16-17-44-030.png
[jira] [Updated] (SOLR-14201) some SolrCore are not released after being removed
[ https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinh Le updated SOLR-14201:
---
Attachment: image-2020-01-28-16-19-43-760.png
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1218: Javacc erick
dweiss commented on a change in pull request #1218: Javacc erick
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371656425

## File path: gradle/defaults-java.gradle ##

@@ -25,13 +25,13 @@ allprojects {
   tasks.withType(JavaCompile) {
     options.encoding = "UTF-8"
     options.compilerArgs += [
-      "-Xlint",

Review comment:
Are these differences on EOLs? I think they've been normalized recently so it should be LF.
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1218: Javacc erick
dweiss commented on a change in pull request #1218: Javacc erick
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371657147

## File path: gradle/generation/javacc.gradle ##

@@ -0,0 +1,102 @@
+// Add a top-level pseudo-task to which we will attach individual regenerate tasks.
+import static groovy.io.FileType.*
+
+configure(rootProject) {
+  configurations {
+    javacc
+  }
+
+  dependencies {
+    javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}"
+  }
+
+  task javacc() {
+    description "Regenerate sources for corresponding javacc grammar files."
+    group "generation"
+
+    dependsOn ":lucene:queryparser:javaccParserClassic"
+    dependsOn ":lucene:queryparser:javaccParserSurround"
+    dependsOn ":lucene:queryparser:javaccParserFlexible"
+  }
+}
+
+// We always regenerate, no need to declare outputs.
+class JavaCCTask extends DefaultTask {
+  @Input
+  File javaccFile
+
+  JavaCCTask() {
+    dependsOn(project.rootProject.configurations.javacc)
+  }
+
+  @TaskAction
+  def generate() {
+    if (!javaccFile || !javaccFile.exists()) {
+      throw new RuntimeException("JavaCC input file does not exist: ${javaccFile}")
+    }
+    // Remove old files so we can regenerate them
+    def parentDir = javaccFile.parentFile
+    parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->
+      if (file.text.contains("Generated By:JavaCC")) {
+        file.delete()
+      }
+    }
+    logger.lifecycle("Regenerating JavaCC:\n from: ${javaccFile}\n   to: ${parentDir}")
+
+    project.javaexec {
+      classpath {
+        project.rootProject.configurations.javacc
+      }
+      main = "org.javacc.parser.Main"
+      args += "-OUTPUT_DIRECTORY=${parentDir}"
+      args += [javaccFile]

Review comment:
no need for array around javaccFile?
[jira] [Updated] (SOLR-14201) some SolrCore are not released after being removed
[ https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinh Le updated SOLR-14201:
---
Attachment: image-2020-01-28-16-20-50-709.png
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1218: Javacc erick
dweiss commented on a change in pull request #1218: Javacc erick
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371657030

## File path: gradle/generation/javacc.gradle ##

+    parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->

Review comment:
Oh, what's FILES -- I don't know this construct.
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1218: Javacc erick
dweiss commented on a change in pull request #1218: Javacc erick
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371657978

## File path: gradle/generation/javacc.gradle ##

+    javaccFile = file('src/java/org/apache/lucene/queryparser/classic/QueryParser.jj')
+    def parent = javaccFile.parentFile.toString() // I'll need this later.
+
+    doLast {
+      // There'll be a lot of cleanup in here to get precommits and builds to pass, but as long as we don't

Review comment:
I think it'd be ideal to regenerate with ant first (to eliminate any overlays that have accumulated), commit that, then regenerate with gradle. With any local patches applied the result should be identical -- that's how you'll know the process is the same as with ant (git diff should be empty)?
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1218: Javacc erick
dweiss commented on a change in pull request #1218: Javacc erick
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371656935

## File path: gradle/generation/javacc.gradle ##

+    parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->

Review comment:
If these files get overwritten I don't think we should care about explicit deletions (?).
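To answer the `FILES` question above: in the gradle script it is `groovy.io.FileType.FILES` (star-imported at the top of the file), which tells `eachFileMatch` to visit regular files only, never subdirectories. A hypothetical Java rendering of the same cleanup loop, for readers who don't know the Groovy idiom (`GeneratedFileCleaner` is an illustrative name, not code from the PR):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Java equivalent of:
//   parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->
//     if (file.text.contains("Generated By:JavaCC")) file.delete()
//   }
// i.e. scan only the regular *.java files directly in the grammar file's
// directory and delete those carrying the JavaCC generation marker.
public class GeneratedFileCleaner {
    public static int deleteGenerated(Path parentDir) throws IOException {
        int deleted = 0;
        try (Stream<Path> entries = Files.list(parentDir)) { // non-recursive, like eachFileMatch
            for (Path p : (Iterable<Path>) entries::iterator) {
                if (Files.isRegularFile(p)                       // FILES: skip directories
                        && p.getFileName().toString().endsWith(".java")
                        && Files.readString(p).contains("Generated By:JavaCC")) {
                    Files.delete(p);                             // same as file.delete()
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```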
[jira] [Commented] (SOLR-14201) some SolrCore are not released after being removed
[ https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024957#comment-17024957 ]

Vinh Le commented on SOLR-14201:
-

Hmm, interesting. I assumed that optimize is synchronous (as it has to wait). Let me try each case:

h3. Removing the optimize step

It seems classes loaded still keeps increasing:
!image-2020-01-28-16-17-44-030.png|width=540,height=389!

One thing I forgot to mention: if I click the "Perform GC" button in VisualVM to trigger GC manually,
!image-2020-01-28-16-19-43-760.png|width=329,height=106!
we can immediately see a drop in classes loaded in the VisualVM UI (at 4:20 PM):
!image-2020-01-28-16-20-50-709.png|width=915,height=544!

But checking via the Metrics API:
{code}
http --timeout=300 "http://localhost:8983/solr/admin/metrics" | jq '.metrics."solr.jvm"."classes.loaded"'
9635
{code}
it remains the same.
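The "drop in VisualVM but flat metric" observation above is consistent with the JVM exposing two different class counters: the number of classes *currently* loaded (which falls when unloading happens) and the *cumulative* total loaded since startup (which never decreases). A small probe over the standard management API shows both; whether Solr's `classes.loaded` metric is wired to the current or the cumulative counter is an assumption to verify against the metrics code, not a fact established in this thread.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Prints the JVM's three class-loading counters.
// "total loaded" is cumulative and monotonically non-decreasing;
// "currently loaded" drops when class unloading occurs (e.g. after a
// forced GC releases a collection's classloader). A metric backed by
// the cumulative counter would stay flat even as VisualVM shows a drop.
public class ClassLoadProbe {
    public static void main(String[] args) {
        ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
        System.out.println("currently loaded = " + cl.getLoadedClassCount());
        System.out.println("total loaded     = " + cl.getTotalLoadedClassCount());
        System.out.println("unloaded         = " + cl.getUnloadedClassCount());
    }
}
```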
[jira] [Updated] (SOLR-14201) some SolrCore are not released after being removed
[ https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinh Le updated SOLR-14201:
---
Attachment: image-2020-01-28-16-59-51-813.png
[jira] [Commented] (SOLR-14201) some SolrCore are not released after being removed
[ https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024969#comment-17024969 ]

Vinh Le commented on SOLR-14201:
-

h3. Without both optimize and alias

{code:bash}
#!/bin/bash -e
HOST=http://localhost:8983/solr

# Based on alias
# PREV_COLLECTION=$(http --timeout=300 "$HOST/admin/collections?action=LISTALIASES" | jq -r ".aliases.SGFAS")
# Based on last collection
PREV_COLLECTION=$(http --timeout=300 "$HOST/admin/collections?action=LIST" | jq -r ".collections[0]")

COLLECTION="next_$(gdate +%H%M%S)"
# COLLECTION="next_1029"

echo "Create new collection = $COLLECTION"
http --timeout=300 POST "$HOST/admin/collections?action=CREATE&name=$COLLECTION&collection.configName=seafas&numShards=1"

echo "Push data to new collection"
cat docs.xml | http --timeout=300 POST "$HOST/$COLLECTION/update?commitWithin=1000&overwrite=true&wt=json" "Content-Type: text/xml"

# echo "Optimize"
# http --timeout=300 "$HOST/$COLLECTION/update?optimize=true&maxSegments=1&waitSearcher=false"

# echo "Update alias"
# http --timeout=300 "$HOST/admin/collections?action=CREATEALIAS&collections=$COLLECTION&name=SGFAS"

echo "Delete previous collection = $PREV_COLLECTION"
http --timeout=300 "$HOST/admin/collections?action=DELETE&name=$PREV_COLLECTION"

echo "Classes.loaded"
http --timeout=300 "http://localhost:8983/solr/admin/metrics" | jq '.metrics."solr.jvm"."classes.loaded"'
{code}
Basically, just remove the previous collection after creating a new one.
!image-2020-01-28-16-59-51-813.png|width=853,height=645!
Classes loaded still keeps increasing.
[jira] [Comment Edited] (SOLR-14201) some SolrCore are not released after being removed
[ https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024969#comment-17024969 ] Vinh Le edited comment on SOLR-14201 at 1/28/20 9:01 AM: - h3. Without both optimize and alias {code:java} // code placeholder {code} Basically, just remove the previous collection after creating a new one. !image-2020-01-28-16-59-51-813.png|width=853,height=645! Classes loaded still keeps increasing. was (Author: vinhlh): h3. Without both optimize and alias {quote}HOST=http://localhost:8983/solr # Base on alias # PREV_COLLECTION=$(http --timeout=300 "$HOST/admin/collections?action=LISTALIASES" | jq -r ".aliases.SGFAS") # Base on last collection PREV_COLLECTION=$(http --timeout=300 "$HOST/admin/collections?action=LIST" | jq -r ".collections[0]") COLLECTION="next_$(gdate +%H%M%S)" # COLLECTION="next_1029" echo "Create new collection = $COLLECTION" http --timeout=300 POST "$HOST/admin/collections?action=CREATE&name=$COLLECTION&collection.configName=seafas&numShards=1" echo "Push data to new collection" cat docs.xml | http --timeout=300 POST "$HOST/$COLLECTION/update?commitWithin=1000&overwrite=true&wt=json" "Content-Type: text/xml" # echo "Optimize" # http --timeout=300 "$HOST/$COLLECTION/update?optimize=true&maxSegments=1&waitSearcher=false" # echo "Update alias" # http --timeout=300 "$HOST/admin/collections?action=CREATEALIAS&collections=$COLLECTION&name=SGFAS" echo "Delete previous collection = $PREV_COLLECTION" http --timeout=300 "$HOST/admin/collections?action=DELETE&name=$PREV_COLLECTION" echo "Classes.loaded" http --timeout=300 "http://localhost:8983/solr/admin/metrics"; | jq '.metrics."solr.jvm"."classes.loaded"'{quote} Basically, just remove the previous collection after creating a new one. !image-2020-01-28-16-59-51-813.png|width=853,height=645! Classes loaded still keeps increasing. 
[jira] [Comment Edited] (SOLR-14201) some SolrCore are not released after being removed
[ https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024969#comment-17024969 ] Vinh Le edited comment on SOLR-14201 at 1/28/20 9:02 AM:
-
h3. Without both optimize and alias
{code:java}
#!/bin/bash -e
HOST=http://localhost:8983/solr

# Based on the alias
# PREV_COLLECTION=$(http --timeout=300 "$HOST/admin/collections?action=LISTALIASES" | jq -r ".aliases.SGFAS")
# Based on the last collection
PREV_COLLECTION=$(http --timeout=300 "$HOST/admin/collections?action=LIST" | jq -r ".collections[0]")

COLLECTION="next_$(gdate +%H%M%S)"
# COLLECTION="next_1029"

echo "Create new collection = $COLLECTION"
http --timeout=300 POST "$HOST/admin/collections?action=CREATE&name=$COLLECTION&collection.configName=seafas&numShards=1"

echo "Push data to new collection"
cat docs.xml | http --timeout=300 POST "$HOST/$COLLECTION/update?commitWithin=1000&overwrite=true&wt=json" "Content-Type: text/xml"

# echo "Optimize"
# http --timeout=300 "$HOST/$COLLECTION/update?optimize=true&maxSegments=1&waitSearcher=false"

# echo "Update alias"
# http --timeout=300 "$HOST/admin/collections?action=CREATEALIAS&collections=$COLLECTION&name=SGFAS"

echo "Delete previous collection = $PREV_COLLECTION"
http --timeout=300 "$HOST/admin/collections?action=DELETE&name=$PREV_COLLECTION"

echo "Classes.loaded"
http --timeout=300 "http://localhost:8983/solr/admin/metrics" | jq '.metrics."solr.jvm"."classes.loaded"'
{code}
Basically, just remove the previous collection after creating a new one.
!image-2020-01-28-16-59-51-813.png|width=853,height=645!
Classes loaded still keeps increasing.

was (Author: vinhlh):
h3. Without both optimize and alias
{code:java}
// code placeholder
{code}
Basically, just remove the previous collection after creating a new one.
!image-2020-01-28-16-59-51-813.png|width=853,height=645!
Classes loaded still keeps increasing.
[jira] [Resolved] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020
[ https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl resolved SOLR-14224.
Resolution: Invalid

Please do not use Jira as a question-asking forum. Use the mailing list [solr-u...@lucene.apache.org|mailto:solr-u...@lucene.apache.org] to ask such questions. I'm closing this as invalid.

> Not able to build solr 6.6.2 from source after January 15, 2020
> ---
>
>                 Key: SOLR-14224
>                 URL: https://issues.apache.org/jira/browse/SOLR-14224
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>    Affects Versions: 6.6.2
>            Reporter: Guruprasad K K
>            Priority: Major
>
> After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 version uses http connection. So our builds are failing.
> But looks like latest version of solr has the fix to this in common_build.xml and other places where it uses https connection to maven.
> What is the work around for this if we cant upgrade the solr version and still if we want to use 6.6.2?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] andywebb1975 commented on a change in pull request #1210: SOLR-14219 force serialVersionUID of OverseerSolrResponse
andywebb1975 commented on a change in pull request #1210: SOLR-14219 force serialVersionUID of OverseerSolrResponse
URL: https://github.com/apache/lucene-solr/pull/1210#discussion_r371717458

## File path: solr/core/src/java/org/apache/solr/cloud/OverseerSolrResponse.java
##
@@ -26,7 +26,9 @@
 import java.util.Objects;

 public class OverseerSolrResponse extends SolrResponse {
-
+
+  private static final long serialVersionUID = 4721653044098960880L;

Review comment:
   hi Tomas, I still think it'd be better to set the serialVersionUID. It's _possible_ (though I think unlikely*) that there are systems where the previous (computed) value is different to `472165...`, but the new computed value (with the change to the class) would be different anyway, so either way they'll see an incompatibility. On systems using the standard build, we can make the new class backwards-compatible by adding serialVersionUID. It's guaranteed to be incompatible for everyone if we don't.

   Andy

   \* My reading of https://docs.oracle.com/javase/7/docs/platform/serialization/spec/class.html is that the computed UID is independent of the Java version, and that it should be set to the previously-computed value in version 2+ of a class.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards,
Apache Git Services
-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
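[A self-contained JDK sketch of the behavior discussed above. The class names here are hypothetical stand-ins, not Solr's actual OverseerSolrResponse; only the UID literal is taken from the patch.]

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Shows how an explicit serialVersionUID pins a class's stream identity,
// while an omitted one is computed by the JVM from the class's shape.
public class SerialUidDemo {
    // Pinned UID: later structural changes keep this value stable, so old
    // serialized streams remain readable.
    static class PinnedResponse implements Serializable {
        private static final long serialVersionUID = 4721653044098960880L;
        String payload;
    }

    // No UID declared: the JVM computes one from fields/methods/supertypes,
    // so any structural change silently breaks stream compatibility.
    static class ComputedResponse implements Serializable {
        String payload;
    }

    public static void main(String[] args) {
        long pinned = ObjectStreamClass.lookup(PinnedResponse.class).getSerialVersionUID();
        long computed = ObjectStreamClass.lookup(ComputedResponse.class).getSerialVersionUID();
        // The declared value is reported verbatim; the computed one differs.
        System.out.println("pinned=" + pinned);
        System.out.println("computed equals pinned: " + (computed == pinned));
    }
}
```

[Deserialization fails with InvalidClassException when the reader's UID differs from the one written into the stream, which is why pinning the previously computed value keeps old streams readable while omitting it guarantees breakage after any class change.]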
[jira] [Reopened] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020
[ https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guruprasad K K reopened SOLR-14224: --- This is not a question. This is a bug after jan 15th > Not able to build solr 6.6.2 from source after January 15, 2020 > --- > > Key: SOLR-14224 > URL: https://issues.apache.org/jira/browse/SOLR-14224 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 6.6.2 >Reporter: Guruprasad K K >Priority: Major > > After Jan 15th maven is allowing only https connections to repo. But solr > 6.6.2 version uses http connection. So our builds are failing. > But looks like latest version of solr has the fix to this in common_build.xml > and other places where it uses https connection to maven. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020
[ https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guruprasad K K updated SOLR-14224: -- Description: After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 version uses http connection. So our builds are failing. But looks like latest version of solr has the fix to this in common_build.xml and other places where it uses https connection to maven. was: After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 version uses http connection. So our builds are failing. But looks like latest version of solr has the fix to this in common_build.xml and other places where it uses https connection to maven. What is the work around for this if we cant upgrade the solr version and still if we want to use 6.6.2? > Not able to build solr 6.6.2 from source after January 15, 2020 > --- > > Key: SOLR-14224 > URL: https://issues.apache.org/jira/browse/SOLR-14224 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 6.6.2 >Reporter: Guruprasad K K >Priority: Major > > After Jan 15th maven is allowing only https connections to repo. But solr > 6.6.2 version uses http connection. So our builds are failing. > But looks like latest version of solr has the fix to this in common_build.xml > and other places where it uses https connection to maven. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020
[ https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guruprasad K K updated SOLR-14224: -- Description: After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 version uses http connection. So builds are failing. But looks like latest version of solr has the fix to this in common_build.xml and other places where it uses https connection to maven. was: After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 version uses http connection. So our builds are failing. But looks like latest version of solr has the fix to this in common_build.xml and other places where it uses https connection to maven. > Not able to build solr 6.6.2 from source after January 15, 2020 > --- > > Key: SOLR-14224 > URL: https://issues.apache.org/jira/browse/SOLR-14224 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 6.6.2 >Reporter: Guruprasad K K >Priority: Major > > After Jan 15th maven is allowing only https connections to repo. But solr > 6.6.2 version uses http connection. So builds are failing. > But looks like latest version of solr has the fix to this in common_build.xml > and other places where it uses https connection to maven. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
Robert Muir created LUCENE-9185:
---

             Summary: add "tests.profile" to gradle build to aid fixing slow tests
                 Key: LUCENE-9185
                 URL: https://issues.apache.org/jira/browse/LUCENE-9185
             Project: Lucene - Core
          Issue Type: Task
            Reporter: Robert Muir
         Attachments: LUCENE-9185.patch

It is kind of a hassle to profile slow tests to fix the bottlenecks.

The idea here is to make it dead easy to profile (just) the tests, capturing samples at a very low granularity, reducing noise as much as possible (e.g. not profiling the entire gradle build or anything) and print a simple report for quick iterating.

Here's a prototype of what I hacked together:

All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
{noformat}
...
PROFILE SUMMARY from 122464 samples
  tests.profile.count=10
  tests.profile.stacksize=1
  tests.profile.linenumbers=false
PERCENT  SAMPLES  STACK
2.59%    3170     org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
2.26%    2762     java.util.Arrays#fill()
1.59%    1953     com.carrotsearch.randomizedtesting.RandomizedContext#context()
1.24%    1523     java.util.Random#nextInt()
1.19%    1456     java.lang.StringUTF16#compress()
1.08%    1319     java.lang.StringLatin1#inflate()
1.00%    1228     java.lang.Integer#getChars()
0.99%    1214     java.util.Arrays#compareUnsigned()
0.96%    1179     java.util.zip.Inflater#inflateBytesBytes()
0.91%    1114     java.util.concurrent.atomic.AtomicLong#compareAndSet()

BUILD SUCCESSFUL in 3m 59s
{noformat}

If you look at this LZ4 assertReset method, you can see it's indeed way too expensive, checking 64K items every time.

To dig deeper into potential problems you can pass additional parameters (all of them used here for demonstration):
{{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true -Dtests.profile.count=8 -Dtests.profile.stacksize=20 -Dtests.profile.linenumbers=true}}

This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
{noformat}
...
PROFILE SUMMARY from 21355 samples
  tests.profile.count=8
  tests.profile.stacksize=20
  tests.profile.linenumbers=true
PERCENT  SAMPLES  STACK
26.30%   5617     sun.nio.ch.EPoll#wait():(Native code)
                    at sun.nio.ch.EPollSelectorImpl#doSelect():120
                    at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
                    at sun.nio.ch.SelectorImpl#select():141
                    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
                    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
                    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
                    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
                    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
                    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
                    at org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted code)
                    at org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
                    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
                    at java.lang.Thread#run():830
16.19%   3458     sun.nio.ch.EPoll#wait():(Native code)
                    at sun.nio.ch.EPollSelectorImpl#doSelect():120
                    at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
                    at sun.nio.ch.SelectorImpl#select():141
                    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
                    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
                    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
                    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
                    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
                    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
                    at org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted code)
                    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor#lambda$execute$0():210
                    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$270.1779693615#run():(Interpreted code)
                    at java.util.concurrent.ThreadPoolExecutor#runWorker():1128
                    at java.util.concurrent.ThreadPoolExecutor$Worker#run():628
                    at java.lang.Thread#run():830
13.15%   2808     sun.nio.ch.Net#accept():(Na
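[Aside: the kind of low-granularity sampling described above can be driven from plain JDK Flight Recorder. This is a hypothetical standalone sketch, not the LUCENE-9185 patch itself, which may wire profiling into the build differently; it records execution samples around some busy work and dumps them to a .jfr file for offline summarization.]

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.Recording;

// Minimal sampling-profiler sketch using JDK Flight Recorder (JDK 11+).
public class ProfileSketch {
    static volatile long sink; // volatile so the busy loop is not optimized away

    static long recordAndDump() throws Exception {
        try (Recording rec = new Recording()) {
            // Sample thread stacks at a fine granularity while recording.
            rec.enable("jdk.ExecutionSample").withPeriod(Duration.ofMillis(1));
            rec.start();
            for (int i = 0; i < 10_000_000; i++) {
                sink += Integer.toString(i).hashCode(); // busy work to sample
            }
            rec.stop();
            // Dump the recording for later analysis (e.g. a summary report).
            Path out = Files.createTempFile("tests-profile", ".jfr");
            rec.dump(out);
            return Files.size(out);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("dumped " + recordAndDump() + " bytes of profile data");
    }
}
```

[A post-processing step would then parse the .jfr file and aggregate the sampled stacks into a percent/samples/stack table like the one shown above.]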
[jira] [Updated] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9185: Attachment: LUCENE-9185.patch > add "tests.profile" to gradle build to aid fixing slow tests > > > Key: LUCENE-9185 > URL: https://issues.apache.org/jira/browse/LUCENE-9185 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9185.patch > > > It is kind of a hassle to profile slow tests to fix the bottlenecks > The idea here is to make it dead easy to profile (just) the tests, capturing > samples at a very low granularity, reducing noise as much as possible (e.g. > not profiling entire gradle build or anything) and print a simple report for > quick iterating. > Here's a prototype of what I hacked together: > All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}} > {noformat} > ... > PROFILE SUMMARY from 122464 samples > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT SAMPLES STACK > 2.59% 3170 > org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset() > 2.26% 2762java.util.Arrays#fill() > 1.59% 1953com.carrotsearch.randomizedtesting.RandomizedContext#context() > 1.24% 1523java.util.Random#nextInt() > 1.19% 1456java.lang.StringUTF16#compress() > 1.08% 1319java.lang.StringLatin1#inflate() > 1.00% 1228java.lang.Integer#getChars() > 0.99% 1214java.util.Arrays#compareUnsigned() > 0.96% 1179java.util.zip.Inflater#inflateBytesBytes() > 0.91% 1114java.util.concurrent.atomic.AtomicLong#compareAndSet() > BUILD SUCCESSFUL in 3m 59s > {noformat} > If you look at this LZ4 assertReset method, you can see its indeed way too > expensive, checking 64K items every time. 
> To dig deeper into potential problems you can pass additional parameters (all > of them used here for demonstration): > {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true > -Dtests.profile.count=8 -Dtests.profile.stacksize=20 > -Dtests.profile.linenumbers=true}} > This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ... > {noformat} > ... > PROFILE SUMMARY from 21355 samples > tests.profile.count=8 > tests.profile.stacksize=20 > tests.profile.linenumbers=true > PERCENT SAMPLES STACK > 26.30% 5617sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135 > at > org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted > code) > at > org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806 > at > org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938 > at java.lang.Thread#run():830 > 16.19% 3458sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184 > at > 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135 > at > org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted > code) > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor#lambda$execute$0():210 >
[jira] [Updated] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020
[ https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guruprasad K K updated SOLR-14224: -- Description: After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 version uses http connection. So builds are failing. But looks like latest version of solr has the fix to this in common_build.xml and other places where it uses https connection to maven. Error log: ivy-bootstrap1: [mkdir] Created dir: /root/.ant/lib [echo] installing ivy 2.3.0 to /root/.ant/lib [get] Getting: [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] [get] To: /root/.ant/lib/ivy-2.3.0.jar [get] Error opening connection [java.io|http://java.io/] .IOException: Server returned HTTP response code: 501 for URL: [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] [get] Error opening connection [java.io|http://java.io/] .IOException: Server returned HTTP response code: 501 for URL: [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] [get] Error opening connection [java.io|http://java.io/] .IOException: Server returned HTTP response code: 501 for URL: [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] [get] Can't get [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] to /root/.ant/lib/ivy-2.3.0.jar was: After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 version uses http connection. So builds are failing. But looks like latest version of solr has the fix to this in common_build.xml and other places where it uses https connection to maven. > Not able to build solr 6.6.2 from source after January 15, 2020 > --- > > Key: SOLR-14224 > URL: https://issues.apache.org/jira/browse/SOLR-14224 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. 
Issues are Public) >Affects Versions: 6.6.2 >Reporter: Guruprasad K K >Priority: Major > > After Jan 15th maven is allowing only https connections to repo. But solr > 6.6.2 version uses http connection. So builds are failing. > But looks like latest version of solr has the fix to this in common_build.xml > and other places where it uses https connection to maven. > > Error log: > ivy-bootstrap1: > [mkdir] Created dir: /root/.ant/lib > [echo] installing ivy 2.3.0 to /root/.ant/lib > [get] Getting: > [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] > [get] To: /root/.ant/lib/ivy-2.3.0.jar > [get] Error opening connection > [java.io|http://java.io/] > .IOException: Server returned HTTP response code: 501 for URL: > [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] > [get] Error opening connection > [java.io|http://java.io/] > .IOException: Server returned HTTP response code: 501 for URL: > [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] > [get] Error opening connection > [java.io|http://java.io/] > .IOException: Server returned HTTP response code: 501 for URL: > [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] > [get] Can't get > [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] > to /root/.ant/lib/ivy-2.3.0.jar -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025032#comment-17025032 ] Robert Muir commented on LUCENE-9185: - Attached is my initial stab... its helpful to me at least when tracking these things down. cc [~dweiss] > add "tests.profile" to gradle build to aid fixing slow tests > > > Key: LUCENE-9185 > URL: https://issues.apache.org/jira/browse/LUCENE-9185 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9185.patch > > > It is kind of a hassle to profile slow tests to fix the bottlenecks > The idea here is to make it dead easy to profile (just) the tests, capturing > samples at a very low granularity, reducing noise as much as possible (e.g. > not profiling entire gradle build or anything) and print a simple report for > quick iterating. > Here's a prototype of what I hacked together: > All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}} > {noformat} > ... > PROFILE SUMMARY from 122464 samples > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT SAMPLES STACK > 2.59% 3170 > org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset() > 2.26% 2762java.util.Arrays#fill() > 1.59% 1953com.carrotsearch.randomizedtesting.RandomizedContext#context() > 1.24% 1523java.util.Random#nextInt() > 1.19% 1456java.lang.StringUTF16#compress() > 1.08% 1319java.lang.StringLatin1#inflate() > 1.00% 1228java.lang.Integer#getChars() > 0.99% 1214java.util.Arrays#compareUnsigned() > 0.96% 1179java.util.zip.Inflater#inflateBytesBytes() > 0.91% 1114java.util.concurrent.atomic.AtomicLong#compareAndSet() > BUILD SUCCESSFUL in 3m 59s > {noformat} > If you look at this LZ4 assertReset method, you can see its indeed way too > expensive, checking 64K items every time. 
> To dig deeper into potential problems you can pass additional parameters (all > of them used here for demonstration): > {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true > -Dtests.profile.count=8 -Dtests.profile.stacksize=20 > -Dtests.profile.linenumbers=true}} > This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ... > {noformat} > ... > PROFILE SUMMARY from 21355 samples > tests.profile.count=8 > tests.profile.stacksize=20 > tests.profile.linenumbers=true > PERCENT SAMPLES STACK > 26.30% 5617sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135 > at > org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted > code) > at > org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806 > at > org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938 > at java.lang.Thread#run():830 > 16.19% 3458sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184 > at > 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135 > at > org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted > code) >
[jira] [Updated] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020
[ https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guruprasad K K updated SOLR-14224: -- Description: After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 version uses http connection. So builds are failing. But looks like latest version of solr has the fix to this in common_build.xml and other places where it uses https connection to maven. Error log: ivy-bootstrap1: [mkdir] Created dir: /root/.ant/lib [echo] installing ivy 2.3.0 to /root/.ant/lib [get] Getting: [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] [get] To: /root/.ant/lib/ivy-2.3.0.jar [get] Error opening connection [java.io|http://java.io/] .IOException: Server returned HTTP response code: 501 for URL: [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] [get] Error opening connection [java.io|http://java.io/] .IOException: Server returned HTTP response code: 501 for URL: [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] [get] Error opening connection [java.io|http://java.io/] .IOException: Server returned HTTP response code: 501 for URL: [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] [get] Can't get [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] to /root/.ant/lib/ivy-2.3.0.jar [NOTE]: It works on latest version of solr, where http is converted to https was: After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 version uses http connection. So builds are failing. But looks like latest version of solr has the fix to this in common_build.xml and other places where it uses https connection to maven. 
Error log: ivy-bootstrap1: [mkdir] Created dir: /root/.ant/lib [echo] installing ivy 2.3.0 to /root/.ant/lib [get] Getting: [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] [get] To: /root/.ant/lib/ivy-2.3.0.jar [get] Error opening connection [java.io|http://java.io/] .IOException: Server returned HTTP response code: 501 for URL: [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] [get] Error opening connection [java.io|http://java.io/] .IOException: Server returned HTTP response code: 501 for URL: [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] [get] Error opening connection [java.io|http://java.io/] .IOException: Server returned HTTP response code: 501 for URL: [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] [get] Can't get [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] to /root/.ant/lib/ivy-2.3.0.jar > Not able to build solr 6.6.2 from source after January 15, 2020 > --- > > Key: SOLR-14224 > URL: https://issues.apache.org/jira/browse/SOLR-14224 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 6.6.2 >Reporter: Guruprasad K K >Priority: Major > > After Jan 15th maven is allowing only https connections to repo. But solr > 6.6.2 version uses http connection. So builds are failing. > But looks like latest version of solr has the fix to this in common_build.xml > and other places where it uses https connection to maven. 
> > Error log: > ivy-bootstrap1: > [mkdir] Created dir: /root/.ant/lib > [echo] installing ivy 2.3.0 to /root/.ant/lib > [get] Getting: > [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] > [get] To: /root/.ant/lib/ivy-2.3.0.jar > [get] Error opening connection > [java.io|http://java.io/] > .IOException: Server returned HTTP response code: 501 for URL: > [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] > [get] Error opening connection > [java.io|http://java.io/] > .IOException: Server returned HTTP response code: 501 for URL: > [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] > [get] Error opening connection > [java.io|http://java.io/] > .IOException: Server returned HTTP response code: 501 for URL: > [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] > [get] Can't get > [http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar] > to /root/.ant/lib/ivy-2.3.0.jar > > > > > [NOTE]: It works on latest version of solr, where http is converted to https -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9186) remove linefiledocs usage from basetokenstreamtestcase
Robert Muir created LUCENE-9186:
---
Summary: remove linefiledocs usage from basetokenstreamtestcase
Key: LUCENE-9186
URL: https://issues.apache.org/jira/browse/LUCENE-9186
Project: Lucene - Core
Issue Type: Task
Components: general/test
Reporter: Robert Muir

LineFileDocs is slow, even to open. That's because it (very slowly) "skips" to a pseudorandom position in a 5MB gzip stream when you open it.
There was a time when we didn't have a nice string generator for tests (TestUtil.randomAnalysisString), but now we do. And when it was introduced, it found interesting new things that LineFileDocs never found.
This speeds up all the analyzer tests.
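What makes the open slow is that gzip offers no random access: "skipping" into the stream still inflates every preceding byte. A small self-contained sketch of that behavior (a toy payload stands in for the 5MB line file; this is an illustration, not Lucene's code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipSkipDemo {
    // Gzip-compress a byte array in memory.
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(plain);
        }
        return bos.toByteArray();
    }

    // "Seek" to offset and return the next decompressed byte. skip() must
    // inflate all `offset` preceding bytes: there is no random access into a
    // gzip stream, which is why opening at a pseudorandom position is slow.
    static int byteAt(byte[] compressed, long offset) throws IOException {
        GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed));
        long skipped = 0;
        while (skipped < offset) {
            long n = in.skip(offset - skipped);
            if (n <= 0) break;   // defensive: skip() may return 0
            skipped += n;
        }
        return in.read();
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = "abcdefghijklmnopqrstuvwxyz".getBytes(StandardCharsets.UTF_8);
        byte[] compressed = gzip(plain);
        System.out.println((char) byteAt(compressed, 13));  // prints "n"
    }
}
```

A generator like TestUtil.randomAnalysisString avoids the decompress-to-seek cost entirely, since it produces test strings directly.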
[jira] [Updated] (LUCENE-9186) remove linefiledocs usage from basetokenstreamtestcase
[ https://issues.apache.org/jira/browse/LUCENE-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9186: Attachment: LUCENE-9186.patch > remove linefiledocs usage from basetokenstreamtestcase > -- > > Key: LUCENE-9186 > URL: https://issues.apache.org/jira/browse/LUCENE-9186 > Project: Lucene - Core > Issue Type: Task > Components: general/test >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9186.patch > > > LineFileDocs is slow, even to open. That's because it (very slowly) "skips" > to a pseudorandom position into a 5MB gzip stream when you open it. > There was a time when we didn't have a nice string generator for tests > (TestUtil.randomAnalysisString), but now we do. And when it was introduced it > found interesting new things that linefiledocs never found. > This speeds up all the analyzer tests.
[jira] [Created] (LUCENE-9187) remove too-expensive assert from LZ4 HighCompressionHashTable
Robert Muir created LUCENE-9187:
---
Summary: remove too-expensive assert from LZ4 HighCompressionHashTable
Key: LUCENE-9187
URL: https://issues.apache.org/jira/browse/LUCENE-9187
Project: Lucene - Core
Issue Type: Task
Reporter: Robert Muir

This is the slowest method in the lucene tests. See LUCENE-9185 for what I mean.
If you look at it, it's checking 64K values every time the assert is called.
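To make the cost concrete, here is a sketch of the pattern being removed (field and method names are assumed for illustration; this is not the actual HighCompressionHashTable code): a 64K-entry table whose reset path asserts that every slot is clean, so with -ea enabled each call pays 65,536 comparisons.

```java
public class HashTableSketch {
    private static final int SIZE = 1 << 16;      // 64K entries
    private final int[] table = new int[SIZE];    // 0 means "empty" here

    // The expensive check: a full O(64K) scan. Free with -da, but the test
    // runner enables assertions, so every reset pays the whole scan.
    boolean isFullyReset() {
        for (int v : table) {
            if (v != 0) {
                return false;
            }
        }
        return true;
    }

    void set(int slot, int value) {
        table[slot] = value;
    }

    void clear(int slot) {
        table[slot] = 0;
    }

    void reset() {
        assert isFullyReset();    // the hot spot the issue proposes to remove
        // ... proceed with compressing the next block ...
    }
}
```

Moving such an invariant into a dedicated unit test (as suggested later in the thread) keeps the coverage without taxing every caller.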
[jira] [Updated] (LUCENE-9187) remove too-expensive assert from LZ4 HighCompressionHashTable
[ https://issues.apache.org/jira/browse/LUCENE-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9187: Attachment: LUCENE-9187.patch > remove too-expensive assert from LZ4 HighCompressionHashTable > - > > Key: LUCENE-9187 > URL: https://issues.apache.org/jira/browse/LUCENE-9187 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9187.patch > > > This is the slowest method in the lucene tests. See LUCENE-9185 for what I > mean. > If you look at it, its checking 64k values every time the assert is called.
[jira] [Commented] (LUCENE-9187) remove too-expensive assert from LZ4 HighCompressionHashTable
[ https://issues.apache.org/jira/browse/LUCENE-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025048#comment-17025048 ] Robert Muir commented on LUCENE-9187: - cc [~jpountz] > remove too-expensive assert from LZ4 HighCompressionHashTable > - > > Key: LUCENE-9187 > URL: https://issues.apache.org/jira/browse/LUCENE-9187 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9187.patch > > > This is the slowest method in the lucene tests. See LUCENE-9185 for what I > mean. > If you look at it, its checking 64k values every time the assert is called.
[GitHub] [lucene-solr] alessandrobenedetti commented on issue #357: [SOLR-12238] Synonym Queries boost by payload
alessandrobenedetti commented on issue #357: [SOLR-12238] Synonym Queries boost by payload URL: https://github.com/apache/lucene-solr/pull/357#issuecomment-579200844

I followed the refactoring comments from both @diegoceccarelli and @romseygeek. The PR seems much cleaner now, on both the Lucene and Solr sides. Copious tests are present and should cover the various situations. A few questions remain:
- from a test I read a comment from @dsmiley saying: "confirm autoGeneratePhraseQueries always builds OR queries" (org.apache.solr.search.TestSolrQueryParser#testSynonymQueryStyle)
- what can we do about SpanBoostQuery? I was completely unaware it is going to be deprecated.
Let me know

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-12238) Synonym Query Style Boost By Payload
[ https://issues.apache.org/jira/browse/SOLR-12238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025050#comment-17025050 ] Alessandro Benedetti commented on SOLR-12238:

I followed the refactoring comments from both @diegoceccarelli and @romseygeek. The PR seems much cleaner now, on both the Lucene and Solr sides. Copious tests are present and should cover the various situations. A few questions remain:
- from a test I read a comment from @dsmiley saying: "confirm autoGeneratePhraseQueries always builds OR queries" (org.apache.solr.search.TestSolrQueryParser#testSynonymQueryStyle)
- what can we do about SpanBoostQuery? I was completely unaware it is going to be deprecated.
Let me know

> Synonym Query Style Boost By Payload
>
> Key: SOLR-12238
> URL: https://issues.apache.org/jira/browse/SOLR-12238
> Project: Solr
> Issue Type: Improvement
> Components: query parsers
> Affects Versions: 7.2
> Reporter: Alessandro Benedetti
> Priority: Major
> Attachments: SOLR-12238.patch, SOLR-12238.patch, SOLR-12238.patch, SOLR-12238.patch
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> This improvement is built on top of the Synonym Query Style feature and brings the possibility of boosting synonym queries using the associated payload.
> It introduces two new modalities for the Synonym Query Style:
> PICK_BEST_BOOST_BY_PAYLOAD -> build a disjunction query with the clauses boosted by payload
> AS_DISTINCT_TERMS_BOOST_BY_PAYLOAD -> build a Boolean query with the clauses boosted by payload
> These new synonym query styles assume payloads are available, so they must be used in conjunction with a token filter able to produce payloads.
> A synonym.txt example could be:
> # Synonyms used by Payload Boost
> tiger => tiger|1.0, Big_Cat|0.8, Shere_Khan|0.9
> leopard => leopard, Big_Cat|0.8, Bagheera|0.9
> lion => lion|1.0, panthera leo|0.99, Simba|0.8
> snow_leopard => panthera uncia|0.99, snow leopard|1.0
> A simple token filter to populate the payloads from such a synonym.txt is:
> <filter class="solr.DelimitedPayloadTokenFilterFactory" delimiter="|"/>
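To illustrate the delimited-payload format used in the synonym file above (a standalone sketch, not Solr's parsing code): each synonym optionally carries a float after the delimiter, which the proposed query styles read back as a boost.

```java
public class PayloadSynonym {
    final String term;
    final float boost;

    PayloadSynonym(String term, float boost) {
        this.term = term;
        this.boost = boost;
    }

    // Parse "Big_Cat|0.8" into (term, boost); a bare term gets boost 1.0,
    // mirroring how a delimited-payload token filter splits on '|'.
    static PayloadSynonym parse(String entry) {
        int i = entry.lastIndexOf('|');
        if (i < 0) {
            return new PayloadSynonym(entry.trim(), 1.0f);
        }
        return new PayloadSynonym(entry.substring(0, i).trim(),
                                  Float.parseFloat(entry.substring(i + 1).trim()));
    }

    public static void main(String[] args) {
        for (String s : "tiger|1.0, Big_Cat|0.8, Shere_Khan|0.9".split(",")) {
            PayloadSynonym p = parse(s);
            System.out.println(p.term + " boosted by " + p.boost);
        }
    }
}
```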
[GitHub] [lucene-solr] ErickErickson commented on issue #1218: LUCENE-9134: Javacc skeleton
ErickErickson commented on issue #1218: LUCENE-9134: Javacc skeleton URL: https://github.com/apache/lucene-solr/pull/1218#issuecomment-579219139 Didn't title it right.
[GitHub] [lucene-solr] ErickErickson closed pull request #1218: LUCENE-9134: Javacc skeleton
ErickErickson closed pull request #1218: LUCENE-9134: Javacc skeleton URL: https://github.com/apache/lucene-solr/pull/1218
[GitHub] [lucene-solr] ErickErickson opened a new pull request #1219: LUCENE-9134: Javacc skeleton for Gradle regenerate
ErickErickson opened a new pull request #1219: LUCENE-9134: Javacc skeleton for Gradle regenerate URL: https://github.com/apache/lucene-solr/pull/1219

Here are the build changes to get javacc to run, modeled on the jflex changes (many thanks for the model). Only two files changed here ;) If the structure is OK, I'll fill in the "doLast" blocks with the cleanup code and maybe be able to extract some common parts.
NOTE: you can't even compile the result of running this, because I wanted the changes to the build structure to be clear first, so I didn't include the cleanup tasks yet.
So if this structure is OK, should I merge it into master before or after the rest of the cleanup? My assumption is after. I want to try to get all the warnings etc. out of the generated code in the next phase, to reduce the temptation for people to make hand-edits.
I didn't intentionally change the line endings in defaults-java; there's no other change there...
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025074#comment-17025074 ] Dawid Weiss commented on LUCENE-9185:

Great beyond words. I never had a chance to use jfr but I'll surely want to dig. A few nitpicks:
{code}
+allprojects {
+  tasks.withType(Test) {
+    def profileMode = propertyOrDefault("tests.profile", false)
+    if (profileMode) {
{code}
you can apply the if outside and only apply the allprojects closure if tests.profile is true at the root project level (I assume we won't have to enable it for individual projects within a larger build).
{code}
+gradlew -p lucene/core test -Dtests.profile=true
{code}
It will work, but -Ptests.profile=true would be more Gradle-esque (it sets a project property as opposed to a system property).
{code}
+gradle.buildFinished {
+  if (!recordings.isEmpty()) {
+    def args = ["ProfileResults"]
+    for (file in recordings.getFiles()) {
+      args += file.toString()
+    }
+    ProfileResults.main(args as String[])
+  }
+}
{code}
If you pull up the if, then this thing can go underneath so that it's not adding any closure if it's not enabled. Also: it'll always display the profile, even on a failed build. Look at slowest-tests-at-end.gradle - this one only displays the slowest tests if the build is successful. Finally, you may want to simplify to something like (didn't check but should work):
{code}
def args = ["ProfileResults"]
args += recordings.getFiles().collect { it.toString() }
{code}
> add "tests.profile" to gradle build to aid fixing slow tests > > > Key: LUCENE-9185 > URL: https://issues.apache.org/jira/browse/LUCENE-9185 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9185.patch > > > It is kind of a hassle to profile slow tests to fix the bottlenecks > The idea here is to make it dead easy to profile (just) the tests, capturing > samples at a very low granularity, reducing noise as much as possible (e.g.
> not profiling entire gradle build or anything) and print a simple report for > quick iterating. > Here's a prototype of what I hacked together: > All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}} > {noformat} > ... > PROFILE SUMMARY from 122464 samples > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT SAMPLES STACK > 2.59% 3170 > org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset() > 2.26% 2762java.util.Arrays#fill() > 1.59% 1953com.carrotsearch.randomizedtesting.RandomizedContext#context() > 1.24% 1523java.util.Random#nextInt() > 1.19% 1456java.lang.StringUTF16#compress() > 1.08% 1319java.lang.StringLatin1#inflate() > 1.00% 1228java.lang.Integer#getChars() > 0.99% 1214java.util.Arrays#compareUnsigned() > 0.96% 1179java.util.zip.Inflater#inflateBytesBytes() > 0.91% 1114java.util.concurrent.atomic.AtomicLong#compareAndSet() > BUILD SUCCESSFUL in 3m 59s > {noformat} > If you look at this LZ4 assertReset method, you can see its indeed way too > expensive, checking 64K items every time. > To dig deeper into potential problems you can pass additional parameters (all > of them used here for demonstration): > {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true > -Dtests.profile.count=8 -Dtests.profile.stacksize=20 > -Dtests.profile.linenumbers=true}} > This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ... > {noformat} > ... 
> PROFILE SUMMARY from 21355 samples > tests.profile.count=8 > tests.profile.stacksize=20 > tests.profile.linenumbers=true > PERCENT SAMPLES STACK > 26.30% 5617sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135 > at > org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted > code) > at >
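For readers unfamiliar with the report being reviewed above: the PROFILE SUMMARY groups JFR samples by stack frame and prints them sorted by percentage. A toy sketch of that aggregation step (input shape and names are assumed; the real ProfileResults parses .jfr recordings):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

public class ProfileSummarySketch {
    // Count samples per top frame and render percent-sorted report lines.
    static List<String> summarize(List<String> topFrames, int count) {
        int total = topFrames.size();
        Map<String, Long> byFrame = topFrames.stream()
            .collect(Collectors.groupingBy(f -> f, Collectors.counting()));
        return byFrame.entrySet().stream()
            .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
            .limit(count)
            .map(e -> String.format(Locale.ROOT, "%.2f%% %d %s",
                     100.0 * e.getValue() / total, e.getValue(), e.getKey()))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // One string per JFR execution sample's top frame (assumed input).
        List<String> samples = Arrays.asList(
            "LZ4$HighCompressionHashTable#assertReset()",
            "LZ4$HighCompressionHashTable#assertReset()",
            "java.util.Arrays#fill()",
            "java.util.Random#nextInt()");
        summarize(samples, 3).forEach(System.out::println);
    }
}
```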
[jira] [Commented] (LUCENE-9187) remove too-expensive assert from LZ4 HighCompressionHashTable
[ https://issues.apache.org/jira/browse/LUCENE-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025085#comment-17025085 ] Adrien Grand commented on LUCENE-9187: -- This profile option is pretty cool. +1 to removing the assert, I'd like to make it a dedicated test instead but it doesn't have to block the removal of the assertion. > remove too-expensive assert from LZ4 HighCompressionHashTable > - > > Key: LUCENE-9187 > URL: https://issues.apache.org/jira/browse/LUCENE-9187 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9187.patch > > > This is the slowest method in the lucene tests. See LUCENE-9185 for what I > mean. > If you look at it, its checking 64k values every time the assert is called.
[jira] [Commented] (LUCENE-9186) remove linefiledocs usage from basetokenstreamtestcase
[ https://issues.apache.org/jira/browse/LUCENE-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025091#comment-17025091 ] Dawid Weiss commented on LUCENE-9186: - +1. > remove linefiledocs usage from basetokenstreamtestcase > -- > > Key: LUCENE-9186 > URL: https://issues.apache.org/jira/browse/LUCENE-9186 > Project: Lucene - Core > Issue Type: Task > Components: general/test >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9186.patch > > > LineFileDocs is slow, even to open. That's because it (very slowly) "skips" > to a pseudorandom position into a 5MB gzip stream when you open it. > There was a time when we didn't have a nice string generator for tests > (TestUtil.randomAnalysisString), but now we do. And when it was introduced it > found interesting new things that linefiledocs never found. > This speeds up all the analyzer tests.
[jira] [Commented] (LUCENE-9134) Port ant-regenerate tasks to Gradle build
[ https://issues.apache.org/jira/browse/LUCENE-9134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025101#comment-17025101 ] Erick Erickson commented on LUCENE-9134: New PR with skeleton of javacc changes. Just for the structure of the Gradle changes, won't be committable until after the post-generation cleanup is done. > Port ant-regenerate tasks to Gradle build > - > > Key: LUCENE-9134 > URL: https://issues.apache.org/jira/browse/LUCENE-9134 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: LUCENE-9134.patch, core_regen.patch > > Time Spent: 6h 20m > Remaining Estimate: 0h > > Take II about organizing this beast. > A list of items that needs to be added or requires work. If you'd like to > work on any of these, please add your name to the list. See process comments > at parent (LUCENE-9077) > * Implement jflex task in lucene/core > * Implement jflex tasks in lucene/analysis > * Implement javacc tasks in lucene/queryparser (EOE) > * Implement javacc tasks in solr/core (EOE) > * Implement python tasks in lucene (? there are several javadocs mentions in > the build.xml, this may be irrelevant to the Gradle effort). > * Implement python tasks in lucene/core > * Implement python tasks in lucene/analysis > > Here are the "regenerate" targets I found in the ant version. There are a > couple that I don't have evidence for or against being rebuilt > // Very top level > {code:java} > ./build.xml: > ./build.xml: failonerror="true"> > ./build.xml: depends="regenerate,-check-after-regeneration"/> > {code} > // top level Lucene. 
This includes the core/build.xml and > test-framework/build.xml files > {code:java} > ./lucene/build.xml: > ./lucene/build.xml: inheritall="false"> > ./lucene/build.xml: > {code} > // This one has quite a number of customizations to > {code:java} > ./lucene/core/build.xml: depends="createLevAutomata,createPackedIntSources,jflex"/> > {code} > // This one has a bunch of code modifications _after_ javacc is run on > certain of the > // output files. Save this one for last? > {code:java} > ./lucene/queryparser/build.xml: > {code} > // the files under ../lucene/analysis... are pretty self contained. I expect > these could be done as a unit > {code:java} > ./lucene/analysis/build.xml: > ./lucene/analysis/build.xml: > ./lucene/analysis/common/build.xml: depends="jflex,unicode-data"/> > ./lucene/analysis/icu/build.xml: depends="gen-utr30-data-files,gennorm2,genrbbi"/> > ./lucene/analysis/kuromoji/build.xml: depends="build-dict"/> > ./lucene/analysis/nori/build.xml: depends="build-dict"/> > ./lucene/analysis/opennlp/build.xml: depends="train-test-models"/> > {code} > > // These _are_ regenerated from the top-level regenerate target, but for – > LUCENE-9080//the changes were only in imports so there are no > //corresponding files checked in in that JIRA > {code:java} > ./lucene/expressions/build.xml: depends="run-antlr"/> > {code} > // Apparently unrelated to ./lucene/analysis/opennlp/build.xml > "train-test-models" target > // Apparently not rebuilt from the top level, but _are_ regenerated when > executed from > // ./solr/contrib/langid > {code:java} > ./solr/contrib/langid/build.xml: depends="train-test-models"/> > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020
[ https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-14224. --- Resolution: Invalid We stopped active support for Solr 6x quite some time ago and will not be releasing any new versions. Arguing about whether it's a bug or not is pointless, please ask the question on the user's list as Jan suggested and do not reopen this JIRA. > Not able to build solr 6.6.2 from source after January 15, 2020 > --- > > Key: SOLR-14224 > URL: https://issues.apache.org/jira/browse/SOLR-14224 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 6.6.2 >Reporter: Guruprasad K K >Priority: Major > > After Jan 15th maven is allowing only https connections to repo. But solr > 6.6.2 version uses http connection. So builds are failing. > But looks like latest version of solr has the fix to this in common_build.xml > and other places where it uses https connection to maven. 
> > Error log:
> ivy-bootstrap1:
> [mkdir] Created dir: /root/.ant/lib
> [echo] installing ivy 2.3.0 to /root/.ant/lib
> [get] Getting: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
> [get] To: /root/.ant/lib/ivy-2.3.0.jar
> [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
> [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
> [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
> [get] Can't get http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar to /root/.ant/lib/ivy-2.3.0.jar
>
> [NOTE]: It works on latest version of solr, where http is converted to https
[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton
ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371799564 ## File path: gradle/defaults-java.gradle ## @@ -25,13 +25,13 @@ allprojects { tasks.withType(JavaCompile) { options.encoding = "UTF-8" options.compilerArgs += [ -"-Xlint", Review comment: OK, I'll check. I don't even know how they got changed frankly, I'll revert
[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton
ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371799808 ## File path: gradle/generation/javacc.gradle ## @@ -0,0 +1,102 @@ +// Add a top-level pseudo-task to which we will attach individual regenerate tasks. +import static groovy.io.FileType.* + +configure(rootProject) { + configurations { +javacc + } + + dependencies { +javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}" + } + + task javacc() { +description "Regenerate sources for corresponding javacc grammar files." +group "generation" + +dependsOn ":lucene:queryparser:javaccParserClassic" +dependsOn ":lucene:queryparser:javaccParserSurround" +dependsOn ":lucene:queryparser:javaccParserFlexible" + } +} + +// We always regenerate, no need to declare outputs. +class JavaCCTask extends DefaultTask { + @Input + File javaccFile + + JavaCCTask() { +dependsOn(project.rootProject.configurations.javacc) + } + + @TaskAction + def generate() { +if (!javaccFile || !javaccFile.exists()) { + throw new RuntimeException("JavaCC input file does not exist: ${javaccFile}") +} +// Remove old files so we can regenerate them +def parentDir = javaccFile.parentFile +parentDir.eachFileMatch FILES, ~/.*\.java/, { file -> Review comment: I copied it from some example and it worked... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton
ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371801798 ## File path: gradle/generation/javacc.gradle ## @@ -0,0 +1,102 @@ +// Add a top-level pseudo-task to which we will attach individual regenerate tasks. +import static groovy.io.FileType.* + +configure(rootProject) { + configurations { +javacc + } + + dependencies { +javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}" + } + + task javacc() { +description "Regenerate sources for corresponding javacc grammar files." +group "generation" + +dependsOn ":lucene:queryparser:javaccParserClassic" +dependsOn ":lucene:queryparser:javaccParserSurround" +dependsOn ":lucene:queryparser:javaccParserFlexible" + } +} + +// We always regenerate, no need to declare outputs. +class JavaCCTask extends DefaultTask { + @Input + File javaccFile + + JavaCCTask() { +dependsOn(project.rootProject.configurations.javacc) + } + + @TaskAction + def generate() { +if (!javaccFile || !javaccFile.exists()) { + throw new RuntimeException("JavaCC input file does not exist: ${javaccFile}") +} +// Remove old files so we can regenerate them +def parentDir = javaccFile.parentFile +parentDir.eachFileMatch FILES, ~/.*\.java/, { file -> Review comment: Actually, they aren't overwritten. If they're not deleted you get messages during execution like: "Warning: TokenMgrError.java: File is obsolete. Please rename or delete this file so that a new one can be generated for you." This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
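The cleanup convention discussed above (delete sources carrying the "Generated By:JavaCC" banner so a fresh javacc run can write them without "File is obsolete" warnings) can be sketched as a standalone equivalent of the eachFileMatch loop; paths here are hypothetical:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class GeneratedFileCleaner {
    // Delete .java files that carry the "Generated By:JavaCC" banner so they
    // can be regenerated; hand-written files in the same directory survive.
    static List<Path> clean(Path dir) throws IOException {
        List<Path> deleted = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.java")) {
            for (Path f : files) {
                String text = new String(Files.readAllBytes(f), StandardCharsets.UTF_8);
                if (text.contains("Generated By:JavaCC")) {
                    Files.delete(f);
                    deleted.add(f);
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        // Self-contained demo on a temp directory (a real run would point at
        // the grammar's output directory instead).
        Path demo = Files.createTempDirectory("javacc-demo");
        Files.write(demo.resolve("QueryParser.java"),
            "/* Generated By:JavaCC: Do not edit this line. */".getBytes(StandardCharsets.UTF_8));
        System.out.println(clean(demo) + " deleted");
    }
}
```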
[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton
ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371801937 ## File path: gradle/generation/javacc.gradle ## @@ -0,0 +1,102 @@ +// Add a top-level pseudo-task to which we will attach individual regenerate tasks. +import static groovy.io.FileType.* + +configure(rootProject) { + configurations { +javacc + } + + dependencies { +javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}" + } + + task javacc() { +description "Regenerate sources for corresponding javacc grammar files." +group "generation" + +dependsOn ":lucene:queryparser:javaccParserClassic" +dependsOn ":lucene:queryparser:javaccParserSurround" +dependsOn ":lucene:queryparser:javaccParserFlexible" + } +} + +// We always regenerate, no need to declare outputs. +class JavaCCTask extends DefaultTask { + @Input + File javaccFile + + JavaCCTask() { +dependsOn(project.rootProject.configurations.javacc) + } + + @TaskAction + def generate() { +if (!javaccFile || !javaccFile.exists()) { + throw new RuntimeException("JavaCC input file does not exist: ${javaccFile}") +} +// Remove old files so we can regenerate them +def parentDir = javaccFile.parentFile +parentDir.eachFileMatch FILES, ~/.*\.java/, { file -> + if (file.text.contains("Generated By:JavaCC")) { +file.delete() + } +} +logger.lifecycle("Regenerating JavaCC:\n from: ${javaccFile}\nto: ${parentDir}") + +project.javaexec { + classpath { +project.rootProject.configurations.javacc + } + main = "org.javacc.parser.Main" + args += "-OUTPUT_DIRECTORY=${parentDir}" + args += [javaccFile] Review comment: I'll change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025108#comment-17025108 ] Robert Muir commented on LUCENE-9185:

{quote} Also: it'll always display the profile, even on a failed build. Look at slowest-tests-at-end.gradle - this one only displays the slowest tests if the build is successful. {quote}

Honestly, when looking at slow Solr tests, I remove that logic locally from {{slowest-tests-at-end.gradle}}. It takes me 80 minutes to run the Solr tests, and 90% of the time some test fails and then I get no output from it at all. This is frustrating because then I wasted 80 minutes. I feel the same way about it here: it's about performance, and you asked for profile output, and it found jfr files, so why not show it?

> add "tests.profile" to gradle build to aid fixing slow tests > > > Key: LUCENE-9185 > URL: https://issues.apache.org/jira/browse/LUCENE-9185 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9185.patch > > > It is kind of a hassle to profile slow tests to fix the bottlenecks > The idea here is to make it dead easy to profile (just) the tests, capturing > samples at a very low granularity, reducing noise as much as possible (e.g.
> PROFILE SUMMARY from 122464 samples > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT SAMPLES STACK > 2.59% 3170 > org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset() > 2.26% 2762java.util.Arrays#fill() > 1.59% 1953com.carrotsearch.randomizedtesting.RandomizedContext#context() > 1.24% 1523java.util.Random#nextInt() > 1.19% 1456java.lang.StringUTF16#compress() > 1.08% 1319java.lang.StringLatin1#inflate() > 1.00% 1228java.lang.Integer#getChars() > 0.99% 1214java.util.Arrays#compareUnsigned() > 0.96% 1179java.util.zip.Inflater#inflateBytesBytes() > 0.91% 1114java.util.concurrent.atomic.AtomicLong#compareAndSet() > BUILD SUCCESSFUL in 3m 59s > {noformat} > If you look at this LZ4 assertReset method, you can see its indeed way too > expensive, checking 64K items every time. > To dig deeper into potential problems you can pass additional parameters (all > of them used here for demonstration): > {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true > -Dtests.profile.count=8 -Dtests.profile.stacksize=20 > -Dtests.profile.linenumbers=true}} > This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ... > {noformat} > ... 
> PROFILE SUMMARY from 21355 samples > tests.profile.count=8 > tests.profile.stacksize=20 > tests.profile.linenumbers=true > PERCENT SAMPLES STACK > 26.30% 5617sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135 > at > org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted > code) > at > org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806 > at > org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938 > at java.lang.Thread#run():830 > 16.19% 3458sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.ec
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025110#comment-17025110 ] Dawid Weiss commented on LUCENE-9185: - Ok, fair enough. With profiling it's explicit; those slow-tests are always shown. Maybe we should make the latter optional as well?
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025112#comment-17025112 ] Robert Muir commented on LUCENE-9185: -
{quote} It will work but -Ptests.profile=true would be more gradle-esque (it sets a project property as opposed to a system property). {quote}
The tool uses actual system properties for the more advanced options (e.g. {{-Dtests.profile.count=20}}). Seems a little evil to mix -P's and -D's when documenting this? I'll be honest, the difference is super confusing.
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025116#comment-17025116 ] Robert Muir commented on LUCENE-9185: -
{quote} Maybe we should make the latter optional as well? {quote}
Do you mean the whole {{slowest-tests-at-end}}? Given how insanely slow some of these tests are, I feel it should be mandatory to see it? :) But if I had to ask for a wishlist of improvements to {{slowest-tests-at-end}}, they would be:
* option (or change of behavior) to print them always, even if a test sporadically failed.
* property to increase the count (e.g. from 10 to 100) and threshold (e.g. from 500ms to 250ms, yes we may get there soon in lucene!)
* some way to show or count beforeclass/afterclass time. I'm not sure it is currently considered, only time for each method (I assume that includes setup+teardown)
* some way to see the slowest suites, too. Even if we fix all the tests to be 100ms, it can cause bottlenecks if a suite has a TON of tests, because of bad gradle load balancing.
[GitHub] [lucene-solr] shalinmangar opened a new pull request #1220: SOLR-13996: Refactor HttpShardHandler.prepDistributed method
shalinmangar opened a new pull request #1220: SOLR-13996: Refactor HttpShardHandler.prepDistributed method
URL: https://github.com/apache/lucene-solr/pull/1220

# Description
This PR refactors the huge HttpShardHandler.prepDistributed method into smaller pieces.

# Solution
It separates the logic for cloud and non-cloud modes into separate classes which are implementations of a new (experimental/internal) interface named ReplicaSource.

# Tests
This PR passes all current tests and I'll add more tests before merging.
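For readers skimming the PR, a minimal sketch of the shape it describes (illustrative only; the method names and the ','/'|' shard syntax are assumptions for this sketch, not the PR's actual code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the ReplicaSource idea: one interface, with the
// cloud and standalone lookup strategies living in separate implementations,
// so a method like prepDistributed no longer branches on mode inline.
interface ReplicaSource {
    int getSliceCount();                        // number of shards/slices to query
    List<String> getReplicasBySlice(int slice); // replica URLs for one slice
}

// Standalone mode: shard URLs arrive directly (e.g. via a shards parameter),
// with ',' separating shards and '|' separating alternative replicas.
class StandaloneReplicaSource implements ReplicaSource {
    private final List<List<String>> replicas = new ArrayList<>();

    StandaloneReplicaSource(String shardsParam) {
        for (String shard : shardsParam.split(",")) {
            replicas.add(Arrays.asList(shard.split("\\|")));
        }
    }

    @Override public int getSliceCount() { return replicas.size(); }
    @Override public List<String> getReplicasBySlice(int slice) { return replicas.get(slice); }
}
```

A cloud implementation would instead resolve slices and replicas from cluster state; callers only ever see the ReplicaSource interface.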
[jira] [Assigned] (SOLR-13996) Refactor HttpShardHandler#prepDistributed() into smaller pieces
[ https://issues.apache.org/jira/browse/SOLR-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reassigned SOLR-13996: Assignee: Shalin Shekhar Mangar > Refactor HttpShardHandler#prepDistributed() into smaller pieces > --- > > Key: SOLR-13996 > URL: https://issues.apache.org/jira/browse/SOLR-13996 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Assignee: Shalin Shekhar Mangar >Priority: Major > Attachments: SOLR-13996.patch, SOLR-13996.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Currently, it is very hard to understand all the various things being done in > HttpShardHandler. I'm starting with refactoring the prepDistributed() method > to make it easier to grasp. It has standalone and cloud code intertwined, and > wanted to cleanly separate them out. Later, we can even have two separate > method (for standalone and cloud, each). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025120#comment-17025120 ] Dawid Weiss commented on LUCENE-9185: - It looks odd to your eyes because it's a legacy from ant. These are different things: system properties are global, project properties are local (or looked up via scopes). You can set project properties with finer granularity than globally. As for the patch: it works because you invoke a static method on that class and it inherits gradle's environment. A nicer way to do it would be to pass arguments like tests.profile.count explicitly to ProfileResults (via args, setters or otherwise) preparing them on gradle side. The propertyOrDefault utility is actually a hack in the build so that people used to global system properties can still pass them to gradle build... maybe it was a mistake that I added it in the first place, don't know.
[jira] [Commented] (SOLR-13996) Refactor HttpShardHandler#prepDistributed() into smaller pieces
[ https://issues.apache.org/jira/browse/SOLR-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025122#comment-17025122 ] Shalin Shekhar Mangar commented on SOLR-13996: -- I've been working on a refactoring of this method and it's my fault that I didn't see this issue and the PR earlier. However, my goals are a bit more ambitious. This first PR https://github.com/apache/lucene-solr/pull/1220 is just a re-organization of the code but I'll be expanding it further by adding tests for each individual case and then move on to improve performance. Currently this class is quite inefficient as it parses and re-parses and creates strings out of shard urls even for solr cloud cases. The goal is to eventually have a cloud focused class that is extremely efficient and avoids unnecessary copies of shards/replicas completely. This will require changes in other places as well e.g. the host checker can be made to operate in a streaming mode etc. I haven't quite decided on how the replica list transformer should be changed. I hope you don't mind Ishan but I'll assign this issue and take this forward. Reviews welcome!
[jira] [Commented] (LUCENE-9171) Synonyms Boost by Payload
[ https://issues.apache.org/jira/browse/LUCENE-9171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025124#comment-17025124 ] David Smiley commented on LUCENE-9171: -- I have my doubts on AttributeSource and arrays of such; I'll put my comments in the PR in a minute. BTW I agree with Alan about keeping things simple in its base class. In Lucene we fight complexity all the time.

> Synonyms Boost by Payload
>
> Key: LUCENE-9171
> URL: https://issues.apache.org/jira/browse/LUCENE-9171
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/queryparser
> Reporter: Alessandro Benedetti
> Priority: Major
>
> I have been working on the additional capability of boosting queries by term payloads through a parameter to enable it in the Lucene Query Builder.
> This has been done targeting the Synonyms Query.
> It is parametric, so it is meant to make no difference unless the feature is enabled.
> Solr has its bits to comply through its SynonymsQueryStyles
[GitHub] [lucene-solr] dsmiley commented on issue #357: [SOLR-12238] Synonym Queries boost by payload
dsmiley commented on issue #357: [SOLR-12238] Synonym Queries boost by payload URL: https://github.com/apache/lucene-solr/pull/357#issuecomment-579257834 I noticed the use of {{AttributeSource[]}} (array of AttributeSource), done at the behest of @romseygeek . That seems fishy... shouldn't it be a TokenStream, which is a more memory efficient iterator over AttributeSource changing state? I see, for example, the _existing_ {{createSpanQuery(TokenStream in, String field)}} but the PR adds {{newSpanQuery(String field, AttributeSource[] attributes)}} and makes the former call the latter. Why bother; why not retain createSpanQuery and if Solr wants to override it to do payload boosting then it can?
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025128#comment-17025128 ] Dawid Weiss commented on LUCENE-9185: - bq. But if I had to ask for a wishlist of improvements All of them make sense but you're killing me... ;) It's also worth noting that the "slowest tests" list depends on the level of parallelism and what other tests ran in the background alongside (one memory or I/O heavy test slows down everything running with it).
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025129#comment-17025129 ] Robert Muir commented on LUCENE-9185: - {quote} As for the patch: it works because you invoke a static method on that class and it inherits gradle's environment. A nicer way to do it would be to pass arguments like tests.profile.count explicitly to ProfileResults (via args, setters or otherwise) preparing them on gradle side. {quote} I know, i wanted to keep a simple main() method, to make it easy to improve or fix bugs, iterate quickly, e.g. {noformat} $ java buildSrc/src/main/java/org/apache/lucene/gradle/ProfileResults.java ./lucene/analysis/opennlp/build/tmp/tests-cwd/hotspot-pid-133619-id-1-2020_01_28_06_11_03.jfr ./lucene/analysis/opennlp/build/tmp/tests-cwd/hotspot-pid-133548-id-1-2020_01_28_06_11_02.jfr PROFILE SUMMARY from 306 samples tests.profile.count=10 tests.profile.stacksize=1 tests.profile.linenumbers=false PERCENT SAMPLES STACK 13.73% 42 java.util.zip.Inflater#inflateBytesBytes() 2.94% 9 java.lang.StringLatin1#indexOf() 2.61% 8 java.io.UnixFileSystem#getBooleanAttributes0() 2.29% 7 java.util.DualPivotQuicksort#sort() 1.96% 6 java.lang.StringLatin1#charAt() 1.96% 6 java.io.UnixFileSystem#normalize() 1.63% 5 java.lang.StringLatin1#inflate() 1.31% 4 java.lang.String#startsWith() 1.31% 4 java.lang.ClassLoader#defineClass1() 1.31% 4 java.lang.StringLatin1#compareTo() {noformat} > add "tests.profile" to gradle build to aid fixing slow tests > > > Key: LUCENE-9185 > URL: https://issues.apache.org/jira/browse/LUCENE-9185 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9185.patch > > > It is kind of a hassle to profile slow tests to fix the bottlenecks > The idea here is to make it dead easy to profile (just) the tests, capturing > samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT  SAMPLES  STACK
>  2.59%   3170     org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
>  2.26%   2762     java.util.Arrays#fill()
>  1.59%   1953     com.carrotsearch.randomizedtesting.RandomizedContext#context()
>  1.24%   1523     java.util.Random#nextInt()
>  1.19%   1456     java.lang.StringUTF16#compress()
>  1.08%   1319     java.lang.StringLatin1#inflate()
>  1.00%   1228     java.lang.Integer#getChars()
>  0.99%   1214     java.util.Arrays#compareUnsigned()
>  0.96%   1179     java.util.zip.Inflater#inflateBytesBytes()
>  0.91%   1114     java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see it's indeed way too expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true -Dtests.profile.count=8 -Dtests.profile.stacksize=20 -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT  SAMPLES  STACK
> 26.30%   5617     sun.nio.ch.EPoll#wait():(Native code)
>            at sun.nio.ch.EPollSelectorImpl#doSelect():120
>            at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>            at sun.nio.ch.SelectorImpl#select():141
>            at org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>            at org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>            at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>            at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>            at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>            at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>            at org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted code)
>
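The PROFILE SUMMARY above is, at heart, a frequency count over sampled stack frames. The ProfileResults source itself isn't quoted in this thread, so the following is only a rough, hypothetical sketch of the aggregation step (class and method names are mine, not the patch's):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: count identical stack frames across profiler samples
 *  and report the most frequent ones, like the "PROFILE SUMMARY" table. */
public class ProfileSummarySketch {
    /** Returns the top-k frames by sample count, most frequent first. */
    public static List<Map.Entry<String, Integer>> topFrames(List<String> sampledFrames, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String frame : sampledFrames) {
            counts.merge(frame, 1, Integer::sum); // one sample = one count
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    public static void main(String[] args) {
        List<String> samples = List.of(
                "java.util.zip.Inflater#inflateBytesBytes()",
                "java.util.zip.Inflater#inflateBytesBytes()",
                "java.lang.StringLatin1#indexOf()");
        for (Map.Entry<String, Integer> e : topFrames(samples, 2)) {
            // PERCENT is just count / total samples.
            System.out.printf("%6.2f%% %7d  %s%n",
                    100.0 * e.getValue() / samples.size(), e.getValue(), e.getKey());
        }
    }
}
```

Sorting by sample count and truncating to tests.profile.count is what turns raw JFR samples into the short report shown above.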
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025142#comment-17025142 ] Robert Muir commented on LUCENE-9185:
{quote}
All of them make sense but you're killing me... It's also worth noting that the "slowest tests" list depends on the level of parallelism and what other tests ran in the background alongside (one memory- or I/O-heavy test slows down everything running with it).
{quote}
I know, but it's all the rudimentary "profiling" we have at the moment. Trying to change that!
[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton
ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371828729

## File path: gradle/generation/javacc.gradle

@@ -0,0 +1,102 @@
+// Add a top-level pseudo-task to which we will attach individual regenerate tasks.
+import static groovy.io.FileType.*
+
+configure(rootProject) {
+  configurations {
+    javacc
+  }
+
+  dependencies {
+    javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}"
+  }
+
+  task javacc() {
+    description "Regenerate sources for corresponding javacc grammar files."
+    group "generation"
+
+    dependsOn ":lucene:queryparser:javaccParserClassic"
+    dependsOn ":lucene:queryparser:javaccParserSurround"
+    dependsOn ":lucene:queryparser:javaccParserFlexible"
+  }
+}
+
+// We always regenerate, no need to declare outputs.
+class JavaCCTask extends DefaultTask {
+  @Input
+  File javaccFile
+
+  JavaCCTask() {
+    dependsOn(project.rootProject.configurations.javacc)
+  }
+
+  @TaskAction
+  def generate() {
+    if (!javaccFile || !javaccFile.exists()) {
+      throw new RuntimeException("JavaCC input file does not exist: ${javaccFile}")
+    }
+    // Remove old files so we can regenerate them
+    def parentDir = javaccFile.parentFile
+    parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->
+      if (file.text.contains("Generated By:JavaCC")) {
+        file.delete()
+      }
+    }
+    logger.lifecycle("Regenerating JavaCC:\n  from: ${javaccFile}\n    to: ${parentDir}")
+
+    project.javaexec {
+      classpath {
+        project.rootProject.configurations.javacc
+      }
+      main = "org.javacc.parser.Main"
+      args += "-OUTPUT_DIRECTORY=${parentDir}"
+      args += [javaccFile]
+    }
+  }
+}
+
+
+configure(project(":lucene:queryparser")) {
+  task javaccParserClassic(type: JavaCCTask) {
+    description "Regenerate classic query parser from java CC.java"
+    group "generation"
+
+    javaccFile = file('src/java/org/apache/lucene/queryparser/classic/QueryParser.jj')
+    def parent = javaccFile.parentFile.toString() // I'll need this later.
+
+    doLast {
+      // There'll be a lot of cleanup in here to get precommits and builds to pass, but as long as we don't

Review comment: That _should_ be the end product already; that's one of the reasons I spent so much time on the ant version, and why all those files were changed when I committed. At least I _think_ I got them all. At least that's what I remember doing... That said, I'll try not to go off in the weeds. Now that I've got the structure right, I'll see if I can get this to happen. Shouldn't actually be that much. Oh, and ignore PR 1219; I had a bad title for this PR and it didn't link. When I changed the title of this one it took a while to show up and I got impatient. 1219 and 1218 are identical. Finally, many thanks for your coaching (well, ok, outright fixing things)!

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
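For context, the diff's cleanup step decides which .java files are safe to delete by looking for the banner JavaCC writes into generated sources. A small, hypothetical Java illustration of that same check (file names invented for the demo):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class GeneratedFileSweeper {
    /** True if the source text carries the banner JavaCC writes into generated files. */
    public static boolean isJavaccGenerated(String source) {
        return source.contains("Generated By:JavaCC");
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a parser source directory; these paths are hypothetical.
        Path dir = Files.createTempDirectory("javacc-demo");
        Files.writeString(dir.resolve("QueryParser.java"),
                "/* Generated By:JavaCC: Do not edit this line. */ class QueryParser {}");
        Files.writeString(dir.resolve("QueryParserUtil.java"), "class QueryParserUtil {}");
        // Delete only files whose content says JavaCC generated them, like the gradle task does.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.java")) {
            for (Path f : files) {
                if (isJavaccGenerated(Files.readString(f))) {
                    Files.delete(f);
                }
            }
        }
        System.out.println(Files.exists(dir.resolve("QueryParser.java")));     // false
        System.out.println(Files.exists(dir.resolve("QueryParserUtil.java"))); // true
    }
}
```

Checking file content rather than file names means hand-written helpers sitting next to the generated parser survive regeneration.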
[jira] [Updated] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9185: Attachment: LUCENE-9185.patch
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025193#comment-17025193 ] Robert Muir commented on LUCENE-9185: - [~dweiss] I tried to fold in your feedback, can you take another look?
[jira] [Created] (SOLR-14225) Upgrade jaegertracing
Jan Høydahl created SOLR-14225: -- Summary: Upgrade jaegertracing Key: SOLR-14225 URL: https://issues.apache.org/jira/browse/SOLR-14225 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Jan Høydahl Upgrade jaegertracing from 0.35.5 to 1.1.0. This will also give us a newer libthrift, which is more stable and secure. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9185: Attachment: LUCENE-9185.patch
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025220#comment-17025220 ] Robert Muir commented on LUCENE-9185: - I added a bunch of crazy abstractions and constants to the java code so that the gradle code looks a little prettier. I realize you really hate how I did it before, but I want to keep the simple main method, and I don't think gradle's bad decisions should get in the way of that.
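The knobs discussed in this thread (tests.profile.count, tests.profile.stacksize, tests.profile.linenumbers) are plain -D system properties with defaults, which is what lets a bare main() stay decoupled from gradle. A hedged sketch of that pattern only (class and helper names hypothetical, not the actual ProfileResults code):

```java
/** Hypothetical sketch: read profiler options from -D system properties with defaults. */
public class ProfileOptionsSketch {
    public static int intProp(String name, int defaultValue) {
        String v = System.getProperty(name);
        return v == null ? defaultValue : Integer.parseInt(v);
    }

    public static boolean boolProp(String name, boolean defaultValue) {
        String v = System.getProperty(name);
        return v == null ? defaultValue : Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        // Defaults match the report header shown earlier in the thread.
        System.out.println("tests.profile.count=" + intProp("tests.profile.count", 10));
        System.out.println("tests.profile.stacksize=" + intProp("tests.profile.stacksize", 1));
        System.out.println("tests.profile.linenumbers=" + boolProp("tests.profile.linenumbers", false));
    }
}
```

With this shape, gradle only needs to forward the -D flags to the JVM that runs the report; nothing else crosses the boundary.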
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025233#comment-17025233 ] Dawid Weiss commented on LUCENE-9185: - It looks great, thanks Robert. I'd love to have some kind of task to display all these build options at some point. Currently this is done just for randomization options (try gradlew testOpts -p lucene/core) but I'm sure it could be pulled from other parts of the build and displayed consistently. For now it can stay as it is.
[jira] [Commented] (SOLR-14225) Upgrade jaegertracing
[ https://issues.apache.org/jira/browse/SOLR-14225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025234#comment-17025234 ] Dawid Weiss commented on SOLR-14225: It'd be great if the patch included corresponding gradle updates, Jan (if you have problems with something, let me know).
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025238#comment-17025238 ] Robert Muir commented on LUCENE-9185: - I agree, it would be nice. For now I added basic usage to the help and the reporter itself prints out the values of any fancy options. just trying to make it as easy as possible to keep the slow tests at bay... > add "tests.profile" to gradle build to aid fixing slow tests > > > Key: LUCENE-9185 > URL: https://issues.apache.org/jira/browse/LUCENE-9185 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9185.patch, LUCENE-9185.patch, LUCENE-9185.patch > > > It is kind of a hassle to profile slow tests to fix the bottlenecks > The idea here is to make it dead easy to profile (just) the tests, capturing > samples at a very low granularity, reducing noise as much as possible (e.g. > not profiling entire gradle build or anything) and print a simple report for > quick iterating. > Here's a prototype of what I hacked together: > All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}} > {noformat} > ... 
> PROFILE SUMMARY from 122464 samples > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT SAMPLES STACK > 2.59% 3170 > org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset() > 2.26% 2762java.util.Arrays#fill() > 1.59% 1953com.carrotsearch.randomizedtesting.RandomizedContext#context() > 1.24% 1523java.util.Random#nextInt() > 1.19% 1456java.lang.StringUTF16#compress() > 1.08% 1319java.lang.StringLatin1#inflate() > 1.00% 1228java.lang.Integer#getChars() > 0.99% 1214java.util.Arrays#compareUnsigned() > 0.96% 1179java.util.zip.Inflater#inflateBytesBytes() > 0.91% 1114java.util.concurrent.atomic.AtomicLong#compareAndSet() > BUILD SUCCESSFUL in 3m 59s > {noformat} > If you look at this LZ4 assertReset method, you can see its indeed way too > expensive, checking 64K items every time. > To dig deeper into potential problems you can pass additional parameters (all > of them used here for demonstration): > {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true > -Dtests.profile.count=8 -Dtests.profile.stacksize=20 > -Dtests.profile.linenumbers=true}} > This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ... > {noformat} > ... 
> PROFILE SUMMARY from 21355 samples > tests.profile.count=8 > tests.profile.stacksize=20 > tests.profile.linenumbers=true > PERCENT SAMPLES STACK > 26.30% 5617sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135 > at > org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted > code) > at > org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806 > at > org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938 > at java.lang.Thread#run():830 > 16.19% 3458sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYou
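The report shape shown here, percentage, sample count, collapsed stack, is essentially a frequency histogram over sampled stacks. A minimal sketch of that aggregation step (hypothetical sample strings; the actual reporter and its option handling live in the attached patch):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StackHistogram {
    // Collapse a list of sampled stacks into a count per unique stack.
    static Map<String, Long> histogram(List<String> samples) {
        Map<String, Long> counts = new HashMap<>();
        for (String stack : samples) {
            counts.merge(stack, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical samples; a real profiler would collect these from the JVM.
        List<String> samples = List.of(
            "java.util.Arrays#fill()",
            "java.util.Arrays#fill()",
            "java.util.Arrays#fill()",
            "java.util.Random#nextInt()");
        Map<String, Long> counts = histogram(samples);
        System.out.printf("PROFILE SUMMARY from %d samples%n", samples.size());
        System.out.println("PERCENT  SAMPLES  STACK");
        // Sort descending by sample count and print the top entries.
        counts.entrySet().stream()
            .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
            .limit(10)
            .forEach(e -> System.out.printf("%5.2f%%  %7d  %s%n",
                100.0 * e.getValue() / samples.size(), e.getValue(), e.getKey()));
    }
}
```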
[jira] [Created] (LUCENE-9188) Add jacoco code coverage support to gradle build
Robert Muir created LUCENE-9188: --- Summary: Add jacoco code coverage support to gradle build Key: LUCENE-9188 URL: https://issues.apache.org/jira/browse/LUCENE-9188 Project: Lucene - Core Issue Type: Task Components: general/build Reporter: Robert Muir Seems to be missing. I looked into it a little, all the documented ways of using the jacoco plugin seem to involve black magic if you are using "java" plugin, but we are using "javaLibrary", so I wasn't able to hold it right. This one should work very well, it has low overhead and should work fine running tests in parallel (since it supports merging of coverage data files: that's how it works in the ant build)
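For comparison, the documented wiring for the stock Gradle jacoco plugin (with the plain "java" plugin, which is exactly the setup the issue says does not match this build) looks roughly like:

```groovy
// build.gradle sketch -- stock 'jacoco' plugin wiring from the Gradle docs.
// The Lucene/Solr build may need different wiring, as the issue notes.
plugins {
    id 'java'
    id 'jacoco'
}

jacocoTestReport {
    dependsOn test          // generate the report after tests have run
    reports {
        xml.required = true
        html.required = true
    }
}
```

Whether this maps cleanly onto the "javaLibrary" conventions used here is the open question of the issue.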
[GitHub] [lucene-solr] alessandrobenedetti commented on issue #357: [SOLR-12238] Synonym Queries boost by payload
alessandrobenedetti commented on issue #357: [SOLR-12238] Synonym Queries boost by payload URL: https://github.com/apache/lucene-solr/pull/357#issuecomment-579326000 No strong opinion on that, it was actually the first time I used the AttributeSource so I am happy to switch to TokenStream if it is more memory efficient. The change shouldn't be too heavy. I will just wait for confirmation, and when we are all aligned I'll proceed with the implementation. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9004) Approximate nearest vector search
[ https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025251#comment-17025251 ] Michael Sokolov commented on LUCENE-9004: - > Is there any possible to merge LUCENE-9136 with this issue? This is already gigantic - what would be the benefit of merging? > Approximate nearest vector search > - > > Key: LUCENE-9004 > URL: https://issues.apache.org/jira/browse/LUCENE-9004 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael Sokolov >Priority: Major > Attachments: hnsw_layered_graph.png > > Time Spent: 3h 10m > Remaining Estimate: 0h > > "Semantic" search based on machine-learned vector "embeddings" representing > terms, queries and documents is becoming a must-have feature for a modern > search engine. SOLR-12890 is exploring various approaches to this, including > providing vector-based scoring functions. This is a spinoff issue from that. > The idea here is to explore approximate nearest-neighbor search. Researchers > have found an approach based on navigating a graph that partially encodes the > nearest neighbor relation at multiple scales can provide accuracy > 95% (as > compared to exact nearest neighbor calculations) at a reasonable cost. This > issue will explore implementing HNSW (hierarchical navigable small-world) > graphs for the purpose of approximate nearest vector search (often referred > to as KNN or k-nearest-neighbor search). > At a high level the way this algorithm works is this. First assume you have a > graph that has a partial encoding of the nearest neighbor relation, with some > short and some long-distance links. If this graph is built in the right way > (has the hierarchical navigable small world property), then you can > efficiently traverse it to find nearest neighbors (approximately) in log N > time where N is the number of nodes in the graph. I believe this idea was > pioneered in [1]. 
The great insight in that paper is that if you use the > graph search algorithm to find the K nearest neighbors of a new document > while indexing, and then link those neighbors (undirectedly, ie both ways) to > the new document, then the graph that emerges will have the desired > properties. > The implementation I propose for Lucene is as follows. We need two new data > structures to encode the vectors and the graph. We can encode vectors using a > light wrapper around {{BinaryDocValues}} (we also want to encode the vector > dimension and have efficient conversion from bytes to floats). For the graph > we can use {{SortedNumericDocValues}} where the values we encode are the > docids of the related documents. Encoding the interdocument relations using > docids directly will make it relatively fast to traverse the graph since we > won't need to lookup through an id-field indirection. This choice limits us > to building a graph-per-segment since it would be impractical to maintain a > global graph for the whole index in the face of segment merges. However > graph-per-segment is a very natural at search time - we can traverse each > segments' graph independently and merge results as we do today for term-based > search. > At index time, however, merging graphs is somewhat challenging. While > indexing we build a graph incrementally, performing searches to construct > links among neighbors. When merging segments we must construct a new graph > containing elements of all the merged segments. Ideally we would somehow > preserve the work done when building the initial graphs, but at least as a > start I'd propose we construct a new graph from scratch when merging. The > process is going to be limited, at least initially, to graphs that can fit > in RAM since we require random access to the entire graph while constructing > it: In order to add links bidirectionally we must continually update existing > documents. 
> I think we want to express this API to users as a single joint > {{KnnGraphField}} abstraction that joins together the vectors and the graph > as a single joint field type. Mostly it just looks like a vector-valued > field, but has this graph attached to it. > I'll push a branch with my POC and would love to hear comments. It has many > nocommits, basic design is not really set, there is no Query implementation > and no integration iwth IndexSearcher, but it does work by some measure using > a standalone test class. I've tested with uniform random vectors and on my > laptop indexed 10K documents in around 10 seconds and searched them at 95% > recall (compared with exact nearest-neighbor baseline) at around 250 QPS. I > haven't made any attempt to use multithreaded search for this, but it is > amenable to per-segment concurrency. > [1] > [https://www.semanticscholar.org/
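The graph traversal sketched in the description reduces, at a single layer, to a greedy descent: start at an entry node and keep moving to whichever neighbor is closest to the query until no neighbor improves. A toy sketch of that step (illustration only, not the proposed Lucene implementation):

```java
public class GreedyGraphSearch {
    // Greedily walk the neighbor graph toward the query vector; returns the
    // docid of the local minimum, i.e. the approximate nearest neighbor.
    public static int search(float[][] vectors, int[][] neighbors, float[] query, int entry) {
        int current = entry;
        while (true) {
            int best = current;
            double bestDist = dist(vectors[current], query);
            for (int n : neighbors[current]) {
                double d = dist(vectors[n], query);
                if (d < bestDist) { bestDist = d; best = n; }
            }
            if (best == current) return current; // no neighbor is closer: stop
            current = best;
        }
    }

    // Squared Euclidean distance; monotone in the true distance, so it is
    // sufficient for comparisons and avoids the sqrt.
    static double dist(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; sum += d * d; }
        return sum;
    }
}
```

The real HNSW algorithm repeats this descent over multiple layers of coarser-to-finer graphs and keeps a beam of candidates rather than a single node, which is what buys the reported recall.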
[jira] [Created] (LUCENE-9189) TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
Robert Muir created LUCENE-9189: --- Summary: TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes Key: LUCENE-9189 URL: https://issues.apache.org/jira/browse/LUCENE-9189 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir I thought it was just the testUpdatesOnDiskFull, but looks like this one needs to be nightly too. Should look more into the test, but I know something causes it to make such an insane amount of files, that sorting them becomes a bottleneck. I guess also related is that it would be great if MockDirectoryWrapper's disk full check didn't trigger a sort of the files (via listAll): it does this check on like every i/o, would be nice for it to be less absurd. Maybe instead the test could check for disk full on not every i/o but some random sample of them? Temporarily lets make it nightly... {noformat} PROFILE SUMMARY from 182501 samples tests.profile.count=10 tests.profile.stacksize=1 tests.profile.linenumbers=false PERCENT SAMPLES STACK 15.89% 28995 java.lang.StringLatin1#compareTo() 6.61% 12069 java.util.TimSort#mergeHi() 5.96% 10878 java.util.TimSort#binarySort() 3.41% 6231java.util.concurrent.ConcurrentHashMap#tabAt() 2.98% 5433java.util.Comparators$NaturalOrderComparator#compare() 2.12% 3876org.apache.lucene.store.DataOutput#copyBytes() 2.03% 3712java.lang.String#compareTo() 1.84% 3350java.util.concurrent.ConcurrentHashMap#get() 1.83% 3337java.util.TimSort#mergeLo() 1.67% 3047java.util.ArrayList#add() {noformat} All the file sorting is called from stacks like this, so its literally happening every writeByte() and so on {noformat} 0.73% 1329java.util.TimSort#binarySort() at java.util.TimSort#sort() at java.util.Arrays#sort() at java.util.ArrayList#sort() at java.util.stream.SortedOps$RefSortingSink#end() at java.util.stream.AbstractPipeline#copyInto() at java.util.stream.AbstractPipeline#wrapAndCopyInto() at java.util.stream.AbstractPipeline#evaluate() at java.util.stream.AbstractPipeline#evaluateToArrayNode() at 
java.util.stream.ReferencePipeline#toArray() at org.apache.lucene.store.ByteBuffersDirectory#listAll() at org.apache.lucene.store.MockDirectoryWrapper#sizeInBytes() at org.apache.lucene.store.MockIndexOutputWrapper#checkDiskFull() at org.apache.lucene.store.MockIndexOutputWrapper#writeBytes() at org.apache.lucene.store.MockIndexOutputWrapper#writeByte() at org.apache.lucene.store.DataOutput#writeInt() at org.apache.lucene.codecs.CodecUtil#writeFooter() at org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat#writeLiveDocs() at org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat#writeLiveDocs() at org.apache.lucene.index.PendingDeletes#writeLiveDocs() {noformat}
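The "random sample" idea from the description, running the expensive listAll()-based disk-full check on only a fraction of I/O calls instead of every one, could be as small as this (hypothetical helper, not MockDirectoryWrapper's actual code):

```java
import java.util.Random;

// Gate an expensive check so it runs on only a random fraction of calls,
// instead of on every single writeByte()/writeBytes().
public class SampledCheck {
    private final Random random;
    private final double probability;

    public SampledCheck(Random random, double probability) {
        this.random = random;
        this.probability = probability;
    }

    // True when the expensive disk-full check should actually run this time.
    public boolean shouldCheck() {
        return random.nextDouble() < probability;
    }
}
```

With, say, probability 0.01 the sort-heavy check still fires often enough to catch disk-full conditions in tests while cutting its cost by roughly two orders of magnitude.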
[jira] [Commented] (SOLR-13756) ivy cannot download org.restlet.ext.servlet jar
[ https://issues.apache.org/jira/browse/SOLR-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025259#comment-17025259 ] Zsolt Gyulavari commented on SOLR-13756: I've rebased and addressed the gradle build as well, however I think the cloudera repo is not needed anymore if not for the backup purposes. Otherwise we can remove it altogether. What do you think? > ivy cannot download org.restlet.ext.servlet jar > --- > > Key: SOLR-13756 > URL: https://issues.apache.org/jira/browse/SOLR-13756 > Project: Solr > Issue Type: Bug >Reporter: Chongchen Chen >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > I checkout the project and run `ant idea`, it will try to download jars. But > https://repo1.maven.org/maven2/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar > will return 404 now. > [ivy:retrieve] public: tried > [ivy:retrieve] > https://repo1.maven.org/maven2/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar > [ivy:retrieve]:: > [ivy:retrieve]:: FAILED DOWNLOADS:: > [ivy:retrieve]:: ^ see resolution messages for details ^ :: > [ivy:retrieve]:: > [ivy:retrieve]:: > org.restlet.jee#org.restlet;2.3.0!org.restlet.jar > [ivy:retrieve]:: > org.restlet.jee#org.restlet.ext.servlet;2.3.0!org.restlet.ext.servlet.jar > [ivy:retrieve]::
[jira] [Resolved] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9185. - Fix Version/s: master (9.0) Resolution: Fixed > add "tests.profile" to gradle build to aid fixing slow tests > > > Key: LUCENE-9185 > URL: https://issues.apache.org/jira/browse/LUCENE-9185 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Fix For: master (9.0) > > Attachments: LUCENE-9185.patch, LUCENE-9185.patch, LUCENE-9185.patch > > > It is kind of a hassle to profile slow tests to fix the bottlenecks > The idea here is to make it dead easy to profile (just) the tests, capturing > samples at a very low granularity, reducing noise as much as possible (e.g. > not profiling entire gradle build or anything) and print a simple report for > quick iterating. > Here's a prototype of what I hacked together: > All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}} > {noformat} > ... > PROFILE SUMMARY from 122464 samples > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT SAMPLES STACK > 2.59% 3170 > org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset() > 2.26% 2762java.util.Arrays#fill() > 1.59% 1953com.carrotsearch.randomizedtesting.RandomizedContext#context() > 1.24% 1523java.util.Random#nextInt() > 1.19% 1456java.lang.StringUTF16#compress() > 1.08% 1319java.lang.StringLatin1#inflate() > 1.00% 1228java.lang.Integer#getChars() > 0.99% 1214java.util.Arrays#compareUnsigned() > 0.96% 1179java.util.zip.Inflater#inflateBytesBytes() > 0.91% 1114java.util.concurrent.atomic.AtomicLong#compareAndSet() > BUILD SUCCESSFUL in 3m 59s > {noformat} > If you look at this LZ4 assertReset method, you can see its indeed way too > expensive, checking 64K items every time. 
> To dig deeper into potential problems you can pass additional parameters (all > of them used here for demonstration): > {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true > -Dtests.profile.count=8 -Dtests.profile.stacksize=20 > -Dtests.profile.linenumbers=true}} > This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ... > {noformat} > ... > PROFILE SUMMARY from 21355 samples > tests.profile.count=8 > tests.profile.stacksize=20 > tests.profile.linenumbers=true > PERCENT SAMPLES STACK > 26.30% 5617sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135 > at > org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted > code) > at > org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806 > at > org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938 > at java.lang.Thread#run():830 > 16.19% 3458sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184 > at > 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135 > at > org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted > code) > at >
[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests
[ https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025263#comment-17025263 ] ASF subversion and git services commented on LUCENE-9185: - Commit e504798a44e5f1577d87ef3a43d9d1e3a859d68a in lucene-solr's branch refs/heads/master from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e504798 ] LUCENE-9185: add "tests.profile" to gradle build to aid fixing slow tests Run test(s) with -Ptests.profile=true to print a histogram at the end of the build. > add "tests.profile" to gradle build to aid fixing slow tests > > > Key: LUCENE-9185 > URL: https://issues.apache.org/jira/browse/LUCENE-9185 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9185.patch, LUCENE-9185.patch, LUCENE-9185.patch > > > It is kind of a hassle to profile slow tests to fix the bottlenecks > The idea here is to make it dead easy to profile (just) the tests, capturing > samples at a very low granularity, reducing noise as much as possible (e.g. > not profiling entire gradle build or anything) and print a simple report for > quick iterating. > Here's a prototype of what I hacked together: > All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}} > {noformat} > ... 
> PROFILE SUMMARY from 122464 samples > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT SAMPLES STACK > 2.59% 3170 > org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset() > 2.26% 2762java.util.Arrays#fill() > 1.59% 1953com.carrotsearch.randomizedtesting.RandomizedContext#context() > 1.24% 1523java.util.Random#nextInt() > 1.19% 1456java.lang.StringUTF16#compress() > 1.08% 1319java.lang.StringLatin1#inflate() > 1.00% 1228java.lang.Integer#getChars() > 0.99% 1214java.util.Arrays#compareUnsigned() > 0.96% 1179java.util.zip.Inflater#inflateBytesBytes() > 0.91% 1114java.util.concurrent.atomic.AtomicLong#compareAndSet() > BUILD SUCCESSFUL in 3m 59s > {noformat} > If you look at this LZ4 assertReset method, you can see its indeed way too > expensive, checking 64K items every time. > To dig deeper into potential problems you can pass additional parameters (all > of them used here for demonstration): > {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true > -Dtests.profile.count=8 -Dtests.profile.stacksize=20 > -Dtests.profile.linenumbers=true}} > This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ... > {noformat} > ... 
> PROFILE SUMMARY from 21355 samples > tests.profile.count=8 > tests.profile.stacksize=20 > tests.profile.linenumbers=true > PERCENT SAMPLES STACK > 26.30% 5617sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135 > at > org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted > code) > at > org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806 > at > org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938 > at java.lang.Thread#run():830 > 16.19% 3458sun.nio.ch.EPoll#wait():(Native code) > at sun.nio.ch.EPollSelectorImpl#doSelect():120 > at sun.nio.ch.SelectorImpl#lockAndDoSelect():124 > at sun.nio.ch.SelectorImpl#select():141 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472 > at > org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360 > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184 >
[jira] [Commented] (LUCENE-9189) TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
[ https://issues.apache.org/jira/browse/LUCENE-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025270#comment-17025270 ] Robert Muir commented on LUCENE-9189: - I'm guessing there is something such as a copyBytes that goes one byte at a time or similar stuff causing it to be truly pathological. > TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes > --- > > Key: LUCENE-9189 > URL: https://issues.apache.org/jira/browse/LUCENE-9189 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > > I thought it was just the testUpdatesOnDiskFull, but looks like this one > needs to be nightly too. > Should look more into the test, but I know something causes it to make such > an insane amount of files, that sorting them becomes a bottleneck. > I guess also related is that it would be great if MockDirectoryWrapper's disk > full check didn't trigger a sort of the files (via listAll): it does this > check on like every i/o, would be nice for it to be less absurd. Maybe > instead the test could check for disk full on not every i/o but some random > sample of them? > Temporarily lets make it nightly... 
> {noformat} > PROFILE SUMMARY from 182501 samples > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT SAMPLES STACK > 15.89%28995 java.lang.StringLatin1#compareTo() > 6.61% 12069 java.util.TimSort#mergeHi() > 5.96% 10878 java.util.TimSort#binarySort() > 3.41% 6231java.util.concurrent.ConcurrentHashMap#tabAt() > 2.98% 5433java.util.Comparators$NaturalOrderComparator#compare() > 2.12% 3876org.apache.lucene.store.DataOutput#copyBytes() > 2.03% 3712java.lang.String#compareTo() > 1.84% 3350java.util.concurrent.ConcurrentHashMap#get() > 1.83% 3337java.util.TimSort#mergeLo() > 1.67% 3047java.util.ArrayList#add() > {noformat} > All the file sorting is called from stacks like this, so its literally > happening every writeByte() and so on > {noformat} > 0.73% 1329java.util.TimSort#binarySort() > at java.util.TimSort#sort() > at java.util.Arrays#sort() > at java.util.ArrayList#sort() > at java.util.stream.SortedOps$RefSortingSink#end() > at java.util.stream.AbstractPipeline#copyInto() > at java.util.stream.AbstractPipeline#wrapAndCopyInto() > at java.util.stream.AbstractPipeline#evaluate() > at > java.util.stream.AbstractPipeline#evaluateToArrayNode() > at java.util.stream.ReferencePipeline#toArray() > at > org.apache.lucene.store.ByteBuffersDirectory#listAll() > at > org.apache.lucene.store.MockDirectoryWrapper#sizeInBytes() > at > org.apache.lucene.store.MockIndexOutputWrapper#checkDiskFull() > at > org.apache.lucene.store.MockIndexOutputWrapper#writeBytes() > at > org.apache.lucene.store.MockIndexOutputWrapper#writeByte() > at org.apache.lucene.store.DataOutput#writeInt() > at org.apache.lucene.codecs.CodecUtil#writeFooter() > at > org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat#writeLiveDocs() > at > org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat#writeLiveDocs() > at > org.apache.lucene.index.PendingDeletes#writeLiveDocs() > {noformat}
[jira] [Created] (LUCENE-9190) add dedicated test to assert internals of LZ4 hashtable
Robert Muir created LUCENE-9190: --- Summary: add dedicated test to assert internals of LZ4 hashtable Key: LUCENE-9190 URL: https://issues.apache.org/jira/browse/LUCENE-9190 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir This assert was called all the time by all tests, causing a bottleneck. I disabled it in LUCENE-9187, but it would be nice to add a subclass or package-private method or something to still test it (without taking up tons of cpu).
[jira] [Commented] (LUCENE-9187) remove too-expensive assert from LZ4 HighCompressionHashTable
[ https://issues.apache.org/jira/browse/LUCENE-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025276#comment-17025276 ] ASF subversion and git services commented on LUCENE-9187: - Commit 4350efa932a4c6aaad1943857c935bafce98fe56 in lucene-solr's branch refs/heads/master from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4350efa ] LUCENE-9187: remove too-expensive assert from LZ4 HighCompressionHashTable > remove too-expensive assert from LZ4 HighCompressionHashTable > - > > Key: LUCENE-9187 > URL: https://issues.apache.org/jira/browse/LUCENE-9187 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9187.patch > > > This is the slowest method in the lucene tests. See LUCENE-9185 for what I > mean. > If you look at it, its checking 64k values every time the assert is called.
[jira] [Resolved] (LUCENE-9187) remove too-expensive assert from LZ4 HighCompressionHashTable
[ https://issues.apache.org/jira/browse/LUCENE-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9187. - Fix Version/s: master (9.0) Resolution: Fixed I opened LUCENE-9190 as a followup for the dedicated test idea so we don't lose it. > remove too-expensive assert from LZ4 HighCompressionHashTable > - > > Key: LUCENE-9187 > URL: https://issues.apache.org/jira/browse/LUCENE-9187 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Fix For: master (9.0) > > Attachments: LUCENE-9187.patch > > > This is the slowest method in the lucene tests. See LUCENE-9185 for what I > mean. > If you look at it, its checking 64k values every time the assert is called.
[jira] [Created] (LUCENE-9191) Fix linefiledocs compression or replace in tests
Robert Muir created LUCENE-9191: --- Summary: Fix linefiledocs compression or replace in tests Key: LUCENE-9191 URL: https://issues.apache.org/jira/browse/LUCENE-9191 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir LineFileDocs(random) is very slow, even to open. It does a very slow "random skip" through a gzip compressed file. For the analyzers tests, in LUCENE-9186 I simply removed its usage, since TestUtil.randomAnalysisString is superior, and fast. But we should address other tests using it, since LineFileDocs(random) is slow! I think it is also the case that every lucene test has probably tested every LineFileDocs line many times now, whereas randomAnalysisString will invent new ones. Alternatively, we could "fix" LineFileDocs(random), e.g. special compression options (in blocks)... deflate supports such stuff. But it would make it even hairier than it is now.
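The "special compression options (in blocks)" idea means compressing fixed-size chunks independently and keeping an index, so a random seek inflates one block instead of scanning through the whole gzip stream. A rough java.util.zip sketch of the principle (illustration only, not a LineFileDocs patch; a real version would store offsets into one file rather than a list of byte arrays):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class BlockCompressed {
    private final List<byte[]> blocks = new ArrayList<>(); // one deflate stream per block

    // Compress each chunk independently so any block can be decoded on its own.
    public void addBlock(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(bytes, new Deflater())) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        blocks.add(bytes.toByteArray());
    }

    // Random access: inflate only the requested block, not the preceding ones.
    public String readBlock(int index) {
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(blocks.get(index)))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off, as the issue says, is extra machinery: per-block compression gives up cross-block redundancy and needs an offset index, which is what would make LineFileDocs "even hairier".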
[GitHub] [lucene-solr] balaji-s opened a new pull request #1221: SOLR-14193 Update tutorial.adoc(line no:664) so that command executes…
balaji-s opened a new pull request #1221: SOLR-14193 Update tutorial.adoc(line no:664) so that command executes… URL: https://github.com/apache/lucene-solr/pull/1221 … in windows enviroment # Description Please provide a short description of the changes you're making with this pull request. # Solution Please provide a short description of the approach taken to implement your solution. # Tests Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem. # Checklist Please review the following and check all that apply: - [ ] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [ ] I have created a Jira issue and added the issue ID to my pull request title. - [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [ ] I have developed this patch against the `master` branch. - [ ] I have run `ant precommit` and the appropriate test suite. - [ ] I have added tests for my changes. - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
[jira] [Commented] (SOLR-13289) Support for BlockMax WAND
[ https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025286#comment-17025286 ] Gregg Donovan commented on SOLR-13289: -- {quote}This feature currently doesn't work in case of faceting(this is expected), grouping.{quote} Will WAND cause faceting to break entirely? Or will the counts for facets just be inexact? {quote}as same minExactHits is shared across shard. so, actual minExactHits is shardCount*minExactHits{quote} Perhaps it would be worth having an additional parameter for a perShardExactHits? E.g. if we're requesting the top 1000 hits across 64 shards, we'd likely be fine with WAND getting the top, say, 150 per shard. > Support for BlockMax WAND > - > > Key: SOLR-13289 > URL: https://issues.apache.org/jira/browse/SOLR-13289 > Project: Solr > Issue Type: New Feature >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Attachments: SOLR-13289.patch, SOLR-13289.patch > > > LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to > expose this via Solr. When enabled, the numFound returned will not be exact. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
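The shard arithmetic in the comment above can be sketched as follows (parameter names are illustrative, not actual Solr parameters): because minExactHits is forwarded unchanged to every shard, the exact-scoring work scales with the shard count, while a hypothetical per-shard limit would bound that work.

```java
public class WandShardMath {
    /** Hits scored exactly when each shard receives the global minExactHits. */
    public static long currentExactWork(int minExactHits, int shardCount) {
        return (long) minExactHits * shardCount;
    }

    /** Hits scored exactly under a hypothetical perShardExactHits parameter. */
    public static long perShardExactWork(int perShardExactHits, int shardCount) {
        return (long) perShardExactHits * shardCount;
    }

    public static void main(String[] args) {
        // Requesting the top 1000 across 64 shards:
        System.out.println(currentExactWork(1000, 64));  // exact scoring of 64x the requested hits
        System.out.println(perShardExactWork(150, 64));  // far less work with a per-shard cap of 150
    }
}
```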
[jira] [Commented] (LUCENE-9186) remove linefiledocs usage from basetokenstreamtestcase
[ https://issues.apache.org/jira/browse/LUCENE-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025289#comment-17025289 ] ASF subversion and git services commented on LUCENE-9186: - Commit 3bcc97c8eb70f4a3a309d4cdab290363b524b0a2 in lucene-solr's branch refs/heads/master from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3bcc97c ] LUCENE-9186: remove linefiledocs usage from BaseTokenStreamTestCase > remove linefiledocs usage from basetokenstreamtestcase > -- > > Key: LUCENE-9186 > URL: https://issues.apache.org/jira/browse/LUCENE-9186 > Project: Lucene - Core > Issue Type: Task > Components: general/test >Reporter: Robert Muir >Priority: Major > Attachments: LUCENE-9186.patch > > > LineFileDocs is slow, even to open. That's because it (very slowly) "skips" > to a pseudorandom position into a 5MB gzip stream when you open it. > There was a time when we didn't have a nice string generator for tests > (TestUtil.randomAnalysisString), but now we do. And when it was introduced it > found interesting new things that linefiledocs never found. > This speeds up all the analyzer tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-9186) remove linefiledocs usage from basetokenstreamtestcase
[ https://issues.apache.org/jira/browse/LUCENE-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9186. - Fix Version/s: master (9.0) Resolution: Fixed I opened LUCENE-9191 as a followup for other tests using LineFileDocs in a similar way. But fixing the analyzers tests was an easy win.
[jira] [Updated] (SOLR-14193) Update tutorial.adoc(line no:664) so that command executes in windows enviroment
[ https://issues.apache.org/jira/browse/SOLR-14193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] balaji sundaram updated SOLR-14193: --- Attachment: solr-tutorial.adoc Status: Open (was: Open) > Update tutorial.adoc(line no:664) so that command executes in windows > enviroment > > > Key: SOLR-14193 > URL: https://issues.apache.org/jira/browse/SOLR-14193 > Project: Solr > Issue Type: Bug > Components: documentation >Affects Versions: 8.4 >Reporter: balaji sundaram >Priority: Minor > Attachments: solr-tutorial.adoc > > Time Spent: 10m > Remaining Estimate: 0h > > > {{When executing the following command in windows 10 "java -jar -Dc=films -Dparams=f.genre.split=true&f.directed_by.split=true&f.genre.separator=|&f.directed_by.separator=| -Dauto example\exampledocs\post.jar example\films\*.csv", it throws error "& was unexpected at this time."}} > Fix: the command should escape the "&" and "|" symbols.
[GitHub] [lucene-solr] balaji-s commented on issue #1221: SOLR-14193 Update tutorial.adoc(line no:664) so that command executes…
balaji-s commented on issue #1221: SOLR-14193 Update tutorial.adoc(line no:664) so that command executes… URL: https://github.com/apache/lucene-solr/pull/1221#issuecomment-579353263 Updated line no:664 in solr-tutorial.adoc. Added escape characters for ^ and | symbols in windows environment. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13817) Deprecate and remove legacy SolrCache implementations
[ https://issues.apache.org/jira/browse/SOLR-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025295#comment-17025295 ] Andy Webb commented on SOLR-13817: -- Could I put in a request that we get to use the final version of CaffeineCache in 8.5.0+ before the legacy cache implementations are removed in 9.0.0 please? Currently https://github.com/apache/lucene-solr/commit/b4fe911cc8e4bddff18226bc8c98a2deb735a8fc#diff-fc056ba10fcf92dc69fe32991cdad5f0 (in master) both updates CaffeineCache.java and removes FastLRUCache etc. thanks, Andy > Deprecate and remove legacy SolrCache implementations > - > > Key: SOLR-13817 > URL: https://issues.apache.org/jira/browse/SOLR-13817 > Project: Solr > Issue Type: Improvement >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Fix For: master (9.0) > > Attachments: SOLR-13817-8x.patch, SOLR-13817-master.patch > > > Now that SOLR-8241 has been committed I propose to deprecate other cache > implementations in 8x and remove them altogether from 9.0, in order to reduce > confusion and maintenance costs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9189) TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
[ https://issues.apache.org/jira/browse/LUCENE-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025297#comment-17025297 ] Robert Muir commented on LUCENE-9189: - There are definitely test bugs here too. MockDirectoryWrapper shouldn't even be checking disk full here, it wasn't told to do so! So its copyBytes is bad, as it unconditionally does the expensive disk full check on every invocation (even if setTrackDiskUsage was never called, such as this test). So we definitely need to fix it to only check for disk full if the test asked for it, and then fix tests that want to test disk full to .setTrackDiskUsage(true). I'm looking in. > TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes > --- > > Key: LUCENE-9189 > URL: https://issues.apache.org/jira/browse/LUCENE-9189 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > > I thought it was just the testUpdatesOnDiskFull, but looks like this one > needs to be nightly too. > Should look more into the test, but I know something causes it to make such > an insane amount of files, that sorting them becomes a bottleneck. > I guess also related is that it would be great if MockDirectoryWrapper's disk > full check didn't trigger a sort of the files (via listAll): it does this > check on like every i/o, would be nice for it to be less absurd. Maybe > instead the test could check for disk full on not every i/o but some random > sample of them? > Temporarily lets make it nightly... 
> {noformat}
> PROFILE SUMMARY from 182501 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT   SAMPLES   STACK
> 15.89%    28995     java.lang.StringLatin1#compareTo()
> 6.61%     12069     java.util.TimSort#mergeHi()
> 5.96%     10878     java.util.TimSort#binarySort()
> 3.41%     6231      java.util.concurrent.ConcurrentHashMap#tabAt()
> 2.98%     5433      java.util.Comparators$NaturalOrderComparator#compare()
> 2.12%     3876      org.apache.lucene.store.DataOutput#copyBytes()
> 2.03%     3712      java.lang.String#compareTo()
> 1.84%     3350      java.util.concurrent.ConcurrentHashMap#get()
> 1.83%     3337      java.util.TimSort#mergeLo()
> 1.67%     3047      java.util.ArrayList#add()
> {noformat}
> All the file sorting is called from stacks like this, so it's literally happening on every writeByte() and so on
> {noformat}
> 0.73%     1329      java.util.TimSort#binarySort()
>             at java.util.TimSort#sort()
>             at java.util.Arrays#sort()
>             at java.util.ArrayList#sort()
>             at java.util.stream.SortedOps$RefSortingSink#end()
>             at java.util.stream.AbstractPipeline#copyInto()
>             at java.util.stream.AbstractPipeline#wrapAndCopyInto()
>             at java.util.stream.AbstractPipeline#evaluate()
>             at java.util.stream.AbstractPipeline#evaluateToArrayNode()
>             at java.util.stream.ReferencePipeline#toArray()
>             at org.apache.lucene.store.ByteBuffersDirectory#listAll()
>             at org.apache.lucene.store.MockDirectoryWrapper#sizeInBytes()
>             at org.apache.lucene.store.MockIndexOutputWrapper#checkDiskFull()
>             at org.apache.lucene.store.MockIndexOutputWrapper#writeBytes()
>             at org.apache.lucene.store.MockIndexOutputWrapper#writeByte()
>             at org.apache.lucene.store.DataOutput#writeInt()
>             at org.apache.lucene.codecs.CodecUtil#writeFooter()
>             at org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat#writeLiveDocs()
>             at org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat#writeLiveDocs()
>             at org.apache.lucene.index.PendingDeletes#writeLiveDocs()
> {noformat}
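A minimal sketch of the fix direction discussed in the comment above (purely illustrative; this is not the actual MockDirectoryWrapper code, and the class and constant names are made up): skip the disk-full computation entirely unless the test opted in, and only pay for the expensive check periodically rather than on every single write.

```java
import java.util.Random;

// Hypothetical helper deciding when an expensive disk-full check should run.
public class SampledDiskFullCheck {
    private final boolean trackDiskUsage;   // did the test call setTrackDiskUsage(true)?
    private final long maxSizeInBytes;      // 0 means "no disk-full simulation"
    private final Random random;
    private long bytesSinceLastCheck;
    private static final long CHECK_INTERVAL = 8 * 1024; // full check at most every 8 KB written

    public SampledDiskFullCheck(boolean trackDiskUsage, long maxSizeInBytes, long seed) {
        this.trackDiskUsage = trackDiskUsage;
        this.maxSizeInBytes = maxSizeInBytes;
        this.random = new Random(seed);
    }

    /** Called from writeByte/writeBytes with the number of bytes about to be written. */
    public boolean shouldCheck(int len) {
        if (!trackDiskUsage || maxSizeInBytes <= 0) {
            return false;                        // test never asked for disk-full simulation
        }
        bytesSinceLastCheck += len;
        if (bytesSinceLastCheck < CHECK_INTERVAL) {
            return random.nextInt(1024) == 0;    // rare random sample between intervals
        }
        bytesSinceLastCheck = 0;
        return true;                             // periodic full check (listAll + sum of fileLength)
    }
}
```

The random sample keeps some coverage of "disk fills mid-write" cases while avoiding the pathological sort-every-byte behavior shown in the profile above.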
[jira] [Commented] (LUCENE-9189) TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
[ https://issues.apache.org/jira/browse/LUCENE-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025303#comment-17025303 ] Robert Muir commented on LUCENE-9189: - OK, I see the issue. It also "tracks" the disk usage if you setMaxSizeInBytes (by "track" we mean: it recomputes the total by calling listAll and summing fileLength for every file... on every writeByte etc.), so it only impacts these disk-full tests. The tracking should get more efficient, but the scope is limited and I don't want to wrestle with this logic right now. Going with Nightly until we fix the efficiency of this thing.
[jira] [Commented] (LUCENE-9189) TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
[ https://issues.apache.org/jira/browse/LUCENE-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025307#comment-17025307 ] ASF subversion and git services commented on LUCENE-9189: - Commit 4773574578f089802fe3f36bff6951c4a29a3628 in lucene-solr's branch refs/heads/master from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4773574 ] LUCENE-9189: TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes The issue is that MockDirectoryWrapper's disk full check is horribly inefficient. On every writeByte/etc, it totally recomputes disk space across all files. This means it calls listAll() on the underlying Directory (which sorts all the underlying files), then sums up fileLength() for each of those files. This leads to many pathological cases in the disk full tests... but the number of tests impacted by this is minimal, and the logic is scary.
[jira] [Resolved] (LUCENE-9189) TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
[ https://issues.apache.org/jira/browse/LUCENE-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9189. - Fix Version/s: master (9.0) Resolution: Fixed As mentioned above, I marked nightly for now. I need to go to the beer store if I'm gonna touch MockDirectoryWrapper...
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025313#comment-17025313 ] Michael Froh commented on LUCENE-8962: -- Thanks [~msoko...@gmail.com] for the feedback on the PR! I've updated it to incorporate your suggestions. > Can we merge small segments during refresh, for faster searching? > - > > Key: LUCENE-8962 > URL: https://issues.apache.org/jira/browse/LUCENE-8962 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: Michael McCandless >Priority: Major > Attachments: LUCENE-8962_demo.png > > Time Spent: 1h 40m > Remaining Estimate: 0h > > With near-real-time search we ask {{IndexWriter}} to write all in-memory > segments to disk and open an {{IndexReader}} to search them, and this is > typically a quick operation. > However, when you use many threads for concurrent indexing, {{IndexWriter}} > will accumulate write many small segments during {{refresh}} and this then > adds search-time cost as searching must visit all of these tiny segments. > The merge policy would normally quickly coalesce these small segments if > given a little time ... so, could we somehow improve {{IndexWriter'}}s > refresh to optionally kick off merge policy to merge segments below some > threshold before opening the near-real-time reader? It'd be a bit tricky > because while we are waiting for merges, indexing may continue, and new > segments may be flushed, but those new segments shouldn't be included in the > point-in-time segments returned by refresh ... > One could almost do this on top of Lucene today, with a custom merge policy, > and some hackity logic to have the merge policy target small segments just > written by refresh, but it's tricky to then open a near-real-time reader, > excluding newly flushed but including newly merged segments since the refresh > originally finished ... 
> I'm not yet sure how best to solve this, so I wanted to open an issue for > discussion! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on a change in pull request #1155: LUCENE-8962: Add ability to selectively merge on commit
msokolov commented on a change in pull request #1155: LUCENE-8962: Add ability to selectively merge on commit URL: https://github.com/apache/lucene-solr/pull/1155#discussion_r371953236 ## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java ## @@ -3223,15 +3259,44 @@ private long prepareCommitInternal() throws IOException { // sneak into the commit point: toCommit = segmentInfos.clone(); + if (anyChanges) { +mergeAwaitLatchRef = new AtomicReference<>(); +MergePolicy mergeOnCommitPolicy = waitForMergeOnCommitPolicy(config.getMergePolicy(), toCommit, mergeAwaitLatchRef); + +// Find any merges that can execute on commit (per MergePolicy). +commitMerges = mergeOnCommitPolicy.findCommitMerges(segmentInfos, this); +if (commitMerges != null && commitMerges.merges.size() > 0) { + int mergeCount = 0; + for (MergePolicy.OneMerge oneMerge : commitMerges.merges) { +if (registerMerge(oneMerge)) { + mergeCount++; +} else { + throw new IllegalStateException("MergePolicy " + config.getMergePolicy().getClass() + Review comment: I see, thanks for explaining! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4702) Terms dictionary compression
[ https://issues.apache.org/jira/browse/LUCENE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025321#comment-17025321 ] ASF subversion and git services commented on LUCENE-4702: - Commit 6eb8834a57fa176c6c2e995480b69ecea1b6bd07 in lucene-solr's branch refs/heads/master from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6eb8834 ] LUCENE-4702: Reduce terms dictionary compression overhead. (#1216) Changes include: - Removed LZ4 compression of suffix lengths which didn't save much space anyway. - For stats, LZ4 was only really used for run-length compression of terms whose docFreq is 1. This has been replaced by explicit run-length compression. - Since we only use LZ4 for suffix bytes if the compression ration is < 75%, we now only try LZ4 out if the average suffix length is greater than 6, in order to reduce index-time overhead. > Terms dictionary compression > > > Key: LUCENE-4702 > URL: https://issues.apache.org/jira/browse/LUCENE-4702 > Project: Lucene - Core > Issue Type: Wish >Reporter: Adrien Grand >Assignee: Adrien Grand >Priority: Trivial > Attachments: LUCENE-4702.patch, LUCENE-4702.patch > > Time Spent: 3h 50m > Remaining Estimate: 0h > > I've done a quick test with the block tree terms dictionary by replacing a > call to IndexOutput.writeBytes to write suffix bytes with a call to > LZ4.compressHC to test the peformance hit. Interestingly, search performance > was very good (see comparison table below) and the tim files were 14% smaller > (from 150432 bytes overall to 129516). 
> {noformat}
> Task                QPS baseline   StdDev    QPS compressed   StdDev    Pct diff
> Fuzzy1              111.50         (2.0%)    78.78            (1.5%)    -29.4% ( -32% - -26%)
> Fuzzy2              36.99          (2.7%)    28.59            (1.5%)    -22.7% ( -26% - -18%)
> Respell             122.86         (2.1%)    103.89           (1.7%)    -15.4% ( -18% - -11%)
> Wildcard            100.58         (4.3%)    94.42            (3.2%)    -6.1%  ( -13% -   1%)
> Prefix3             124.90         (5.7%)    122.67           (4.7%)    -1.8%  ( -11% -   9%)
> OrHighLow           169.87         (6.8%)    167.77           (8.0%)    -1.2%  ( -15% -  14%)
> LowTerm             1949.85        (4.5%)    1929.02          (3.4%)    -1.1%  (  -8% -   7%)
> AndHighLow          2011.95        (3.5%)    1991.85          (3.3%)    -1.0%  (  -7% -   5%)
> OrHighHigh          155.63         (6.7%)    154.12           (7.9%)    -1.0%  ( -14% -  14%)
> AndHighHigh         341.82         (1.2%)    339.49           (1.7%)    -0.7%  (  -3% -   2%)
> OrHighMed           217.55         (6.3%)    216.16           (7.1%)    -0.6%  ( -13% -  13%)
> IntNRQ              53.10          (10.9%)   52.90            (8.6%)    -0.4%  ( -17% -  21%)
> MedTerm             998.11         (3.8%)    994.82           (5.6%)    -0.3%  (  -9% -   9%)
> MedSpanNear         60.50          (3.7%)    60.36            (4.8%)    -0.2%  (  -8% -   8%)
> HighSpanNear        19.74          (4.5%)    19.72            (5.1%)    -0.1%  (  -9% -   9%)
> LowSpanNear         101.93         (3.2%)    101.82           (4.4%)    -0.1%  (  -7% -   7%)
> AndHighMed          366.18         (1.7%)    366.93           (1.7%)     0.2%  (  -3% -   3%)
> PKLookup            237.28         (4.0%)    237.96           (4.2%)     0.3%  (  -7% -   8%)
> MedPhrase           173.17         (4.7%)    174.69           (4.7%)     0.9%  (  -8% -  10%)
> LowSloppyPhrase     180.91         (2.6%)    182.79           (2.7%)     1.0%  (  -4% -   6%)
> LowPhrase           374.64         (5.5%)    379.11           (5.8%)     1.2%  (  -9% -  13%)
> HighTerm            253.14         (7.9%)    256.97           (11.4%)    1.5%  ( -16% -  22%)
> HighPhrase          19.52          (10.6%)   19.83            (11.0%)    1.6%  ( -18% -  25%)
> MedSloppyPhrase     141.90         (2.6%)    144.11           (2.5%)     1.6%  (  -3% -   6%)
> HighSloppyPhrase    25.26          (4.8%)    25.97            (5.0%)     2.8%  (  -6% -  13%)
> {noformat}
> Only queries which are very terms-dictionary-intensive got a performance hit (Fuzzy, Fuzzy2, Respell, Wildcard), other queries including Prefix3 behaved (surprisingly) well.
> Do you think of it as something worth exploring? 
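The index-time heuristic described in the commit message above can be sketched as follows (purely illustrative; this is not the actual Lucene block-tree code, and the names are made up): LZ4 is only attempted when the average suffix length in a block exceeds 6 bytes, and the compressed form is only kept when it beats a 75% compression ratio.

```java
import java.util.function.ToIntFunction;

public class SuffixCompressionHeuristic {
    public enum Choice { RAW, LZ4 }

    /**
     * Decide whether to store a block's suffix bytes raw or LZ4-compressed.
     * lz4Len stands in for "compress and return the compressed length".
     */
    public static Choice choose(int totalSuffixBytes, int numTerms,
                                ToIntFunction<byte[]> lz4Len, byte[] suffixes) {
        if (numTerms == 0 || totalSuffixBytes / (double) numTerms <= 6) {
            return Choice.RAW;   // short suffixes: compression rarely pays off, skip the attempt
        }
        int compressedLen = lz4Len.applyAsInt(suffixes);
        // Keep LZ4 only if the compression ratio is below 75%.
        return compressedLen < 0.75 * suffixes.length ? Choice.LZ4 : Choice.RAW;
    }
}
```

Gating the compression attempt on the average suffix length is what reduces the index-time overhead: blocks that would almost certainly fail the ratio test never pay for a compression pass at all.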
[GitHub] [lucene-solr] jpountz merged pull request #1216: LUCENE-4702: Reduce terms dictionary compression overhead.
jpountz merged pull request #1216: LUCENE-4702: Reduce terms dictionary compression overhead. URL: https://github.com/apache/lucene-solr/pull/1216 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz merged pull request #1197: LUCENE-9161: DirectMonotonicWriter checks for overflows.
jpountz merged pull request #1197: LUCENE-9161: DirectMonotonicWriter checks for overflows. URL: https://github.com/apache/lucene-solr/pull/1197 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9161) DirectMonotonicWriter should check for overflows
[ https://issues.apache.org/jira/browse/LUCENE-9161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025344#comment-17025344 ] ASF subversion and git services commented on LUCENE-9161: - Commit 92b684c647876c886ba71dab51edf6f1f3c59d82 in lucene-solr's branch refs/heads/master from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=92b684c ] LUCENE-9161: DirectMonotonicWriter checks for overflows. (#1197) > DirectMonotonicWriter should check for overflows > > > Key: LUCENE-9161 > URL: https://issues.apache.org/jira/browse/LUCENE-9161 > Project: Lucene - Core > Issue Type: Task >Reporter: Adrien Grand >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > DirectMonotonicWriter doesn't verify that the provided blockShift is > compatible with the number of written values. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14226) SolrStream reports AuthN/AuthZ failures (401|403) as IOException w/o details
Chris M. Hostetter created SOLR-14226: - Summary: SolrStream reports AuthN/AuthZ failures (401|403) as IOException w/o details Key: SOLR-14226 URL: https://issues.apache.org/jira/browse/SOLR-14226 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: SolrJ, streaming expressions Reporter: Chris M. Hostetter If you try to use the SolrJ {{SolrStream}} class to make a streaming expression request to a Solr node, any authentication or authorization failures will be swallowed and a generic "IOException" will be thrown. (evidently due to a parse error trying to read the body of the response w/o consulting the HTTP status?)
[jira] [Commented] (SOLR-14226) SolrStream reports AuthN/AuthZ failures (401|403) as IOException w/o details
[ https://issues.apache.org/jira/browse/SOLR-14226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025365#comment-17025365 ] Chris M. Hostetter commented on SOLR-14226: --- From a test I'm in the process of trying to write...
{code:java}
public void testEchoStreamFail() throws Exception {
  final SolrStream solrStream = new SolrStream(solrUrl,
      params("qt", "/stream", "expr", "echo(hello world)"));
  solrStream.setCredentials("bogus_user", "bogus_pass");
  SolrException e = expectThrows(SolrException.class, () -> {
    final List ignored = getTuples(solrStream);
  });
  assertEquals(401, e.code());
}
{code}
{noformat}
[junit4]> Throwable #1: junit.framework.AssertionFailedError: Unexpected exception type, expected SolrException but got java.io.IOException: --> http://127.0.0.1:35337/solr/collection_x: An exception has occurred on the server, refer to server log for details.
[junit4]>   at __randomizedtesting.SeedInfo.seed([F7287DED4A9F66CA:B866576F16986894]:0)
[junit4]>   at org.apache.lucene.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2752)
[junit4]>   at org.apache.lucene.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2740)
[junit4]>   at org.apache.solr.client.solrj.io.stream.CloudAuthStreamTest.testEchoStreamFail(CloudAuthStreamTest.java:208)
[junit4]>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4]>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit4]>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4]>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[junit4]>   at java.base/java.lang.Thread.run(Thread.java:834)
[junit4]> Caused by: java.io.IOException: --> http://127.0.0.1:35337/solr/collection_x: An exception has occurred on the server, refer to server log for details.
[junit4]>   at org.apache.solr.client.solrj.io.stream.SolrStream.read(SolrStream.java:232)
[junit4]>   at org.apache.solr.client.solrj.io.stream.CloudAuthStreamTest.getTuples(CloudAuthStreamTest.java:221)
[junit4]>   at org.apache.solr.client.solrj.io.stream.CloudAuthStreamTest.lambda$testEchoStreamFail$3(CloudAuthStreamTest.java:209)
[junit4]>   at org.apache.lucene.util.LuceneTestCase._expectThrows(LuceneTestCase.java:2870)
[junit4]>   at org.apache.lucene.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2745)
[junit4]>   ... 41 more
[junit4]> Caused by: org.noggit.JSONParser$ParseException: JSON Parse Error: char=<,position=0 AFTER='<' BEFORE='html>
{noformat}
> SolrStream reports AuthN/AuthZ failures (401|403) as IOException w/o details
>
> Key: SOLR-14226
> URL: https://issues.apache.org/jira/browse/SOLR-14226
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrJ, streaming expressions
> Reporter: Chris M. Hostetter
> Priority: Major
>
> If you try to use the SolrJ {{SolrStream}} class to make a streaming
> expression request to a Solr node, any authentication or authorization
> failures will be swallowed and a generic "IOException" will be thrown.
> (evidently due to a parse error trying to read the body of the response w/o
> consulting the HTTP status?)
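The stack trace shows the JSON parser choking on an HTML error page, which suggests a fix along these lines: consult the HTTP status before handing the body to the parser. This is a hedged standalone sketch, not SolrStream's actual code; `StatusAwareRead` and `RemoteException` are invented names for illustration.

```java
// Hedged sketch: surface 401/403 as a typed exception instead of letting an
// HTML error body reach the JSON parser. Not SolrJ code; names are illustrative.
public class StatusAwareRead {
    static class RemoteException extends RuntimeException {
        final int code;
        RemoteException(int code, String msg) { super(msg); this.code = code; }
    }

    /** Returns the body for JSON parsing only when the status indicates success. */
    static String bodyForParsing(int httpStatus, String body, String url) {
        if (httpStatus == 401 || httpStatus == 403) {
            throw new RemoteException(httpStatus,
                "Auth failure (HTTP " + httpStatus + ") from " + url);
        }
        if (httpStatus < 200 || httpStatus >= 300) {
            throw new RemoteException(httpStatus, "HTTP " + httpStatus + " from " + url);
        }
        return body;  // non-error response: safe to hand to the JSON parser
    }
}
```

With a check like this, the test above would see an exception carrying the 401 code rather than a generic IOException wrapping a parse failure.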
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025367#comment-17025367 ] David Smiley commented on LUCENE-8962: -- [~msfroh] as you can see above, I accomplished the effect here already in a different way without modifying Lucene. That's not to say we shouldn't modify Lucene at all, but I think the changes can be limited to _implementations of_ MergePolicy & MergeScheduler without needing to modify the abstractions themselves or core Lucene, which are already sufficient. See LUCENE-8331 for a benchmark utility. I should resume this work. > Can we merge small segments during refresh, for faster searching? > - > > Key: LUCENE-8962 > URL: https://issues.apache.org/jira/browse/LUCENE-8962 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index > Reporter: Michael McCandless > Priority: Major > Attachments: LUCENE-8962_demo.png > > Time Spent: 1h 50m > Remaining Estimate: 0h > > With near-real-time search we ask {{IndexWriter}} to write all in-memory > segments to disk and open an {{IndexReader}} to search them, and this is > typically a quick operation. > However, when you use many threads for concurrent indexing, {{IndexWriter}} > will accumulate and write many small segments during {{refresh}} and this then > adds search-time cost as searching must visit all of these tiny segments. > The merge policy would normally quickly coalesce these small segments if > given a little time ... so, could we somehow improve {{IndexWriter}}'s > refresh to optionally kick off the merge policy to merge segments below some > threshold before opening the near-real-time reader? It'd be a bit tricky > because while we are waiting for merges, indexing may continue, and new > segments may be flushed, but those new segments shouldn't be included in the > point-in-time segments returned by refresh ... 
> One could almost do this on top of Lucene today, with a custom merge policy, > and some hackity logic to have the merge policy target small segments just > written by refresh, but it's tricky to then open a near-real-time reader, > excluding newly flushed but including newly merged segments since the refresh > originally finished ... > I'm not yet sure how best to solve this, so I wanted to open an issue for > discussion!
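The core of the proposal, coalescing segments below a size threshold before opening the reader, reduces to a selection step that can be sketched without any Lucene dependency. Hedged illustration: the class name, the threshold parameter, and the two-segment minimum are assumptions for the example; a real MergePolicy implementation would also cap the group size and respect in-flight merges.

```java
// Lucene-free sketch of the selection step: pick segments under a size
// threshold as one candidate merge, leaving large segments alone.
import java.util.ArrayList;
import java.util.List;

public class SmallSegmentPicker {
    /** Indices of segments below the threshold, or empty if fewer than two qualify. */
    static List<Integer> pickSmall(long[] segmentBytes, long thresholdBytes) {
        List<Integer> picked = new ArrayList<>();
        for (int i = 0; i < segmentBytes.length; i++) {
            if (segmentBytes[i] < thresholdBytes) {
                picked.add(i);
            }
        }
        // Merging a single segment buys nothing; require at least two.
        return picked.size() >= 2 ? picked : List.of();
    }
}
```

The hard part the issue describes is not this selection but the bookkeeping around it: the refresh must exclude segments flushed after the cutoff while including the freshly merged ones.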
[jira] [Commented] (LUCENE-4702) Terms dictionary compression
[ https://issues.apache.org/jira/browse/LUCENE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025370#comment-17025370 ] ASF subversion and git services commented on LUCENE-4702: - Commit 033220e2ab31494054b26c236be4b43b777aea02 in lucene-solr's branch refs/heads/branch_8x from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=033220e ] LUCENE-4702: Reduce terms dictionary compression overhead. (#1216) Changes include: - Removed LZ4 compression of suffix lengths which didn't save much space anyway. - For stats, LZ4 was only really used for run-length compression of terms whose docFreq is 1. This has been replaced by explicit run-length compression. - Since we only use LZ4 for suffix bytes if the compression ratio is < 75%, we now only try LZ4 out if the average suffix length is greater than 6, in order to reduce index-time overhead. > Terms dictionary compression > > > Key: LUCENE-4702 > URL: https://issues.apache.org/jira/browse/LUCENE-4702 > Project: Lucene - Core > Issue Type: Wish > Reporter: Adrien Grand > Assignee: Adrien Grand > Priority: Trivial > Attachments: LUCENE-4702.patch, LUCENE-4702.patch > > Time Spent: 3h 50m > Remaining Estimate: 0h > > I've done a quick test with the block tree terms dictionary by replacing a > call to IndexOutput.writeBytes to write suffix bytes with a call to > LZ4.compressHC to test the performance hit. Interestingly, search performance > was very good (see comparison table below) and the .tim files were 14% smaller > (from 150432 bytes overall to 129516). 
> {noformat}
> Task               QPS baseline  StdDev  QPS compressed  StdDev  Pct diff
> Fuzzy1                   111.50  (2.0%)           78.78  (1.5%)    -29.4% ( -32% - -26%)
> Fuzzy2                    36.99  (2.7%)           28.59  (1.5%)    -22.7% ( -26% - -18%)
> Respell                  122.86  (2.1%)          103.89  (1.7%)    -15.4% ( -18% - -11%)
> Wildcard                 100.58  (4.3%)           94.42  (3.2%)     -6.1% ( -13% -   1%)
> Prefix3                  124.90  (5.7%)          122.67  (4.7%)     -1.8% ( -11% -   9%)
> OrHighLow                169.87  (6.8%)          167.77  (8.0%)     -1.2% ( -15% -  14%)
> LowTerm                 1949.85  (4.5%)         1929.02  (3.4%)     -1.1% (  -8% -   7%)
> AndHighLow              2011.95  (3.5%)         1991.85  (3.3%)     -1.0% (  -7% -   5%)
> OrHighHigh               155.63  (6.7%)          154.12  (7.9%)     -1.0% ( -14% -  14%)
> AndHighHigh              341.82  (1.2%)          339.49  (1.7%)     -0.7% (  -3% -   2%)
> OrHighMed                217.55  (6.3%)          216.16  (7.1%)     -0.6% ( -13% -  13%)
> IntNRQ                    53.10 (10.9%)           52.90  (8.6%)     -0.4% ( -17% -  21%)
> MedTerm                  998.11  (3.8%)          994.82  (5.6%)     -0.3% (  -9% -   9%)
> MedSpanNear               60.50  (3.7%)           60.36  (4.8%)     -0.2% (  -8% -   8%)
> HighSpanNear              19.74  (4.5%)           19.72  (5.1%)     -0.1% (  -9% -   9%)
> LowSpanNear              101.93  (3.2%)          101.82  (4.4%)     -0.1% (  -7% -   7%)
> AndHighMed               366.18  (1.7%)          366.93  (1.7%)      0.2% (  -3% -   3%)
> PKLookup                 237.28  (4.0%)          237.96  (4.2%)      0.3% (  -7% -   8%)
> MedPhrase                173.17  (4.7%)          174.69  (4.7%)      0.9% (  -8% -  10%)
> LowSloppyPhrase          180.91  (2.6%)          182.79  (2.7%)      1.0% (  -4% -   6%)
> LowPhrase                374.64  (5.5%)          379.11  (5.8%)      1.2% (  -9% -  13%)
> HighTerm                 253.14  (7.9%)          256.97 (11.4%)      1.5% ( -16% -  22%)
> HighPhrase                19.52 (10.6%)           19.83 (11.0%)      1.6% ( -18% -  25%)
> MedSloppyPhrase          141.90  (2.6%)          144.11  (2.5%)      1.6% (  -3% -   6%)
> HighSloppyPhrase          25.26  (4.8%)           25.97  (5.0%)      2.8% (  -6% -  13%)
> {noformat}
> Only queries which are very terms-dictionary-intensive got a performance hit > (Fuzzy, Fuzzy2, Respell, Wildcard), other queries including Prefix3 behaved > (surprisingly) well. > Do you think of it as something worth exploring? 
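The two index-time heuristics from the commit message above (only attempt LZ4 when the average suffix length exceeds 6, and only keep the compressed form when it is under 75% of the raw size) can be sketched standalone. Hedged illustration: the class and method names are invented for the example, not the actual BlockTree code.

```java
// Hedged sketch of the LUCENE-4702 compression heuristics. Not Lucene code;
// names are illustrative. Sizes are assumed small enough not to overflow int math.
public class SuffixCompressionHeuristic {
    /** Attempt compression only when the average suffix length exceeds 6 bytes. */
    static boolean shouldTryCompression(int totalSuffixBytes, int numTerms) {
        return numTerms > 0 && totalSuffixBytes > 6L * numTerms;
    }

    /** Keep the compressed form only when the compression ratio is under 75%. */
    static boolean acceptCompressed(int rawLen, int compressedLen) {
        // compressedLen / rawLen < 3/4, written without floating point
        return compressedLen * 4 < rawLen * 3;
    }
}
```

The first check avoids paying LZ4's cost on blocks that are unlikely to compress; the second ensures compression is only kept when it actually saves meaningful space.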