[jira] [Comment Edited] (SOLR-14201) some SolrCore are not released after being removed

2020-01-28 Thread Vinh Le (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020732#comment-17020732
 ] 

Vinh Le edited comment on SOLR-14201 at 1/28/20 8:03 AM:
-

Thanks [~cpoerschke]

To reproduce this issue, just keep creating new collections

 
{code:java}
while true; do ./import.sh; sleep 10; done
#import.sh
#!/bin/bash -e
HOST=http://localhost:8983/solr
PREV_COLLECTION=$(http "$HOST/admin/collections?action=LISTALIASES" | jq -r ".aliases.SGFAS")
COLLECTION="next_$(gdate +%H%M%S)"

echo "Create new collection = $COLLECTION"
http POST "$HOST/admin/collections?action=CREATE&name=$COLLECTION&collection.configName=seafas&numShards=1"

echo "Push data to new collection"
cat docs.xml | http POST "$HOST/$COLLECTION/update?commitWithin=1000&overwrite=true&wt=json" "Content-Type: text/xml"

echo "Optimize"
http "$HOST/$COLLECTION/update?optimize=true&maxSegments=1&waitSearcher=false"

echo "Update alias"
http "$HOST/admin/collections?action=CREATEALIAS&collections=$COLLECTION&name=SGFAS"

echo "Delete previous collection = $PREV_COLLECTION"
http "$HOST/admin/collections?action=DELETE&name=$PREV_COLLECTION"
{code}
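For reference, here is a rough SolrJ sketch of the same create/index/alias/delete loop (the config name "seafas" and the alias "SGFAS" are taken from the script above; the client setup and the single test document are illustrative assumptions, not a tested program):
{code:java}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.common.SolrInputDocument;

// Rough sketch only: keeps creating a collection, indexing one document,
// re-pointing the alias and deleting the previous collection, mirroring import.sh.
public class RecreateCollectionsLoop {
  public static void main(String[] args) throws Exception {
    try (SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
      String prev = null;
      while (true) {
        String collection = "next_" + System.currentTimeMillis();
        CollectionAdminRequest.createCollection(collection, "seafas", 1, 1).process(client);

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "1"); // stand-in for the docs.xml payload
        client.add(collection, doc);
        client.commit(collection);

        CollectionAdminRequest.createAlias("SGFAS", collection).process(client);
        if (prev != null) {
          CollectionAdminRequest.deleteCollection(prev).process(client);
        }
        prev = collection;
        Thread.sleep(10_000); // matches the "sleep 10" in the shell loop
      }
    }
  }
}
{code}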
 

I also tried to remove all plugins, but the issue still persists.

 Classes.loaded keeps increasing.
{code:java}
❯ http "http://localhost:8983/solr/admin/metrics" | jq '.metrics."solr.jvm"."classes.loaded"'
8428

❯ http "http://localhost:8983/solr/admin/metrics" | jq '.metrics."solr.jvm"."classes.loaded"'
9323
{code}
!image-2020-01-22-10-39-15-301.png|width=759,height=606!

!image-2020-01-22-10-42-17-511.png!

!image-2020-01-22-12-28-46-241.png!

And VisualVM graphs:

!image-2020-01-22-14-45-52-730.png|width=966,height=677!

I'm not really familiar with Java, but it looks like this is related to finalizers.
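As a generic Java illustration (not Solr code), the sketch below shows why objects that override finalize() are only reclaimed after a GC cycle has run their finalizers; anything such an object keeps reachable, including classes and classloaders, can likewise only go away after that extra GC pass:
{code:java}
// Minimal sketch, assuming nothing about Solr internals: a finalizable object
// stays on the finalizer queue after the last reference is dropped and is only
// reclaimed after a later GC cycle has run its finalize() method.
public class FinalizerDemo {
  static final class Holder {
    private final byte[] payload = new byte[1_000_000];

    @Override
    protected void finalize() {
      System.out.println("finalized " + this);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    for (int i = 0; i < 100; i++) {
      new Holder(); // immediately unreachable, but still pending finalization
    }
    System.gc();        // first GC only discovers the objects and queues their finalizers
    Thread.sleep(1000); // give the finalizer thread time to run
    System.gc();        // only now can the payloads actually be reclaimed
  }
}
{code}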

 

 


was (Author: vinhlh):
Thanks [~cpoerschke]

To reproduce this issue, just keep creating new collections

 
{code:java}
while true; do ./import.sh; sleep 10; done
#import.sh
#!/bin/bash -e
HOST=http://localhost:8983/solr
PREV_COLLECTION=$(http "$HOST/admin/collections?action=LISTALIASES" | jq -r ".aliases.SGFAS")
COLLECTION="next_$(gdate +%H%M%S)"
# COLLECTION="next_1029"
echo "Create new collection = $COLLECTION"
http POST "$HOST/admin/collections?action=CREATE&name=$COLLECTION&collection.configName=seafas&numShards=1"
echo "Push data to new collection"
cat docs.xml | http POST "$HOST/$COLLECTION/update?commitWithin=1000&overwrite=true&wt=json" "Content-Type: text/xml"
echo "Optimize"
http "$HOST/$COLLECTION/update?optimize=true&maxSegments=1&waitSearcher=false"
echo "Update alias"
http "$HOST/admin/collections?action=CREATEALIAS&collections=$COLLECTION&name=SGFAS"
echo "Delete previous collection = $PREV_COLLECTION"
http "$HOST/admin/collections?action=DELETE&name=$PREV_COLLECTION"
{code}
 

I also tried to remove all plugins, but the issue still persists.

 Classes.loaded keeps increasing.
{code:java}
❯ http "http://localhost:8983/solr/admin/metrics" | jq '.metrics."solr.jvm"."classes.loaded"'
8428

❯ http "http://localhost:8983/solr/admin/metrics" | jq '.metrics."solr.jvm"."classes.loaded"'
9323
{code}
!image-2020-01-22-10-39-15-301.png|width=759,height=606!

!image-2020-01-22-10-42-17-511.png!

  !image-2020-01-22-12-28-46-241.png!
  
  And VisualVM graphs

!image-2020-01-22-14-45-52-730.png|width=966,height=677!

I'm not really familiar with Java, but looks like this is related to finalizers.

 

 

> some SolrCore are not released after being removed
> --
>
> Key: SOLR-14201
> URL: https://issues.apache.org/jira/browse/SOLR-14201
> Project: Solr
>  Issue Type: Bug
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: image-2020-01-22-10-39-15-301.png, 
> image-2020-01-22-10-42-17-511.png, image-2020-01-22-12-28-46-241.png, 
> image-2020-01-22-14-45-52-730.png
>
>
> [~vinhlh] reported in SOLR-10506 (affecting 6.5 with fixes in 6.6.6 and 7.0):
> bq. In 7.7.2, some SolrCore still are not released after being removed.
> https://issues.apache.org/jira/secure/attachment/12991357/image-2020-01-20-14-51-26-411.png
> Starting this ticket for a separate investigation and fix. A next 
> investigative step could be to try and reproduce the issue on the latest 8.x 
> release.
>   
>   
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org






[GitHub] [lucene-solr] dweiss commented on a change in pull request #1217: SOLR-14223 PublicKeyHandler consumes a lot of entropy during tests

2020-01-28 Thread GitBox
dweiss commented on a change in pull request #1217: SOLR-14223 PublicKeyHandler 
consumes a lot of entropy during tests
URL: https://github.com/apache/lucene-solr/pull/1217#discussion_r371655634
 
 

 ##
 File path: 
solr/test-framework/src/java/org/apache/solr/util/NotSecurePsuedoRandom.java
 ##
 @@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.util;
+
+import java.security.SecureRandom;
+import java.security.SecureRandomSpi;
+import java.util.Random;
+
+/**
+ * A mocked up instance of SecureRandom that just uses {@link Random} under the covers.
+ * This is to prevent blocking issues that arise in platform default
+ * SecureRandom instances due to too many instances / not enough random entropy.
+ * Tests do not need secure SSL.
+ */
+public class NotSecurePsuedoRandom extends SecureRandom {
+  public static final SecureRandom INSTANCE = new NotSecurePsuedoRandom();
+  private static final Random RAND = new Random(42);
 
 Review comment:
   This must *not* be static or shared across test instances. A better solution 
would be to create this from an initial long seed, taken from 
RandomizedContext.current..().random().nextLong().
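   For what it's worth, a minimal sketch of that direction (hypothetical class and constructor, not the PR's code), where each test constructs its own instance from a seed drawn via RandomizedContext:

   import java.security.SecureRandom;
   import java.util.Random;

   // Hypothetical sketch only: a per-instance, seedable variant. The seed would
   // come from the test context, e.g. RandomizedContext.current().getRandom().nextLong()
   // (assuming that accessor), so failures stay reproducible from the test seed.
   public class SeededNotSecureRandom extends SecureRandom {
     private final Random delegate;

     public SeededNotSecureRandom(long seed) {
       this.delegate = new Random(seed);
     }

     @Override
     public void nextBytes(byte[] bytes) {
       delegate.nextBytes(bytes); // deterministic bytes, no entropy consumed here
     }

     @Override
     public byte[] generateSeed(int numBytes) {
       byte[] out = new byte[numBytes];
       delegate.nextBytes(out);
       return out;
     }
   }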


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14201) some SolrCore are not released after being removed

2020-01-28 Thread Vinh Le (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinh Le updated SOLR-14201:
---
Attachment: image-2020-01-28-16-17-44-030.png

> some SolrCore are not released after being removed
> --
>
> Key: SOLR-14201
> URL: https://issues.apache.org/jira/browse/SOLR-14201
> Project: Solr
>  Issue Type: Bug
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: image-2020-01-22-10-39-15-301.png, 
> image-2020-01-22-10-42-17-511.png, image-2020-01-22-12-28-46-241.png, 
> image-2020-01-22-14-45-52-730.png, image-2020-01-28-16-17-44-030.png
>
>
> [~vinhlh] reported in SOLR-10506 (affecting 6.5 with fixes in 6.6.6 and 7.0):
> bq. In 7.7.2, some SolrCore still are not released after being removed.
> https://issues.apache.org/jira/secure/attachment/12991357/image-2020-01-20-14-51-26-411.png
> Starting this ticket for a separate investigation and fix. A next 
> investigative step could be to try and reproduce the issue on the latest 8.x 
> release.
>   
>   
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14201) some SolrCore are not released after being removed

2020-01-28 Thread Vinh Le (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinh Le updated SOLR-14201:
---
Attachment: image-2020-01-28-16-19-43-760.png

> some SolrCore are not released after being removed
> --
>
> Key: SOLR-14201
> URL: https://issues.apache.org/jira/browse/SOLR-14201
> Project: Solr
>  Issue Type: Bug
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: image-2020-01-22-10-39-15-301.png, 
> image-2020-01-22-10-42-17-511.png, image-2020-01-22-12-28-46-241.png, 
> image-2020-01-22-14-45-52-730.png, image-2020-01-28-16-17-44-030.png, 
> image-2020-01-28-16-19-43-760.png
>
>
> [~vinhlh] reported in SOLR-10506 (affecting 6.5 with fixes in 6.6.6 and 7.0):
> bq. In 7.7.2, some SolrCore still are not released after being removed.
> https://issues.apache.org/jira/secure/attachment/12991357/image-2020-01-20-14-51-26-411.png
> Starting this ticket for a separate investigation and fix. A next 
> investigative step could be to try and reproduce the issue on the latest 8.x 
> release.
>   
>   
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1218: Javacc erick

2020-01-28 Thread GitBox
dweiss commented on a change in pull request #1218: Javacc erick
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371656425
 
 

 ##
 File path: gradle/defaults-java.gradle
 ##
 @@ -25,13 +25,13 @@ allprojects {
 tasks.withType(JavaCompile) {
   options.encoding = "UTF-8"
   options.compilerArgs += [
-"-Xlint", 
 
 Review comment:
   Are these differences on EOLs? I think they've been normalized recently so 
it should be LF.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1218: Javacc erick

2020-01-28 Thread GitBox
dweiss commented on a change in pull request #1218: Javacc erick
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371657147
 
 

 ##
 File path: gradle/generation/javacc.gradle
 ##
 @@ -0,0 +1,102 @@
+// Add a top-level pseudo-task to which we will attach individual regenerate 
tasks.
+import static groovy.io.FileType.*
+
+configure(rootProject) {
+  configurations {
+javacc
+  }
+
+  dependencies {
+javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}"
+  }
+
+  task javacc() {
+description "Regenerate sources for corresponding javacc grammar files."
+group "generation"
+
+dependsOn ":lucene:queryparser:javaccParserClassic"
+dependsOn ":lucene:queryparser:javaccParserSurround"
+dependsOn ":lucene:queryparser:javaccParserFlexible"
+  }
+}
+
+// We always regenerate, no need to declare outputs.
+class JavaCCTask extends DefaultTask {
+  @Input
+  File javaccFile
+
+  JavaCCTask() {
+dependsOn(project.rootProject.configurations.javacc)
+  }
+
+  @TaskAction
+  def generate() {
+if (!javaccFile || !javaccFile.exists()) {
+  throw new RuntimeException("JavaCC input file does not exist: 
${javaccFile}")
+}
+// Remove old files so we can regenerate them
+def parentDir = javaccFile.parentFile
+parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->
+  if (file.text.contains("Generated By:JavaCC")) {
+file.delete()
+  }
+}
+logger.lifecycle("Regenerating JavaCC:\n  from: ${javaccFile}\nto: 
${parentDir}")
+
+project.javaexec {
+  classpath {
+project.rootProject.configurations.javacc
+  }
+  main = "org.javacc.parser.Main"
+  args += "-OUTPUT_DIRECTORY=${parentDir}"
+  args += [javaccFile]
 
 Review comment:
   no need for array around javaccFile?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14201) some SolrCore are not released after being removed

2020-01-28 Thread Vinh Le (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinh Le updated SOLR-14201:
---
Attachment: image-2020-01-28-16-20-50-709.png

> some SolrCore are not released after being removed
> --
>
> Key: SOLR-14201
> URL: https://issues.apache.org/jira/browse/SOLR-14201
> Project: Solr
>  Issue Type: Bug
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: image-2020-01-22-10-39-15-301.png, 
> image-2020-01-22-10-42-17-511.png, image-2020-01-22-12-28-46-241.png, 
> image-2020-01-22-14-45-52-730.png, image-2020-01-28-16-17-44-030.png, 
> image-2020-01-28-16-19-43-760.png, image-2020-01-28-16-20-50-709.png
>
>
> [~vinhlh] reported in SOLR-10506 (affecting 6.5 with fixes in 6.6.6 and 7.0):
> bq. In 7.7.2, some SolrCore still are not released after being removed.
> https://issues.apache.org/jira/secure/attachment/12991357/image-2020-01-20-14-51-26-411.png
> Starting this ticket for a separate investigation and fix. A next 
> investigative step could be to try and reproduce the issue on the latest 8.x 
> release.
>   
>   
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1218: Javacc erick

2020-01-28 Thread GitBox
dweiss commented on a change in pull request #1218: Javacc erick
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371657030
 
 

 ##
 File path: gradle/generation/javacc.gradle
 ##
 @@ -0,0 +1,102 @@
+// Add a top-level pseudo-task to which we will attach individual regenerate 
tasks.
+import static groovy.io.FileType.*
+
+configure(rootProject) {
+  configurations {
+javacc
+  }
+
+  dependencies {
+javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}"
+  }
+
+  task javacc() {
+description "Regenerate sources for corresponding javacc grammar files."
+group "generation"
+
+dependsOn ":lucene:queryparser:javaccParserClassic"
+dependsOn ":lucene:queryparser:javaccParserSurround"
+dependsOn ":lucene:queryparser:javaccParserFlexible"
+  }
+}
+
+// We always regenerate, no need to declare outputs.
+class JavaCCTask extends DefaultTask {
+  @Input
+  File javaccFile
+
+  JavaCCTask() {
+dependsOn(project.rootProject.configurations.javacc)
+  }
+
+  @TaskAction
+  def generate() {
+if (!javaccFile || !javaccFile.exists()) {
+  throw new RuntimeException("JavaCC input file does not exist: 
${javaccFile}")
+}
+// Remove old files so we can regenerate them
+def parentDir = javaccFile.parentFile
+parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->
 
 Review comment:
   Oh, what's FILES -- I don't know this construct.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1218: Javacc erick

2020-01-28 Thread GitBox
dweiss commented on a change in pull request #1218: Javacc erick
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371657978
 
 

 ##
 File path: gradle/generation/javacc.gradle
 ##
 @@ -0,0 +1,102 @@
+// Add a top-level pseudo-task to which we will attach individual regenerate 
tasks.
+import static groovy.io.FileType.*
+
+configure(rootProject) {
+  configurations {
+javacc
+  }
+
+  dependencies {
+javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}"
+  }
+
+  task javacc() {
+description "Regenerate sources for corresponding javacc grammar files."
+group "generation"
+
+dependsOn ":lucene:queryparser:javaccParserClassic"
+dependsOn ":lucene:queryparser:javaccParserSurround"
+dependsOn ":lucene:queryparser:javaccParserFlexible"
+  }
+}
+
+// We always regenerate, no need to declare outputs.
+class JavaCCTask extends DefaultTask {
+  @Input
+  File javaccFile
+
+  JavaCCTask() {
+dependsOn(project.rootProject.configurations.javacc)
+  }
+
+  @TaskAction
+  def generate() {
+if (!javaccFile || !javaccFile.exists()) {
+  throw new RuntimeException("JavaCC input file does not exist: 
${javaccFile}")
+}
+// Remove old files so we can regenerate them
+def parentDir = javaccFile.parentFile
+parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->
+  if (file.text.contains("Generated By:JavaCC")) {
+file.delete()
+  }
+}
+logger.lifecycle("Regenerating JavaCC:\n  from: ${javaccFile}\nto: 
${parentDir}")
+
+project.javaexec {
+  classpath {
+project.rootProject.configurations.javacc
+  }
+  main = "org.javacc.parser.Main"
+  args += "-OUTPUT_DIRECTORY=${parentDir}"
+  args += [javaccFile]
+}
+  }
+}
+
+
+configure(project(":lucene:queryparser")) {
+  task javaccParserClassic(type: JavaCCTask) {
+description "Regenerate classic query parser from java CC.java"
+group "generation"
+
+javaccFile = 
file('src/java/org/apache/lucene/queryparser/classic/QueryParser.jj')
+def parent = javaccFile.parentFile.toString() // I'll need this later.
+
+doLast {
+  // There'll be a lot of cleanup in here to get precommits and builds to 
pass, but as long as we don't
 
 Review comment:
   I think it'd be ideal to regenerate with ant first (to eliminate any 
overlays that have accumulated),  commit that, then regenerate with gradle. 
With any local patches applied the result should be identical -- that's how 
you'll know the process is the same as with ant (git diff should be empty)?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1218: Javacc erick

2020-01-28 Thread GitBox
dweiss commented on a change in pull request #1218: Javacc erick
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371656935
 
 

 ##
 File path: gradle/generation/javacc.gradle
 ##
 @@ -0,0 +1,102 @@
+// Add a top-level pseudo-task to which we will attach individual regenerate 
tasks.
+import static groovy.io.FileType.*
+
+configure(rootProject) {
+  configurations {
+javacc
+  }
+
+  dependencies {
+javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}"
+  }
+
+  task javacc() {
+description "Regenerate sources for corresponding javacc grammar files."
+group "generation"
+
+dependsOn ":lucene:queryparser:javaccParserClassic"
+dependsOn ":lucene:queryparser:javaccParserSurround"
+dependsOn ":lucene:queryparser:javaccParserFlexible"
+  }
+}
+
+// We always regenerate, no need to declare outputs.
+class JavaCCTask extends DefaultTask {
+  @Input
+  File javaccFile
+
+  JavaCCTask() {
+dependsOn(project.rootProject.configurations.javacc)
+  }
+
+  @TaskAction
+  def generate() {
+if (!javaccFile || !javaccFile.exists()) {
+  throw new RuntimeException("JavaCC input file does not exist: 
${javaccFile}")
+}
+// Remove old files so we can regenerate them
+def parentDir = javaccFile.parentFile
+parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->
 
 Review comment:
   If these files get overwritten I don't think we should care about explicit 
deletions (?).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14201) some SolrCore are not released after being removed

2020-01-28 Thread Vinh Le (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024957#comment-17024957
 ] 

Vinh Le commented on SOLR-14201:


Hmm, interesting. I assumed that optimize is synchronous (as it has to wait). Let me try each case:
h3. Removing the optimize step

It seems classes loaded still keeps increasing:
!image-2020-01-28-16-17-44-030.png|width=540,height=389!

 

One thing I forgot to mention: if I click the "Perform GC" button in VisualVM to trigger GC manually,

!image-2020-01-28-16-19-43-760.png|width=329,height=106!

we immediately see a drop in classes loaded in the VisualVM UI,

!image-2020-01-28-16-20-50-709.png|width=915,height=544!

(at 4:20PM)

But checking via the Metrics API:
{quote}http --timeout=300 "http://localhost:8983/solr/admin/metrics" | jq '.metrics."solr.jvm"."classes.loaded"'

9635
{quote}
It remains the same.
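One possible explanation (not verified here): VisualVM's "Loaded classes" graph shows the current count, while a metrics gauge wired to the cumulative total would never decrease even after classes are unloaded. A minimal plain-JMX sketch of the underlying counters (this is standard java.lang.management, not Solr's metrics API):
{code:java}
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Plain JMX sketch: the three class-loading counters of the JVM.
// "current" can drop after a GC unloads classes; "total" only ever increases.
public class ClassLoadingStats {
  public static void main(String[] args) {
    ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
    System.out.println("current  = " + cl.getLoadedClassCount());
    System.out.println("total    = " + cl.getTotalLoadedClassCount());
    System.out.println("unloaded = " + cl.getUnloadedClassCount());
  }
}
{code}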

 

 

> some SolrCore are not released after being removed
> --
>
> Key: SOLR-14201
> URL: https://issues.apache.org/jira/browse/SOLR-14201
> Project: Solr
>  Issue Type: Bug
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: image-2020-01-22-10-39-15-301.png, 
> image-2020-01-22-10-42-17-511.png, image-2020-01-22-12-28-46-241.png, 
> image-2020-01-22-14-45-52-730.png, image-2020-01-28-16-17-44-030.png, 
> image-2020-01-28-16-19-43-760.png, image-2020-01-28-16-20-50-709.png
>
>
> [~vinhlh] reported in SOLR-10506 (affecting 6.5 with fixes in 6.6.6 and 7.0):
> bq. In 7.7.2, some SolrCore still are not released after being removed.
> https://issues.apache.org/jira/secure/attachment/12991357/image-2020-01-20-14-51-26-411.png
> Starting this ticket for a separate investigation and fix. A next 
> investigative step could be to try and reproduce the issue on the latest 8.x 
> release.
>   
>   
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14201) some SolrCore are not released after being removed

2020-01-28 Thread Vinh Le (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinh Le updated SOLR-14201:
---
Attachment: image-2020-01-28-16-59-51-813.png

> some SolrCore are not released after being removed
> --
>
> Key: SOLR-14201
> URL: https://issues.apache.org/jira/browse/SOLR-14201
> Project: Solr
>  Issue Type: Bug
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: image-2020-01-22-10-39-15-301.png, 
> image-2020-01-22-10-42-17-511.png, image-2020-01-22-12-28-46-241.png, 
> image-2020-01-22-14-45-52-730.png, image-2020-01-28-16-17-44-030.png, 
> image-2020-01-28-16-19-43-760.png, image-2020-01-28-16-20-50-709.png, 
> image-2020-01-28-16-59-51-813.png
>
>
> [~vinhlh] reported in SOLR-10506 (affecting 6.5 with fixes in 6.6.6 and 7.0):
> bq. In 7.7.2, some SolrCore still are not released after being removed.
> https://issues.apache.org/jira/secure/attachment/12991357/image-2020-01-20-14-51-26-411.png
> Starting this ticket for a separate investigation and fix. A next 
> investigative step could be to try and reproduce the issue on the latest 8.x 
> release.
>   
>   
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14201) some SolrCore are not released after being removed

2020-01-28 Thread Vinh Le (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024969#comment-17024969
 ] 

Vinh Le commented on SOLR-14201:


h3. Without both optimize and alias
{quote}HOST=http://localhost:8983/solr

# Base on alias
# PREV_COLLECTION=$(http --timeout=300 "$HOST/admin/collections?action=LISTALIASES" | jq -r ".aliases.SGFAS")
# Base on last collection
PREV_COLLECTION=$(http --timeout=300 "$HOST/admin/collections?action=LIST" | jq -r ".collections[0]")

COLLECTION="next_$(gdate +%H%M%S)"
# COLLECTION="next_1029"
echo "Create new collection = $COLLECTION"
http --timeout=300 POST "$HOST/admin/collections?action=CREATE&name=$COLLECTION&collection.configName=seafas&numShards=1"

echo "Push data to new collection"
cat docs.xml | http --timeout=300 POST "$HOST/$COLLECTION/update?commitWithin=1000&overwrite=true&wt=json" "Content-Type: text/xml"

# echo "Optimize"
# http --timeout=300 "$HOST/$COLLECTION/update?optimize=true&maxSegments=1&waitSearcher=false"

# echo "Update alias"
# http --timeout=300 "$HOST/admin/collections?action=CREATEALIAS&collections=$COLLECTION&name=SGFAS"

echo "Delete previous collection = $PREV_COLLECTION"
http --timeout=300 "$HOST/admin/collections?action=DELETE&name=$PREV_COLLECTION"

echo "Classes.loaded"
http --timeout=300 "http://localhost:8983/solr/admin/metrics" | jq '.metrics."solr.jvm"."classes.loaded"'{quote}
 
 
Basically, just remove the previous collection after creating a new one.
 
!image-2020-01-28-16-59-51-813.png|width=853,height=645!
 
Classes loaded still keeps increasing.
 
 

> some SolrCore are not released after being removed
> --
>
> Key: SOLR-14201
> URL: https://issues.apache.org/jira/browse/SOLR-14201
> Project: Solr
>  Issue Type: Bug
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: image-2020-01-22-10-39-15-301.png, 
> image-2020-01-22-10-42-17-511.png, image-2020-01-22-12-28-46-241.png, 
> image-2020-01-22-14-45-52-730.png, image-2020-01-28-16-17-44-030.png, 
> image-2020-01-28-16-19-43-760.png, image-2020-01-28-16-20-50-709.png, 
> image-2020-01-28-16-59-51-813.png
>
>
> [~vinhlh] reported in SOLR-10506 (affecting 6.5 with fixes in 6.6.6 and 7.0):
> bq. In 7.7.2, some SolrCore still are not released after being removed.
> https://issues.apache.org/jira/secure/attachment/12991357/image-2020-01-20-14-51-26-411.png
> Starting this ticket for a separate investigation and fix. A next 
> investigative step could be to try and reproduce the issue on the latest 8.x 
> release.
>   
>   
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org






[jira] [Comment Edited] (SOLR-14201) some SolrCore are not released after being removed

2020-01-28 Thread Vinh Le (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024969#comment-17024969
 ] 

Vinh Le edited comment on SOLR-14201 at 1/28/20 9:02 AM:
-

h3. Without both optimize and alias
{code:java}
#!/bin/bash -e
HOST=http://localhost:8983/solr
# Base on alias
# PREV_COLLECTION=$(http --timeout=300 "$HOST/admin/collections?action=LISTALIASES" | jq -r ".aliases.SGFAS")
# Base on last collection
PREV_COLLECTION=$(http --timeout=300 "$HOST/admin/collections?action=LIST" | jq -r ".collections[0]")
COLLECTION="next_$(gdate +%H%M%S)"
# COLLECTION="next_1029"

echo "Create new collection = $COLLECTION"
http --timeout=300 POST "$HOST/admin/collections?action=CREATE&name=$COLLECTION&collection.configName=seafas&numShards=1"

echo "Push data to new collection"
cat docs.xml | http --timeout=300 POST "$HOST/$COLLECTION/update?commitWithin=1000&overwrite=true&wt=json" "Content-Type: text/xml"

# echo "Optimize"
# http --timeout=300 "$HOST/$COLLECTION/update?optimize=true&maxSegments=1&waitSearcher=false"
# echo "Update alias"
# http --timeout=300 "$HOST/admin/collections?action=CREATEALIAS&collections=$COLLECTION&name=SGFAS"

echo "Delete previous collection = $PREV_COLLECTION"
http --timeout=300 "$HOST/admin/collections?action=DELETE&name=$PREV_COLLECTION"

echo "Classes.loaded"
http --timeout=300 "http://localhost:8983/solr/admin/metrics" | jq '.metrics."solr.jvm"."classes.loaded"'

{code}
 

Basically, just remove the previous collection after creating a new one.
  
 !image-2020-01-28-16-59-51-813.png|width=853,height=645!
  
 Classes loaded still keeps increasing.
  
  


was (Author: vinhlh):
h3. Without both optimize and alias
{code:java}
// code placeholder
{code}
 

Basically, just remove the previous collection after creating a new one.
  
 !image-2020-01-28-16-59-51-813.png|width=853,height=645!
  
 Classes loaded still keeps increasing.
  
  

> some SolrCore are not released after being removed
> --
>
> Key: SOLR-14201
> URL: https://issues.apache.org/jira/browse/SOLR-14201
> Project: Solr
>  Issue Type: Bug
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: image-2020-01-22-10-39-15-301.png, 
> image-2020-01-22-10-42-17-511.png, image-2020-01-22-12-28-46-241.png, 
> image-2020-01-22-14-45-52-730.png, image-2020-01-28-16-17-44-030.png, 
> image-2020-01-28-16-19-43-760.png, image-2020-01-28-16-20-50-709.png, 
> image-2020-01-28-16-59-51-813.png
>
>
> [~vinhlh] reported in SOLR-10506 (affecting 6.5 with fixes in 6.6.6 and 7.0):
> bq. In 7.7.2, some SolrCore still are not released after being removed.
> https://issues.apache.org/jira/secure/attachment/12991357/image-2020-01-20-14-51-26-411.png
> Starting this ticket for a separate investigation and fix. A next 
> investigative step could be to try and reproduce the issue on the latest 8.x 
> release.
>   
>   
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-14224.

Resolution: Invalid

Please do not use Jira as a question-asking forum. Use the mailing list 
[solr-u...@lucene.apache.org|mailto:solr-u...@lucene.apache.org] to ask such 
questions. I'm closing this as invalid.

> Not able to build solr 6.6.2 from source after January 15, 2020
> ---
>
> Key: SOLR-14224
> URL: https://issues.apache.org/jira/browse/SOLR-14224
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6.2
>Reporter: Guruprasad K K
>Priority: Major
>
> After Jan 15th maven is allowing only https connections to repo. But solr 
> 6.6.2 version uses http connection. So our builds are failing.
> But looks like latest version of solr has the fix to this in common_build.xml 
> and other places where it uses https connection to maven.
> What is the work around for this if we cant upgrade the solr version and 
> still if we want to use 6.6.2?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] andywebb1975 commented on a change in pull request #1210: SOLR-14219 force serialVersionUID of OverseerSolrResponse

2020-01-28 Thread GitBox
andywebb1975 commented on a change in pull request #1210: SOLR-14219 force 
serialVersionUID of OverseerSolrResponse
URL: https://github.com/apache/lucene-solr/pull/1210#discussion_r371717458
 
 

 ##
 File path: solr/core/src/java/org/apache/solr/cloud/OverseerSolrResponse.java
 ##
 @@ -26,7 +26,9 @@
 import java.util.Objects;
 
 public class OverseerSolrResponse extends SolrResponse {
-  
+ 
+  private static final long serialVersionUID = 4721653044098960880L;
 
 Review comment:
   hi Tomas,
   
   I still think it'd be better to set the serialVersionUID. It's _possible_ 
(though I think unlikely*) that there are systems where the previous (computed) 
value is different to `472165...`, but the new computed value (with the change 
to the class) would be different anyway, so either way they'll see an 
incompatibility. On systems using the standard build, we can make the new class 
backwards-compatible by adding serialVersionUID. It's guaranteed to be 
incompatible for everyone if we don't.
   
   Andy
   
   \* My reading of 
https://docs.oracle.com/javase/7/docs/platform/serialization/spec/class.html is 
that the computed UID is independent of the Java version, and that it should be 
set to the previously-computed value in version 2+ of a class.
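   As a generic illustration (hypothetical class, not OverseerSolrResponse itself) of why pinning the UID preserves compatibility when the class shape changes:

   import java.io.Serializable;

   // Hypothetical example: with serialVersionUID pinned to the previously
   // computed value, old and new builds of this class stay mutually
   // deserializable; relying on the default computed UID would break as soon
   // as the class changes.
   public class ExampleResponse implements Serializable {
     private static final long serialVersionUID = 4721653044098960880L; // value quoted in the PR

     private String status;      // field present in the "old" version
     private long elapsedMillis; // newly added field: still compatible while the UID matches
   }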


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Reopened] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020

2020-01-28 Thread Guruprasad K K (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guruprasad K K reopened SOLR-14224:
---

This is not a question. This is a bug after Jan 15th.

> Not able to build solr 6.6.2 from source after January 15, 2020
> ---
>
> Key: SOLR-14224
> URL: https://issues.apache.org/jira/browse/SOLR-14224
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6.2
>Reporter: Guruprasad K K
>Priority: Major
>
> After Jan 15th maven is allowing only https connections to repo. But solr 
> 6.6.2 version uses http connection. So our builds are failing.
> But looks like latest version of solr has the fix to this in common_build.xml 
> and other places where it uses https connection to maven.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020

2020-01-28 Thread Guruprasad K K (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guruprasad K K updated SOLR-14224:
--
Description: 
After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 
version uses http connection. So our builds are failing.

But looks like latest version of solr has the fix to this in common_build.xml 
and other places where it uses https connection to maven.

  was:
After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 
version uses http connection. So our builds are failing.

But looks like latest version of solr has the fix to this in common_build.xml 
and other places where it uses https connection to maven.

What is the work around for this if we cant upgrade the solr version and still 
if we want to use 6.6.2?


> Not able to build solr 6.6.2 from source after January 15, 2020
> ---
>
> Key: SOLR-14224
> URL: https://issues.apache.org/jira/browse/SOLR-14224
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6.2
>Reporter: Guruprasad K K
>Priority: Major
>
> After Jan 15th maven is allowing only https connections to repo. But solr 
> 6.6.2 version uses http connection. So our builds are failing.
> But looks like latest version of solr has the fix to this in common_build.xml 
> and other places where it uses https connection to maven.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020

2020-01-28 Thread Guruprasad K K (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guruprasad K K updated SOLR-14224:
--
Description: 
After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 
version uses http connection. So builds are failing.

But looks like latest version of solr has the fix to this in common_build.xml 
and other places where it uses https connection to maven.

  was:
After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 
version uses http connection. So our builds are failing.

But looks like latest version of solr has the fix to this in common_build.xml 
and other places where it uses https connection to maven.


> Not able to build solr 6.6.2 from source after January 15, 2020
> ---
>
> Key: SOLR-14224
> URL: https://issues.apache.org/jira/browse/SOLR-14224
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6.2
>Reporter: Guruprasad K K
>Priority: Major
>
> After Jan 15th maven is allowing only https connections to repo. But solr 
> 6.6.2 version uses http connection. So builds are failing.
> But looks like latest version of solr has the fix to this in common_build.xml 
> and other places where it uses https connection to maven.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)
Robert Muir created LUCENE-9185:
---

 Summary: add "tests.profile" to gradle build to aid fixing slow 
tests
 Key: LUCENE-9185
 URL: https://issues.apache.org/jira/browse/LUCENE-9185
 Project: Lucene - Core
  Issue Type: Task
Reporter: Robert Muir
 Attachments: LUCENE-9185.patch

It is kind of a hassle to profile slow tests to fix the bottlenecks.

The idea here is to make it dead easy to profile (just) the tests, capturing 
samples at a very low granularity, reducing noise as much as possible (e.g. not 
profiling the entire gradle build or anything) and printing a simple report for 
quick iterating.

Here's a prototype of what I hacked together:

All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
{noformat}
...
PROFILE SUMMARY from 122464 samples
  tests.profile.count=10
  tests.profile.stacksize=1
  tests.profile.linenumbers=false
PERCENT SAMPLES STACK
2.59%   3170
org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
2.26%   2762java.util.Arrays#fill()
1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
1.24%   1523java.util.Random#nextInt()
1.19%   1456java.lang.StringUTF16#compress()
1.08%   1319java.lang.StringLatin1#inflate()
1.00%   1228java.lang.Integer#getChars()
0.99%   1214java.util.Arrays#compareUnsigned()
0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
BUILD SUCCESSFUL in 3m 59s
{noformat}
If you look at this LZ4 assertReset method, you can see it's indeed way too 
expensive, checking 64K items every time.
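(As a schematic illustration only, not Lucene's actual code: an assertion that scans the whole 64K table on every reset costs O(64K) per call once test assertions are enabled, which is exactly the kind of frame this report surfaces.)
{code:java}
// Schematic sketch of the flagged pattern (hypothetical class, not Lucene's):
final class HashTableSketch {
  private final int[] table = new int[64 * 1024];

  private boolean assertReset() {
    for (int v : table) {
      if (v != 0) return false; // any stale entry means the table was not cleared
    }
    return true;
  }

  void reset() {
    // with -ea enabled in tests, this walks all 64K entries on every reset
    assert assertReset();
  }
}
{code}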

To dig deeper into potential problems you can pass additional parameters (all 
of them used here for demonstration):
{{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
-Dtests.profile.count=8 -Dtests.profile.stacksize=20 
-Dtests.profile.linenumbers=true}}

This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...

{noformat}
...
PROFILE SUMMARY from 21355 samples
  tests.profile.count=8
  tests.profile.stacksize=20
  tests.profile.linenumbers=true
PERCENT SAMPLES STACK
26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
  at sun.nio.ch.EPollSelectorImpl#doSelect():120
  at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
  at sun.nio.ch.SelectorImpl#select():141
  at 
org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
  at 
org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
  at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
  at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
  at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
  at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
  at 
org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted 
code)
  at 
org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
  at 
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
  at java.lang.Thread#run():830
16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
  at sun.nio.ch.EPollSelectorImpl#doSelect():120
  at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
  at sun.nio.ch.SelectorImpl#select():141
  at 
org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
  at 
org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
  at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
  at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
  at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
  at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
  at 
org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted 
code)
  at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor#lambda$execute$0():210
  at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$270.1779693615#run():(Interpreted
 code)
  at 
java.util.concurrent.ThreadPoolExecutor#runWorker():1128
  at 
java.util.concurrent.ThreadPoolExecutor$Worker#run():628
  at java.lang.Thread#run():830
13.15%  2808sun.nio.ch.Net#accept():(Na

[jira] [Updated] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-9185:

Attachment: LUCENE-9185.patch

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor#lambda$execute$0():210
>

[jira] [Updated] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020

2020-01-28 Thread Guruprasad K K (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guruprasad K K updated SOLR-14224:
--
Description: 
After Jan 15th maven is allowing only https connections to repo. But solr 6.6.2 
version uses http connection. So builds are failing.

But looks like latest version of solr has the fix to this in common_build.xml 
and other places where it uses https connection to maven.

 

Error log:
ivy-bootstrap1:
[mkdir] Created dir: /root/.ant/lib
 [echo] installing ivy 2.3.0 to /root/.ant/lib
  [get] Getting: 
[http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar]

  [get] To: /root/.ant/lib/ivy-2.3.0.jar
  [get] Error opening connection 
[java.io|http://java.io/]
.IOException: Server returned HTTP response code: 501 for URL: 
[http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar]

  [get] Error opening connection 
[java.io|http://java.io/]
.IOException: Server returned HTTP response code: 501 for URL: 
[http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar]

  [get] Error opening connection 
[java.io|http://java.io/]
.IOException: Server returned HTTP response code: 501 for URL: 
[http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar]

  [get] Can't get 
[http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar]
 to /root/.ant/lib/ivy-2.3.0.jar

  was:
Since January 15th, Maven Central only accepts HTTPS connections to the 
repository, but Solr 6.6.2 uses plain HTTP, so builds fail.

The latest version of Solr already fixes this in common_build.xml and the other 
places that connect to the Maven repository, by using HTTPS.


> Not able to build solr 6.6.2 from source after January 15, 2020
> ---
>
> Key: SOLR-14224
> URL: https://issues.apache.org/jira/browse/SOLR-14224
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6.2
>Reporter: Guruprasad K K
>Priority: Major
>
> Since January 15th, Maven Central only accepts HTTPS connections to the 
> repository, but Solr 6.6.2 uses plain HTTP, so builds fail.
> The latest version of Solr already fixes this in common_build.xml and the 
> other places that connect to the Maven repository, by using HTTPS.
>  
> Error log:
> ivy-bootstrap1:
> [mkdir] Created dir: /root/.ant/lib
>  [echo] installing ivy 2.3.0 to /root/.ant/lib
>   [get] Getting: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
>   [get] To: /root/.ant/lib/ivy-2.3.0.jar
>   [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
>   [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
>   [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
>   [get] Can't get http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar to /root/.ant/lib/ivy-2.3.0.jar



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025032#comment-17025032
 ] 

Robert Muir commented on LUCENE-9185:
-

Attached is my initial stab... it's helpful to me at least when tracking these 
things down. cc [~dweiss]
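
A minimal sketch of the idea, assuming the report is built from jdk.ExecutionSample 
events in the per-test .jfr recordings (the class name and output format below are 
illustrative assumptions, not the attached ProfileResults code):
{code:java}
// Hedged sketch: aggregate JFR execution samples from .jfr recordings into a
// "top frames" table, using only the jdk.jfr consumer API.
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordingFile;

public class TopFramesSketch {
  public static void main(String[] args) throws Exception {
    Map<String, Long> counts = new HashMap<>();
    long total = 0;
    for (String file : args) {
      for (RecordedEvent event : RecordingFile.readAllEvents(Path.of(file))) {
        // Only CPU execution samples contribute to the report.
        if (!"jdk.ExecutionSample".equals(event.getEventType().getName())
            || event.getStackTrace() == null
            || event.getStackTrace().getFrames().isEmpty()) {
          continue;
        }
        RecordedFrame top = event.getStackTrace().getFrames().get(0);
        String key = top.getMethod().getType().getName() + "#" + top.getMethod().getName() + "()";
        counts.merge(key, 1L, Long::sum);
        total++;
      }
    }
    final long samples = total;
    System.out.println("PROFILE SUMMARY from " + samples + " samples");
    counts.entrySet().stream()
        .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
        .limit(10)
        .forEach(e -> System.out.printf("%.2f%%\t%d\t%s%n",
            100.0 * e.getValue() / samples, e.getValue(), e.getKey()));
  }
}
{code}
Something along these lines, fed the collected recording paths from gradle.buildFinished, 
would produce a report of the shape shown below.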

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see it's indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>

[jira] [Updated] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020

2020-01-28 Thread Guruprasad K K (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guruprasad K K updated SOLR-14224:
--
Description: 
Since January 15th, Maven Central only accepts HTTPS connections to the 
repository, but Solr 6.6.2 uses plain HTTP, so builds fail.

The latest version of Solr already fixes this in common_build.xml and the other 
places that connect to the Maven repository, by using HTTPS.

Error log:
 ivy-bootstrap1:
 [mkdir] Created dir: /root/.ant/lib
 [echo] installing ivy 2.3.0 to /root/.ant/lib
 [get] Getting: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
 [get] To: /root/.ant/lib/ivy-2.3.0.jar
 [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
 [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
 [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
 [get] Can't get http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar to /root/.ant/lib/ivy-2.3.0.jar

[NOTE]: It works on the latest version of Solr, where HTTP has been converted 
to HTTPS.

  was:
Since January 15th, Maven Central only accepts HTTPS connections to the 
repository, but Solr 6.6.2 uses plain HTTP, so builds fail.

The latest version of Solr already fixes this in common_build.xml and the other 
places that connect to the Maven repository, by using HTTPS.

Error log:
ivy-bootstrap1:
[mkdir] Created dir: /root/.ant/lib
 [echo] installing ivy 2.3.0 to /root/.ant/lib
  [get] Getting: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
  [get] To: /root/.ant/lib/ivy-2.3.0.jar
  [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
  [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
  [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
  [get] Can't get http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar to /root/.ant/lib/ivy-2.3.0.jar


> Not able to build solr 6.6.2 from source after January 15, 2020
> ---
>
> Key: SOLR-14224
> URL: https://issues.apache.org/jira/browse/SOLR-14224
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6.2
>Reporter: Guruprasad K K
>Priority: Major
>
> Since January 15th, Maven Central only accepts HTTPS connections to the 
> repository, but Solr 6.6.2 uses plain HTTP, so builds fail.
> The latest version of Solr already fixes this in common_build.xml and the 
> other places that connect to the Maven repository, by using HTTPS.
>  
> Error log:
>  ivy-bootstrap1:
>  [mkdir] Created dir: /root/.ant/lib
>  [echo] installing ivy 2.3.0 to /root/.ant/lib
>  [get] Getting: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
>  [get] To: /root/.ant/lib/ivy-2.3.0.jar
>  [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
>  [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
>  [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
>  [get] Can't get http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar to /root/.ant/lib/ivy-2.3.0.jar
> [NOTE]: It works on the latest version of Solr, where HTTP has been converted 
> to HTTPS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9186) remove linefiledocs usage from basetokenstreamtestcase

2020-01-28 Thread Robert Muir (Jira)
Robert Muir created LUCENE-9186:
---

 Summary: remove linefiledocs usage from basetokenstreamtestcase
 Key: LUCENE-9186
 URL: https://issues.apache.org/jira/browse/LUCENE-9186
 Project: Lucene - Core
  Issue Type: Task
  Components: general/test
Reporter: Robert Muir


LineFileDocs is slow, even to open. That's because it (very slowly) "skips" to 
a pseudorandom position into a 5MB gzip stream when you open it.

There was a time when we didn't have a nice string generator for tests 
(TestUtil.randomAnalysisString), but now we do. And when it was introduced it 
found interesting new things that linefiledocs never found.

This speeds up all the analyzer tests.
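
A minimal sketch of what the replacement looks like, assuming the existing 
test-framework helpers TestUtil.randomAnalysisString and 
BaseTokenStreamTestCase.checkAnalysisConsistency (this is an illustration, not the 
attached patch):
{code:java}
// Hedged sketch: drive an analyzer with generated strings rather than LineFileDocs.
// Uses existing lucene test-framework helpers; not taken from the attached patch.
import java.util.Random;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.BaseTokenStreamTestCase;
import org.apache.lucene.analysis.MockAnalyzer;
import org.apache.lucene.util.TestUtil;

public class TestRandomStringsSketch extends BaseTokenStreamTestCase {
  public void testGeneratedStrings() throws Exception {
    Random random = random();                    // seed-aware Random from the test framework
    Analyzer analyzer = new MockAnalyzer(random);
    for (int i = 0; i < 100; i++) {
      // No gzip seek, no file I/O: just a (possibly gnarly) generated string.
      String text = TestUtil.randomAnalysisString(random, 64, false);
      checkAnalysisConsistency(random, analyzer, false, text);
    }
    analyzer.close();
  }
}
{code}
The generated strings are cheap to produce and, as noted above, have historically 
surfaced cases LineFileDocs never did.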



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9186) remove linefiledocs usage from basetokenstreamtestcase

2020-01-28 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-9186:

Attachment: LUCENE-9186.patch

> remove linefiledocs usage from basetokenstreamtestcase
> --
>
> Key: LUCENE-9186
> URL: https://issues.apache.org/jira/browse/LUCENE-9186
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/test
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9186.patch
>
>
> LineFileDocs is slow, even to open. That's because it (very slowly) "skips" 
> to a pseudorandom position into a 5MB gzip stream when you open it.
> There was a time when we didn't have a nice string generator for tests 
> (TestUtil.randomAnalysisString), but now we do. And when it was introduced it 
> found interesting new things that linefiledocs never found.
> This speeds up all the analyzer tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9187) remove too-expensive assert from LZ4 HighCompressionHashTable

2020-01-28 Thread Robert Muir (Jira)
Robert Muir created LUCENE-9187:
---

 Summary: remove too-expensive assert from LZ4 
HighCompressionHashTable
 Key: LUCENE-9187
 URL: https://issues.apache.org/jira/browse/LUCENE-9187
 Project: Lucene - Core
  Issue Type: Task
Reporter: Robert Muir


This is the slowest method in the lucene tests. See LUCENE-9185 for what I mean.

If you look at it, it's checking 64k values every time the assert is called.
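
To make the cost concrete, here is a small hypothetical sketch of the pattern (names 
are illustrative, not the actual HighCompressionHashTable code): the per-call assert 
scans all 64K entries, while the same invariant can be verified once in a dedicated test.
{code:java}
// Hypothetical sketch; names are illustrative, not the actual LZ4 internals.
class HashTableSketch {
  static final int SIZE = 1 << 16;          // 64K entries
  final int[] table = new int[SIZE];

  // The kind of check being removed: an O(64K) scan on every call with -ea.
  boolean assertReset() {
    for (int v : table) {
      if (v != 0) {
        return false;
      }
    }
    return true;
  }

  void reset() {
    java.util.Arrays.fill(table, 0);
    // assert assertReset();   // too expensive to run on every reset
  }
}

// The invariant still gets verified, but once, in a dedicated unit test.
class TestHashTableSketch {
  public void testResetClearsTable() {
    HashTableSketch h = new HashTableSketch();
    h.table[123] = 42;
    h.reset();
    for (int v : h.table) {
      if (v != 0) {
        throw new AssertionError("table not cleared after reset");
      }
    }
  }
}
{code}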



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9187) remove too-expensive assert from LZ4 HighCompressionHashTable

2020-01-28 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-9187:

Attachment: LUCENE-9187.patch

> remove too-expensive assert from LZ4 HighCompressionHashTable
> -
>
> Key: LUCENE-9187
> URL: https://issues.apache.org/jira/browse/LUCENE-9187
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9187.patch
>
>
> This is the slowest method in the lucene tests. See LUCENE-9185 for what I 
> mean.
> If you look at it, it's checking 64k values every time the assert is called.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9187) remove too-expensive assert from LZ4 HighCompressionHashTable

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025048#comment-17025048
 ] 

Robert Muir commented on LUCENE-9187:
-

cc [~jpountz]

> remove too-expensive assert from LZ4 HighCompressionHashTable
> -
>
> Key: LUCENE-9187
> URL: https://issues.apache.org/jira/browse/LUCENE-9187
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9187.patch
>
>
> This is the slowest method in the lucene tests. See LUCENE-9185 for what I 
> mean.
> If you look at it, it's checking 64k values every time the assert is called.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] alessandrobenedetti commented on issue #357: [SOLR-12238] Synonym Queries boost by payload

2020-01-28 Thread GitBox
alessandrobenedetti commented on issue #357: [SOLR-12238] Synonym Queries boost 
by payload 
URL: https://github.com/apache/lucene-solr/pull/357#issuecomment-579200844
 
 
   I followed the refactor comments from both @diegoceccarelli and @romseygeek.
   The PR seems much cleaner now, on both the Lucene and Solr sides.
   Copious tests are present and should cover the various situations.
   
   A few questions remain:
   
   - from a test I read a comment from @dsmiley saying: "confirm 
autoGeneratePhraseQueries always builds OR queries" from 
org.apache.solr.search.TestSolrQueryParser#testSynonymQueryStyle
   
   - what can we do about SpanBoostQuery? I was completely unaware that it is 
going to be deprecated
   
   Let me know
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-12238) Synonym Query Style Boost By Payload

2020-01-28 Thread Alessandro Benedetti (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-12238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025050#comment-17025050
 ] 

Alessandro Benedetti commented on SOLR-12238:
-

I followed the refactor comments from both @diegoceccarelli and @romseygeek.
The PR seems much cleaner now, on both the Lucene and Solr sides.
Copious tests are present and should cover the various situations.

A few questions remain:

- from a test I read a comment from @dsmiley saying: "confirm 
autoGeneratePhraseQueries always builds OR queries" from 
org.apache.solr.search.TestSolrQueryParser#testSynonymQueryStyle

- what can we do about SpanBoostQuery? I was completely unaware that it is 
going to be deprecated

Let me know

> Synonym Query Style Boost By Payload
> 
>
> Key: SOLR-12238
> URL: https://issues.apache.org/jira/browse/SOLR-12238
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers
>Affects Versions: 7.2
>Reporter: Alessandro Benedetti
>Priority: Major
> Attachments: SOLR-12238.patch, SOLR-12238.patch, SOLR-12238.patch, 
> SOLR-12238.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> This improvement is built on top of the Synonym Query Style feature and 
> brings the possibility of boosting synonym queries using the associated 
> payload.
> It introduces two new modalities for the Synonym Query Style:
> PICK_BEST_BOOST_BY_PAYLOAD -> build a Disjunction query with the clauses 
> boosted by payload
> AS_DISTINCT_TERMS_BOOST_BY_PAYLOAD -> build a Boolean query with the clauses 
> boosted by payload
> These new synonym query styles assume payloads are available, so they must 
> be used in conjunction with a token filter able to produce payloads.
> A synonym.txt example could be:
> # Synonyms used by Payload Boost
> tiger => tiger|1.0, Big_Cat|0.8, Shere_Khan|0.9
> leopard => leopard, Big_Cat|0.8, Bagheera|0.9
> lion => lion|1.0, panthera leo|0.99, Simba|0.8
> snow_leopard => panthera uncia|0.99, snow leopard|1.0
> A simple token filter to populate the payloads from such a synonym.txt is:
>  delimiter="|"/>
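
The filter element above lost its XML tags in the mail formatting; it presumably 
refers to a delimited-payload token filter. As a hedged illustration (stock Lucene 
factory names, assuming the synonyms.txt above is available to the analyzer; this is 
not code from the patch), the analysis chain that turns the "|0.8"-style weights into 
payloads looks roughly like this:
{code:java}
// Hedged sketch: synonym expansion followed by a delimited-payload filter, so that
// the "|0.8"-style weights in synonyms.txt end up as float payloads on the tokens.
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.custom.CustomAnalyzer;
import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;

public class SynonymPayloadSketch {
  public static void main(String[] args) throws Exception {
    CustomAnalyzer analyzer = CustomAnalyzer.builder()   // resources loaded from the classpath
        .withTokenizer("whitespace")
        .addTokenFilter("synonymGraph", "synonyms", "synonyms.txt", "ignoreCase", "true")
        .addTokenFilter("delimitedPayload", "delimiter", "|", "encoder", "float")
        .build();

    try (TokenStream ts = analyzer.tokenStream("field", "tiger")) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      PayloadAttribute payload = ts.addAttribute(PayloadAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        float boost = payload.getPayload() == null
            ? 1.0f
            : PayloadHelper.decodeFloat(payload.getPayload().bytes, payload.getPayload().offset);
        System.out.println(term + " -> " + boost);       // e.g. Big_Cat -> 0.8
      }
      ts.end();
    }
    analyzer.close();
  }
}
{code}
The two new query styles then read these payloads back at query time to boost each 
synonym clause of the disjunction or boolean query described above.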



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ErickErickson commented on issue #1218: LUCENE-9134: Javacc skeleton

2020-01-28 Thread GitBox
ErickErickson commented on issue #1218: LUCENE-9134: Javacc skeleton
URL: https://github.com/apache/lucene-solr/pull/1218#issuecomment-579219139
 
 
   Didn't title it right.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ErickErickson closed pull request #1218: LUCENE-9134: Javacc skeleton

2020-01-28 Thread GitBox
ErickErickson closed pull request #1218: LUCENE-9134: Javacc skeleton
URL: https://github.com/apache/lucene-solr/pull/1218
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ErickErickson opened a new pull request #1219: LUCENE-9134: Javacc skeleton for Gradle regenerate

2020-01-28 Thread GitBox
ErickErickson opened a new pull request #1219:  LUCENE-9134: Javacc skeleton 
for Gradle regenerate
URL: https://github.com/apache/lucene-solr/pull/1219
 
 
   Here are the build changes to get javacc to run, modeled on the jflex changes; 
many thanks for the model. Only two files changed here ;)
   
   If the structure is OK, I'll fill in the "doLast" blocks with the cleanup 
code and maybe be able to extract some common parts. NOTE: you can't even compile 
the result of running this, because I wanted the changes to the build structure 
to be clear first and so didn't include the cleanup tasks yet.
   
   So if this structure is OK, should I merge it into master before or after 
the rest of the cleanup? My assumption is after. I want to try to get all the 
warnings etc. out of the generated code in the next phase, to reduce the 
temptation for people to make hand-edits.
   
   I didn't intentionally change the line endings in defaults-java; there's no 
other change there...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025074#comment-17025074
 ] 

Dawid Weiss commented on LUCENE-9185:
-

Great beyond words. I never had a chance to use JFR but I'll surely want to dig 
in. A few nitpicks:
{code}
+allprojects {
+  tasks.withType(Test) {
+def profileMode = propertyOrDefault("tests.profile", false)
+if (profileMode) {
{code}
you can apply the if outside and only enter the allprojects closure if tests.profile 
is true at the root project level (I assume we won't have to enable it for 
individual projects within a larger build).

{code}
+gradlew -p lucene/core test -Dtests.profile=true
{code}

It will work, but -Ptests.profile=true would be more Gradle-esque (it sets a 
project property as opposed to a system property).

{code}
+gradle.buildFinished {
+  if (!recordings.isEmpty()) {
+def args = ["ProfileResults"]
+for (file in recordings.getFiles()) {
+  args += file.toString()
+}
+ProfileResults.main(args as String[])
+  }
+}
{code}

If you pull up the if then this thing can go underneath so that it's not adding 
any closure if it's not enabled. Also: it'll always display the profile, even 
on a failed build. Look at slowest-tests-at-end.gradle - this one only displays 
the slowest tests if the build is successful.

Finally you may want to simplify to something like (didn't check but should 
work):
{code}
 def args = ["ProfileResults"]
 args += recordings.getFiles().collect { it.toString() }
{code}

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see it's indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
>

[jira] [Commented] (LUCENE-9187) remove too-expensive assert from LZ4 HighCompressionHashTable

2020-01-28 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025085#comment-17025085
 ] 

Adrien Grand commented on LUCENE-9187:
--

This profile option is pretty cool.

+1 to removing the assert. I'd like to make it a dedicated test instead, but that 
doesn't have to block the removal of the assertion.

> remove too-expensive assert from LZ4 HighCompressionHashTable
> -
>
> Key: LUCENE-9187
> URL: https://issues.apache.org/jira/browse/LUCENE-9187
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9187.patch
>
>
> This is the slowest method in the lucene tests. See LUCENE-9185 for what I 
> mean.
> If you look at it, it's checking 64k values every time the assert is called.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9186) remove linefiledocs usage from basetokenstreamtestcase

2020-01-28 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025091#comment-17025091
 ] 

Dawid Weiss commented on LUCENE-9186:
-

+1.

> remove linefiledocs usage from basetokenstreamtestcase
> --
>
> Key: LUCENE-9186
> URL: https://issues.apache.org/jira/browse/LUCENE-9186
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/test
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9186.patch
>
>
> LineFileDocs is slow, even to open. That's because it (very slowly) "skips" 
> to a pseudorandom position into a 5MB gzip stream when you open it.
> There was a time when we didn't have a nice string generator for tests 
> (TestUtil.randomAnalysisString), but now we do. And when it was introduced it 
> found interesting new things that linefiledocs never found.
> This speeds up all the analyzer tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9134) Port ant-regenerate tasks to Gradle build

2020-01-28 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025101#comment-17025101
 ] 

Erick Erickson commented on LUCENE-9134:


New PR with a skeleton of the javacc changes. It's just for the structure of the 
Gradle changes; it won't be committable until after the post-generation cleanup is 
done.

> Port ant-regenerate tasks to Gradle build
> -
>
> Key: LUCENE-9134
> URL: https://issues.apache.org/jira/browse/LUCENE-9134
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: LUCENE-9134.patch, core_regen.patch
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Take II about organizing this beast.
>  A list of items that needs to be added or requires work. If you'd like to 
> work on any of these, please add your name to the list. See process comments 
> at parent (LUCENE-9077)
>  * Implement jflex task in lucene/core
>  * Implement jflex tasks in lucene/analysis
>  * Implement javacc tasks in lucene/queryparser (EOE)
>  * Implement javacc tasks in solr/core (EOE)
>  * Implement python tasks in lucene (? there are several javadocs mentions in 
> the build.xml, this may be irrelevant to the Gradle effort).
>  * Implement python tasks in lucene/core
>  * Implement python tasks in lucene/analysis
>  
> Here are the "regenerate" targets I found in the ant version. There are a 
> couple that I don't have evidence for or against being rebuilt
>  // Very top level
> {code:java}
> ./build.xml: 
> ./build.xml:  failonerror="true">
> ./build.xml:  depends="regenerate,-check-after-regeneration"/>
>  {code}
> // top level Lucene. This includes the core/build.xml and 
> test-framework/build.xml files
> {code:java}
> ./lucene/build.xml: 
> ./lucene/build.xml:  inheritall="false">
> ./lucene/build.xml: 
>  {code}
> // This one has quite a number of customizations to
> {code:java}
> ./lucene/core/build.xml:  depends="createLevAutomata,createPackedIntSources,jflex"/>
>  {code}
> // This one has a bunch of code modifications _after_ javacc is run on 
> certain of the
>  // output files. Save this one for last?
> {code:java}
> ./lucene/queryparser/build.xml: 
>  {code}
> // the files under ../lucene/analysis... are pretty self contained. I expect 
> these could be done as a unit
> {code:java}
> ./lucene/analysis/build.xml: 
> ./lucene/analysis/build.xml: 
> ./lucene/analysis/common/build.xml:  depends="jflex,unicode-data"/>
> ./lucene/analysis/icu/build.xml:  depends="gen-utr30-data-files,gennorm2,genrbbi"/>
> ./lucene/analysis/kuromoji/build.xml:  depends="build-dict"/>
> ./lucene/analysis/nori/build.xml:  depends="build-dict"/>
> ./lucene/analysis/opennlp/build.xml:  depends="train-test-models"/>
>  {code}
>  
> // These _are_ regenerated from the top-level regenerate target, but for 
> // LUCENE-9080 the changes were only in imports, so there are no 
> // corresponding files checked in in that JIRA
> {code:java}
> ./lucene/expressions/build.xml:  depends="run-antlr"/>
>  {code}
> // Apparently unrelated to ./lucene/analysis/opennlp/build.xml 
> "train-test-models" target
> // Apparently not rebuilt from the top level, but _are_ regenerated when 
> executed from
> // ./solr/contrib/langid
> {code:java}
> ./solr/contrib/langid/build.xml:  depends="train-test-models"/>
>  {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020

2020-01-28 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-14224.
---
Resolution: Invalid

We stopped active support for Solr 6.x quite some time ago and will not be 
releasing any new versions. Arguing about whether it's a bug or not is 
pointless; please ask the question on the users' list, as Jan suggested, and do 
not reopen this JIRA.

> Not able to build solr 6.6.2 from source after January 15, 2020
> ---
>
> Key: SOLR-14224
> URL: https://issues.apache.org/jira/browse/SOLR-14224
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6.2
>Reporter: Guruprasad K K
>Priority: Major
>
> Since January 15th, Maven Central only accepts HTTPS connections to the 
> repository, but Solr 6.6.2 uses plain HTTP, so builds fail.
> The latest version of Solr already fixes this in common_build.xml and the 
> other places that connect to the Maven repository, by using HTTPS.
>  
> Error log:
>  ivy-bootstrap1:
>  [mkdir] Created dir: /root/.ant/lib
>  [echo] installing ivy 2.3.0 to /root/.ant/lib
>  [get] Getting: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
>  [get] To: /root/.ant/lib/ivy-2.3.0.jar
>  [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
>  [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
>  [get] Error opening connection java.io.IOException: Server returned HTTP response code: 501 for URL: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
>  [get] Can't get http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar to /root/.ant/lib/ivy-2.3.0.jar
> [NOTE]: It works on the latest version of Solr, where HTTP has been converted 
> to HTTPS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton

2020-01-28 Thread GitBox
ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc 
skeleton
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371799564
 
 

 ##
 File path: gradle/defaults-java.gradle
 ##
 @@ -25,13 +25,13 @@ allprojects {
 tasks.withType(JavaCompile) {
   options.encoding = "UTF-8"
   options.compilerArgs += [
-"-Xlint", 
 
 Review comment:
   OK, I'll check. I don't even know how they got changed frankly, I'll revert


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton

2020-01-28 Thread GitBox
ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc 
skeleton
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371799808
 
 

 ##
 File path: gradle/generation/javacc.gradle
 ##
 @@ -0,0 +1,102 @@
+// Add a top-level pseudo-task to which we will attach individual regenerate 
tasks.
+import static groovy.io.FileType.*
+
+configure(rootProject) {
+  configurations {
+javacc
+  }
+
+  dependencies {
+javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}"
+  }
+
+  task javacc() {
+description "Regenerate sources for corresponding javacc grammar files."
+group "generation"
+
+dependsOn ":lucene:queryparser:javaccParserClassic"
+dependsOn ":lucene:queryparser:javaccParserSurround"
+dependsOn ":lucene:queryparser:javaccParserFlexible"
+  }
+}
+
+// We always regenerate, no need to declare outputs.
+class JavaCCTask extends DefaultTask {
+  @Input
+  File javaccFile
+
+  JavaCCTask() {
+dependsOn(project.rootProject.configurations.javacc)
+  }
+
+  @TaskAction
+  def generate() {
+if (!javaccFile || !javaccFile.exists()) {
+  throw new RuntimeException("JavaCC input file does not exist: 
${javaccFile}")
+}
+// Remove old files so we can regenerate them
+def parentDir = javaccFile.parentFile
+parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->
 
 Review comment:
   I copied it from some example and it worked...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton

2020-01-28 Thread GitBox
ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc 
skeleton
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371801798
 
 

 ##
 File path: gradle/generation/javacc.gradle
 ##
 @@ -0,0 +1,102 @@
+// Add a top-level pseudo-task to which we will attach individual regenerate 
tasks.
+import static groovy.io.FileType.*
+
+configure(rootProject) {
+  configurations {
+javacc
+  }
+
+  dependencies {
+javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}"
+  }
+
+  task javacc() {
+description "Regenerate sources for corresponding javacc grammar files."
+group "generation"
+
+dependsOn ":lucene:queryparser:javaccParserClassic"
+dependsOn ":lucene:queryparser:javaccParserSurround"
+dependsOn ":lucene:queryparser:javaccParserFlexible"
+  }
+}
+
+// We always regenerate, no need to declare outputs.
+class JavaCCTask extends DefaultTask {
+  @Input
+  File javaccFile
+
+  JavaCCTask() {
+dependsOn(project.rootProject.configurations.javacc)
+  }
+
+  @TaskAction
+  def generate() {
+if (!javaccFile || !javaccFile.exists()) {
+  throw new RuntimeException("JavaCC input file does not exist: 
${javaccFile}")
+}
+// Remove old files so we can regenerate them
+def parentDir = javaccFile.parentFile
+parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->
 
 Review comment:
   Actually, they aren't overwritten. If they're not deleted you get messages 
during execution like: "Warning: TokenMgrError.java: File is obsolete.  Please 
rename or delete this file so that a new one can be generated for you."


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton

2020-01-28 Thread GitBox
ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc 
skeleton
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371801937
 
 

 ##
 File path: gradle/generation/javacc.gradle
 ##
 @@ -0,0 +1,102 @@
+// Add a top-level pseudo-task to which we will attach individual regenerate 
tasks.
+import static groovy.io.FileType.*
+
+configure(rootProject) {
+  configurations {
+javacc
+  }
+
+  dependencies {
+javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}"
+  }
+
+  task javacc() {
+description "Regenerate sources for corresponding javacc grammar files."
+group "generation"
+
+dependsOn ":lucene:queryparser:javaccParserClassic"
+dependsOn ":lucene:queryparser:javaccParserSurround"
+dependsOn ":lucene:queryparser:javaccParserFlexible"
+  }
+}
+
+// We always regenerate, no need to declare outputs.
+class JavaCCTask extends DefaultTask {
+  @Input
+  File javaccFile
+
+  JavaCCTask() {
+dependsOn(project.rootProject.configurations.javacc)
+  }
+
+  @TaskAction
+  def generate() {
+if (!javaccFile || !javaccFile.exists()) {
+  throw new RuntimeException("JavaCC input file does not exist: 
${javaccFile}")
+}
+// Remove old files so we can regenerate them
+def parentDir = javaccFile.parentFile
+parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->
+  if (file.text.contains("Generated By:JavaCC")) {
+file.delete()
+  }
+}
+logger.lifecycle("Regenerating JavaCC:\n  from: ${javaccFile}\nto: 
${parentDir}")
+
+project.javaexec {
+  classpath {
+project.rootProject.configurations.javacc
+  }
+  main = "org.javacc.parser.Main"
+  args += "-OUTPUT_DIRECTORY=${parentDir}"
+  args += [javaccFile]
 
 Review comment:
   I'll change.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025108#comment-17025108
 ] 

Robert Muir commented on LUCENE-9185:
-

{quote}
Also: it'll always display the profile, even on a failed build. Look at 
slowest-tests-at-end.gradle - this one only displays the slowest tests if the 
build is successful.
{quote}

Honestly, when looking at slow Solr tests, I remove that logic locally from 
{{slowest-tests-at-end.gradle}}. It takes me 80 minutes to run the Solr tests, and 
90% of the time some test fails and then I get no output from it at all. This 
is frustrating because then I wasted 80 minutes. I feel the same way about it 
here: it's about performance, you asked for profile output, and it found JFR 
files, so why not show it?

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see it's indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.ec

[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025110#comment-17025110
 ] 

Dawid Weiss commented on LUCENE-9185:
-

Ok, fair enough. With profiling it's explicit; those slow-tests are always 
shown. Maybe we should make the latter optional as well?

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see it's indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run()

[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025112#comment-17025112
 ] 

Robert Muir commented on LUCENE-9185:
-

{quote}
It will work but -Ptests.profile=true would be more gradle-sque (it sets a 
project property as opposed to system property).
{quote}

The tool uses actual system properties for the more advanced options (e.g. 
{{-Dtests.profile.count=20}}). Seems a little evil to mix -P's and -D's when 
documenting this? I'll be honest, the difference is super confusing.

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see it's indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.E

[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025116#comment-17025116
 ] 

Robert Muir commented on LUCENE-9185:
-

{quote}
Maybe we should make the latter optional as well?
{quote}

Do you mean the whole {{slowest-test-at-end}}? Given how insanely slow some of 
these tests are, I feel it should be mandatory to see it? :)

But if I had to ask for a wishlist of improvements to {{slowest-tests-at-end}}, 
it would be:
* an option (or changed behavior) to print them always, even if a test sporadically 
failed.
* a property to increase the count (e.g. from 10 to 100) and threshold (e.g. from 
500ms to 250ms; yes, we may get there soon in lucene!)
* some way to show or count beforeclass/afterclass time. I'm not sure it is 
currently considered, only time for each method (I assume that includes 
setup+teardown).
* some way to see the slowest suites, too. Even if we fix all the tests to be 
100ms, it can cause bottlenecks if a suite has a TON of tests, because of bad 
gradle load balancing.
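
For the beforeclass/afterclass point, here is a minimal sketch (hypothetical class, not 
from any patch on this issue) of how a JUnit 4 {{@ClassRule}} could time a whole suite: a 
class-level rule wraps {{@BeforeClass}}/{{@AfterClass}} along with the test methods, which 
per-test timing misses.
{code:java}
import org.junit.rules.ExternalResource;

/** Hypothetical sketch: time an entire suite, including @BeforeClass/@AfterClass. */
public class SuiteTimeRule extends ExternalResource {
  private final String suiteName;
  private long startNanos;

  public SuiteTimeRule(String suiteName) {
    this.suiteName = suiteName;
  }

  @Override
  protected void before() {
    // runs before @BeforeClass of the suite this rule is attached to
    startNanos = System.nanoTime();
  }

  @Override
  protected void after() {
    // runs after @AfterClass, so the delta covers the whole suite
    long millis = (System.nanoTime() - startNanos) / 1_000_000;
    System.out.println("SUITE " + suiteName + " took " + millis + " ms");
  }
}

// usage inside a test class:
//   @ClassRule public static SuiteTimeRule suiteTimer = new SuiteTimeRule("TestFoo");
{code}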

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.Se

[GitHub] [lucene-solr] shalinmangar opened a new pull request #1220: SOLR-13996: Refactor HttpShardHandler.prepDistributed method

2020-01-28 Thread GitBox
shalinmangar opened a new pull request #1220: SOLR-13996: Refactor 
HttpShardHandler.prepDistributed method
URL: https://github.com/apache/lucene-solr/pull/1220
 
 
   # Description
   
   This PR refactors the huge HttpShardHandler.prepDistributed method into 
smaller pieces.
   
   # Solution
   
   It separates the logic for cloud and non-cloud modes into separate classes 
which are implementations of a new (experimental/internal) interface named 
ReplicaSource.
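   
   To make the intent concrete, a rough sketch of such an interface follows (illustrative 
only; the actual ReplicaSource in this PR may differ in method names and signatures):
   
```java
import java.util.List;

/**
 * Illustrative only, not the interface from the PR. The point is that both the
 * SolrCloud and the standalone code paths hide behind one abstraction, so the
 * shard handler no longer branches on the mode inline.
 */
interface ReplicaSource {
  /** Number of (logical) shards this request fans out to. */
  int getSliceCount();

  /** Replica URLs to query for the given shard, in preference order. */
  List<String> getReplicasBySlice(int sliceNumber);
}
```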
   
   # Tests
   
   This PR passes all current tests and I'll add more tests before merging.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Assigned] (SOLR-13996) Refactor HttpShardHandler#prepDistributed() into smaller pieces

2020-01-28 Thread Shalin Shekhar Mangar (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-13996:


Assignee: Shalin Shekhar Mangar

> Refactor HttpShardHandler#prepDistributed() into smaller pieces
> ---
>
> Key: SOLR-13996
> URL: https://issues.apache.org/jira/browse/SOLR-13996
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
>Priority: Major
> Attachments: SOLR-13996.patch, SOLR-13996.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, it is very hard to understand all the various things being done in 
> HttpShardHandler. I'm starting with refactoring the prepDistributed() method 
> to make it easier to grasp. It has standalone and cloud code intertwined, and 
> wanted to cleanly separate them out. Later, we can even have two separate 
> method (for standalone and cloud, each).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025120#comment-17025120
 ] 

Dawid Weiss commented on LUCENE-9185:
-

It looks odd to your eyes because it's a legacy from ant. These are different 
things: system properties are global, project properties are local (or looked 
up via scopes). You can set project properties with finer granularity than 
globally. 

As for the patch: it works because you invoke a static method on that class and 
it inherits gradle's environment. A nicer way to do it would be to pass 
arguments like tests.profile.count explicitly to ProfileResults (via args, 
setters or otherwise), preparing them on the gradle side.
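
For illustration only (hypothetical names, not the actual ProfileResults API), the idea is 
roughly: the build resolves the values and hands them over, so the reporter never reads 
global system properties itself.
{code:java}
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the "pass options explicitly" idea, not the real class. */
public final class ProfileReportCli {
  public static void main(String[] args) {
    int count = 10;
    int stacksize = 1;
    boolean lineNumbers = false;
    List<String> recordings = new ArrayList<>();
    for (String arg : args) {
      if (arg.startsWith("--count=")) {
        count = Integer.parseInt(arg.substring("--count=".length()));
      } else if (arg.startsWith("--stacksize=")) {
        stacksize = Integer.parseInt(arg.substring("--stacksize=".length()));
      } else if (arg.equals("--linenumbers")) {
        lineNumbers = true;
      } else {
        recordings.add(arg); // treat anything else as a .jfr file path
      }
    }
    System.out.printf("would summarize %d recordings (count=%d, stacksize=%d, linenumbers=%b)%n",
        recordings.size(), count, stacksize, lineNumbers);
  }
}
{code}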

The propertyOrDefault utility is actually a hack in the build so that people 
used to global system properties can still pass them to the gradle build... maybe 
it was a mistake that I added it in the first place, I don't know.


> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():47

[jira] [Commented] (SOLR-13996) Refactor HttpShardHandler#prepDistributed() into smaller pieces

2020-01-28 Thread Shalin Shekhar Mangar (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025122#comment-17025122
 ] 

Shalin Shekhar Mangar commented on SOLR-13996:
--

I've been working on a refactoring of this method myself, and it's my fault that I 
didn't see this issue and the PR earlier. However, my goals are a bit more 
ambitious. This first PR https://github.com/apache/lucene-solr/pull/1220 is 
just a re-organization of the code, but I'll be expanding it further by adding 
tests for each individual case and then moving on to improving performance. 
Currently this class is quite inefficient: it parses and re-parses shard URLs and 
builds strings out of them even in the SolrCloud case. The goal is to 
eventually have a cloud-focused class that is extremely efficient and avoids 
unnecessary copies of shards/replicas completely. This will require changes in 
other places as well, e.g. the host checker can be made to operate in a 
streaming mode. I haven't quite decided on how the replica list transformer 
should be changed.

I hope you don't mind, Ishan, but I'll assign this issue to myself and take it 
forward. Reviews welcome!

> Refactor HttpShardHandler#prepDistributed() into smaller pieces
> ---
>
> Key: SOLR-13996
> URL: https://issues.apache.org/jira/browse/SOLR-13996
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
>Priority: Major
> Attachments: SOLR-13996.patch, SOLR-13996.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, it is very hard to understand all the various things being done in 
> HttpShardHandler. I'm starting with refactoring the prepDistributed() method 
> to make it easier to grasp. It has standalone and cloud code intertwined, and 
> wanted to cleanly separate them out. Later, we can even have two separate 
> method (for standalone and cloud, each).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9171) Synonyms Boost by Payload

2020-01-28 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025124#comment-17025124
 ] 

David Smiley commented on LUCENE-9171:
--

I have my doubts about AttributeSource and arrays of such; I'll put my comments in 
the PR in a minute.

BTW I agree with Alan about keeping things simple in its base class.  In Lucene 
we fight complexity all the time.

> Synonyms Boost by Payload
> -
>
> Key: LUCENE-9171
> URL: https://issues.apache.org/jira/browse/LUCENE-9171
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/queryparser
>Reporter: Alessandro Benedetti
>Priority: Major
>
> I have been working on the additional capability of boosting queries by term 
> payloads, through a parameter to enable it in the Lucene QueryBuilder.
> This has been done targeting the synonyms query.
> It is parametric, so it is meant to make no difference unless the feature is 
> enabled.
> Solr has its bits to comply through its SynonymsQueryStyles.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dsmiley commented on issue #357: [SOLR-12238] Synonym Queries boost by payload

2020-01-28 Thread GitBox
dsmiley commented on issue #357: [SOLR-12238] Synonym Queries boost by payload 
URL: https://github.com/apache/lucene-solr/pull/357#issuecomment-579257834
 
 
   I noticed the use of {{AttributeSource[]}} (an array of AttributeSource), done 
at the behest of @romseygeek. That seems fishy... shouldn't it be a 
TokenStream, which is a more memory-efficient iterator over an AttributeSource 
changing state? I see, for example, the _existing_ 
{{createSpanQuery(TokenStream in, String field)}}, but the PR adds 
{{newSpanQuery(String field, AttributeSource[] attributes)}} and makes the 
former call the latter. Why bother; why not retain createSpanQuery, and if Solr 
wants to override it to do payload boosting then it can?
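   
   For context on the trade-off (illustrative only, not code from this PR): the usual way 
to consume a TokenStream reuses one set of attributes per token, so a consumer can read 
the term and its payload-derived boost as it goes instead of materializing per-token 
attribute copies:
   
```java
import java.io.IOException;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;
import org.apache.lucene.util.BytesRef;

public class TokenStreamPayloadExample {
  /** Prints each term with its payload-decoded boost (1.0 when no payload). */
  static void dumpTermBoosts(Analyzer analyzer, String field, String text) throws IOException {
    try (TokenStream ts = analyzer.tokenStream(field, text)) {
      CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);
      PayloadAttribute payloadAtt = ts.addAttribute(PayloadAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        // one reused attribute source: read what you need per token
        BytesRef payload = payloadAtt.getPayload();
        float boost = payload == null
            ? 1.0f
            : PayloadHelper.decodeFloat(payload.bytes, payload.offset);
        System.out.println(termAtt.toString() + " -> boost " + boost);
      }
      ts.end();
    }
  }
}
```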


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025128#comment-17025128
 ] 

Dawid Weiss commented on LUCENE-9185:
-

bq. But if i had to ask for a wishlist of improvements

All of them make sense but you're killing me... ;)

It's also worth noting that the "slowest tests" list depends on the level of 
parallelism and what other tests ran in the background alongside it (one memory- or 
I/O-heavy test slows down everything running with it).


> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
> 

[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025129#comment-17025129
 ] 

Robert Muir commented on LUCENE-9185:
-

{quote}
As for the patch: it works because you invoke a static method on that class and 
it inherits gradle's environment. A nicer way to do it would be to pass 
arguments like tests.profile.count explicitly to ProfileResults (via args, 
setters or otherwise) preparing them on gradle side.
{quote}

I know; I wanted to keep a simple main() method to make it easy to improve it, 
fix bugs, and iterate quickly, e.g.
{noformat}
$ java buildSrc/src/main/java/org/apache/lucene/gradle/ProfileResults.java 
./lucene/analysis/opennlp/build/tmp/tests-cwd/hotspot-pid-133619-id-1-2020_01_28_06_11_03.jfr
 
./lucene/analysis/opennlp/build/tmp/tests-cwd/hotspot-pid-133548-id-1-2020_01_28_06_11_02.jfr
PROFILE SUMMARY from 306 samples
  tests.profile.count=10
  tests.profile.stacksize=1
  tests.profile.linenumbers=false
PERCENT SAMPLES STACK
13.73%  42  java.util.zip.Inflater#inflateBytesBytes()
2.94%   9   java.lang.StringLatin1#indexOf()
2.61%   8   java.io.UnixFileSystem#getBooleanAttributes0()
2.29%   7   java.util.DualPivotQuicksort#sort()
1.96%   6   java.lang.StringLatin1#charAt()
1.96%   6   java.io.UnixFileSystem#normalize()
1.63%   5   java.lang.StringLatin1#inflate()
1.31%   4   java.lang.String#startsWith()
1.31%   4   java.lang.ClassLoader#defineClass1()
1.31%   4   java.lang.StringLatin1#compareTo()
{noformat}
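
For anyone curious, the report is just an aggregation over JFR execution samples. A rough 
standalone illustration of the same idea (this is not the ProfileResults code) using the 
jdk.jfr consumer API:
{code:java}
import java.nio.file.Path;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordingFile;

/** Sketch only: count the top stack frame of every jdk.ExecutionSample event
 *  across one or more .jfr recordings passed on the command line. */
public class TopFrames {
  public static void main(String[] args) throws Exception {
    Map<String, Integer> counts = new HashMap<>();
    long samples = 0;
    for (String file : args) {
      for (RecordedEvent event : RecordingFile.readAllEvents(Path.of(file))) {
        if (!"jdk.ExecutionSample".equals(event.getEventType().getName())) {
          continue;
        }
        if (event.getStackTrace() == null || event.getStackTrace().getFrames().isEmpty()) {
          continue;
        }
        RecordedFrame top = event.getStackTrace().getFrames().get(0);
        String key = top.getMethod().getType().getName() + "#" + top.getMethod().getName() + "()";
        counts.merge(key, 1, Integer::sum);
        samples++;
      }
    }
    if (samples == 0) {
      System.out.println("no execution samples found");
      return;
    }
    final long total = samples;
    counts.entrySet().stream()
        .sorted(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder()))
        .limit(10)
        .forEach(e -> System.out.printf("%.2f%%\t%d\t%s%n",
            100.0 * e.getValue() / total, e.getValue(), e.getKey()));
  }
}
{code}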

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>  

[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025142#comment-17025142
 ] 

Robert Muir commented on LUCENE-9185:
-

{quote}
All of them make sense but you're killing me...

It's also worth noting that the "slowest tests" list depends on the level of 
parallelism and what other tests ran in the background alongside it (one memory- or 
I/O-heavy test slows down everything running with it).
{quote}

I know, but it's all the rudimentary "profiling" we have at the moment. Trying 
to change that!

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.threa

[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc skeleton

2020-01-28 Thread GitBox
ErickErickson commented on a change in pull request #1218: LUCENE-9134: Javacc 
skeleton
URL: https://github.com/apache/lucene-solr/pull/1218#discussion_r371828729
 
 

 ##
 File path: gradle/generation/javacc.gradle
 ##
 @@ -0,0 +1,102 @@
+// Add a top-level pseudo-task to which we will attach individual regenerate tasks.
+import static groovy.io.FileType.*
+
+configure(rootProject) {
+  configurations {
+    javacc
+  }
+
+  dependencies {
+    javacc "net.java.dev.javacc:javacc:${scriptDepVersions['javacc']}"
+  }
+
+  task javacc() {
+    description "Regenerate sources for corresponding javacc grammar files."
+    group "generation"
+
+    dependsOn ":lucene:queryparser:javaccParserClassic"
+    dependsOn ":lucene:queryparser:javaccParserSurround"
+    dependsOn ":lucene:queryparser:javaccParserFlexible"
+  }
+}
+
+// We always regenerate, no need to declare outputs.
+class JavaCCTask extends DefaultTask {
+  @Input
+  File javaccFile
+
+  JavaCCTask() {
+    dependsOn(project.rootProject.configurations.javacc)
+  }
+
+  @TaskAction
+  def generate() {
+    if (!javaccFile || !javaccFile.exists()) {
+      throw new RuntimeException("JavaCC input file does not exist: ${javaccFile}")
+    }
+    // Remove old files so we can regenerate them
+    def parentDir = javaccFile.parentFile
+    parentDir.eachFileMatch FILES, ~/.*\.java/, { file ->
+      if (file.text.contains("Generated By:JavaCC")) {
+        file.delete()
+      }
+    }
+    logger.lifecycle("Regenerating JavaCC:\n  from: ${javaccFile}\nto: ${parentDir}")
+
+    project.javaexec {
+      classpath {
+        project.rootProject.configurations.javacc
+      }
+      main = "org.javacc.parser.Main"
+      args += "-OUTPUT_DIRECTORY=${parentDir}"
+      args += [javaccFile]
+    }
+  }
+}
+
+
+configure(project(":lucene:queryparser")) {
+  task javaccParserClassic(type: JavaCCTask) {
+    description "Regenerate classic query parser from java CC.java"
+    group "generation"
+
+    javaccFile = file('src/java/org/apache/lucene/queryparser/classic/QueryParser.jj')
+    def parent = javaccFile.parentFile.toString() // I'll need this later.
+
+    doLast {
+      // There'll be a lot of cleanup in here to get precommits and builds to pass, but as long as we don't
 
 Review comment:
   That _should_ be the end product already; that's one of the reasons I spent 
so much time on the ant version, and why all those files were changed when I 
committed. At least I _think_ I got them all; at least that's what I remember 
doing... That said, I'll try not to go off into the weeds.
   
   Now that I've got the structure right, I'll see if I can get this to happen. 
It shouldn't actually be that much work.
   
   Oh, and ignore PR 1219: I had a bad title for this PR and it didn't link. 
When I changed the title of this one it took a while to show up and I got 
impatient. 1219 and 1218 are identical.
   
   Finally, many thanks for your coaching (well, ok, outright fixing things)!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-9185:

Attachment: LUCENE-9185.patch

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch, LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor#lambda$execut

[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025193#comment-17025193
 ] 

Robert Muir commented on LUCENE-9185:
-

[~dweiss] I tried to fold in your feedback, can you take another look?

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch, LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   

[jira] [Created] (SOLR-14225) Upgrade jaegertracing

2020-01-28 Thread Jira
Jan Høydahl created SOLR-14225:
--

 Summary: Upgrade jaegertracing
 Key: SOLR-14225
 URL: https://issues.apache.org/jira/browse/SOLR-14225
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Jan Høydahl


Upgrade jaegertracing from 0.35.5 to 1.1.0. This will also give us a newer 
libthrift, which is more stable and more secure.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-9185:

Attachment: LUCENE-9185.patch

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch, LUCENE-9185.patch, LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExe

[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025220#comment-17025220
 ] 

Robert Muir commented on LUCENE-9185:
-

I added a bunch of crazy abstractions and constants to the java code so that 
the gradle code looks a little prettier. I realize you really hate how I did it 
before, but I want to keep the simple main method, and I don't think gradle's 
bad decisions should get in the way of that.

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch, LUCENE-9185.patch, LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>

[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025233#comment-17025233
 ] 

Dawid Weiss commented on LUCENE-9185:
-

It looks great, thanks Robert. 

I'd love to have some kind of task to display all these build options at some 
point. Currently this is done just for randomization options (try gradlew 
testOpts -p lucene/core) but I'm sure it could be pulled from other parts of 
the build and displayed consistently. For now it can stay as it is.

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch, LUCENE-9185.patch, LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.stra

[jira] [Commented] (SOLR-14225) Upgrade jaegertracing

2020-01-28 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025234#comment-17025234
 ] 

Dawid Weiss commented on SOLR-14225:


It'd be great if the patch included corresponding gradle updates, Jan (if you 
have problems with something, let me know).

> Upgrade jaegertracing
> -
>
> Key: SOLR-14225
> URL: https://issues.apache.org/jira/browse/SOLR-14225
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jan Høydahl
>Priority: Major
>
> Upgrade jaegertracing from 0.35.5 to 1.1.0. This will also give us a newer 
> libthrift which is more stable and secure



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025238#comment-17025238
 ] 

Robert Muir commented on LUCENE-9185:
-

I agree, it would be nice. For now I added basic usage to the help and the 
reporter itself prints out the values of any fancy options.

Just trying to make it as easy as possible to keep the slow tests at bay...

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch, LUCENE-9185.patch, LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYou

[jira] [Created] (LUCENE-9188) Add jacoco code coverage support to gradle build

2020-01-28 Thread Robert Muir (Jira)
Robert Muir created LUCENE-9188:
---

 Summary: Add jacoco code coverage support to gradle build
 Key: LUCENE-9188
 URL: https://issues.apache.org/jira/browse/LUCENE-9188
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir


Seems to be missing. I looked into it a little; all the documented ways of
using the jacoco plugin seem to involve black magic if you are using the "java"
plugin, but we are using "javaLibrary", so I wasn't able to hold it right.

This one should work very well: it has low overhead and should work fine
running tests in parallel, since it supports merging of coverage data files
(that's how it works in the ant build).






[GitHub] [lucene-solr] alessandrobenedetti commented on issue #357: [SOLR-12238] Synonym Queries boost by payload

2020-01-28 Thread GitBox
alessandrobenedetti commented on issue #357: [SOLR-12238] Synonym Queries boost 
by payload 
URL: https://github.com/apache/lucene-solr/pull/357#issuecomment-579326000
 
 
   No strong opinion on that; it was actually the first time I used the
AttributeSource, so I am happy to switch to TokenStream if it is more memory
efficient.
   The change shouldn't be too heavy.
   I will just wait for confirmation, and once we are all aligned I'll proceed
with the implementation.
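   For reference, this is roughly the TokenStream consumption loop I have in mind
(a minimal sketch only, not the actual patch; it assumes float-encoded payloads,
and the class and method names here are illustrative):
{code:java}
import java.io.IOException;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;
import org.apache.lucene.util.BytesRef;

public class SynonymPayloadSketch {
  // Iterate the TokenStream directly and read each token's payload as a boost,
  // instead of copying token state into an AttributeSource.
  static void collectBoosts(TokenStream stream) throws IOException {
    CharTermAttribute termAtt = stream.addAttribute(CharTermAttribute.class);
    PayloadAttribute payloadAtt = stream.addAttribute(PayloadAttribute.class);
    stream.reset();
    while (stream.incrementToken()) {
      BytesRef payload = payloadAtt.getPayload();
      // assumption: payloads were indexed as encoded floats
      float boost = payload == null
          ? 1.0f
          : PayloadHelper.decodeFloat(payload.bytes, payload.offset);
      System.out.println(termAtt.toString() + " -> " + boost);
    }
    stream.end();
    stream.close();
  }
}
{code}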





[jira] [Commented] (LUCENE-9004) Approximate nearest vector search

2020-01-28 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025251#comment-17025251
 ] 

Michael Sokolov commented on LUCENE-9004:
-

> Is there any possible to merge LUCENE-9136 with this issue?

This is already gigantic - what would be the benefit of merging?

> Approximate nearest vector search
> -
>
> Key: LUCENE-9004
> URL: https://issues.apache.org/jira/browse/LUCENE-9004
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Michael Sokolov
>Priority: Major
> Attachments: hnsw_layered_graph.png
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> "Semantic" search based on machine-learned vector "embeddings" representing 
> terms, queries and documents is becoming a must-have feature for a modern 
> search engine. SOLR-12890 is exploring various approaches to this, including 
> providing vector-based scoring functions. This is a spinoff issue from that.
> The idea here is to explore approximate nearest-neighbor search. Researchers 
> have found an approach based on navigating a graph that partially encodes the 
> nearest neighbor relation at multiple scales can provide accuracy > 95% (as 
> compared to exact nearest neighbor calculations) at a reasonable cost. This 
> issue will explore implementing HNSW (hierarchical navigable small-world) 
> graphs for the purpose of approximate nearest vector search (often referred 
> to as KNN or k-nearest-neighbor search).
> At a high level the way this algorithm works is this. First assume you have a 
> graph that has a partial encoding of the nearest neighbor relation, with some 
> short and some long-distance links. If this graph is built in the right way 
> (has the hierarchical navigable small world property), then you can 
> efficiently traverse it to find nearest neighbors (approximately) in log N 
> time where N is the number of nodes in the graph. I believe this idea was 
> pioneered in  [1]. The great insight in that paper is that if you use the 
> graph search algorithm to find the K nearest neighbors of a new document 
> while indexing, and then link those neighbors (undirectedly, ie both ways) to 
> the new document, then the graph that emerges will have the desired 
> properties.
> The implementation I propose for Lucene is as follows. We need two new data 
> structures to encode the vectors and the graph. We can encode vectors using a 
> light wrapper around {{BinaryDocValues}} (we also want to encode the vector 
> dimension and have efficient conversion from bytes to floats). For the graph 
> we can use {{SortedNumericDocValues}} where the values we encode are the 
> docids of the related documents. Encoding the interdocument relations using 
> docids directly will make it relatively fast to traverse the graph since we 
> won't need to lookup through an id-field indirection. This choice limits us 
> to building a graph-per-segment since it would be impractical to maintain a 
> global graph for the whole index in the face of segment merges. However 
> graph-per-segment is a very natural at search time - we can traverse each 
> segments' graph independently and merge results as we do today for term-based 
> search.
> At index time, however, merging graphs is somewhat challenging. While 
> indexing we build a graph incrementally, performing searches to construct 
> links among neighbors. When merging segments we must construct a new graph 
> containing elements of all the merged segments. Ideally we would somehow 
> preserve the work done when building the initial graphs, but at least as a 
> start I'd propose we construct a new graph from scratch when merging. The 
> process is going to be  limited, at least initially, to graphs that can fit 
> in RAM since we require random access to the entire graph while constructing 
> it: In order to add links bidirectionally we must continually update existing 
> documents.
> I think we want to express this API to users as a single joint 
> {{KnnGraphField}} abstraction that joins together the vectors and the graph 
> as a single joint field type. Mostly it just looks like a vector-valued 
> field, but has this graph attached to it.
> I'll push a branch with my POC and would love to hear comments. It has many 
> nocommits, basic design is not really set, there is no Query implementation 
> and no integration iwth IndexSearcher, but it does work by some measure using 
> a standalone test class. I've tested with uniform random vectors and on my 
> laptop indexed 10K documents in around 10 seconds and searched them at 95% 
> recall (compared with exact nearest-neighbor baseline) at around 250 QPS. I 
> haven't made any attempt to use multithreaded search for this, but it is 
> amenable to per-segment concurrency.
> [1] 
> [https://www.semanticscholar.org/
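As a rough illustration of the greedy traversal described above, here is a
self-contained, single-layer sketch over an in-memory adjacency list
(illustrative only, not the API on the branch):
{code:java}
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

/** Illustrative greedy best-first search over a prebuilt neighbor graph (one layer). */
public class GreedyKnnSketch {

  /** graph.get(i) = node ids adjacent to node i; vectors[i] = vector of node i. */
  static int[] search(List<int[]> graph, float[][] vectors, float[] query, int k, int entryPoint) {
    Comparator<Integer> byDistance = Comparator.comparingDouble(n -> l2(vectors[n], query));
    PriorityQueue<Integer> candidates = new PriorityQueue<>(byDistance);         // closest first
    PriorityQueue<Integer> results = new PriorityQueue<>(byDistance.reversed()); // farthest first
    Set<Integer> visited = new HashSet<>();
    candidates.add(entryPoint);
    results.add(entryPoint);
    visited.add(entryPoint);
    while (!candidates.isEmpty()) {
      int c = candidates.poll();
      // stop when the best remaining candidate is farther than the worst kept result
      if (results.size() >= k && l2(vectors[c], query) > l2(vectors[results.peek()], query)) {
        break;
      }
      for (int neighbor : graph.get(c)) {
        if (visited.add(neighbor)) {
          candidates.add(neighbor);
          results.add(neighbor);
          if (results.size() > k) {
            results.poll(); // drop the farthest of the kept results
          }
        }
      }
    }
    return results.stream().mapToInt(Integer::intValue).toArray();
  }

  static float l2(float[] a, float[] b) {
    float sum = 0;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  public static void main(String[] args) {
    float[][] vectors = {{0f, 0f}, {1f, 0f}, {0f, 1f}, {5f, 5f}};
    List<int[]> graph = Arrays.asList(new int[]{1, 2}, new int[]{0, 3}, new int[]{0, 3}, new int[]{1, 2});
    System.out.println(Arrays.toString(search(graph, vectors, new float[]{0.1f, 0.1f}, 2, 3)));
  }
}
{code}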

[jira] [Created] (LUCENE-9189) TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes

2020-01-28 Thread Robert Muir (Jira)
Robert Muir created LUCENE-9189:
---

 Summary: TestIndexWriterDelete.testDeletesOnDiskFull can run for 
minutes
 Key: LUCENE-9189
 URL: https://issues.apache.org/jira/browse/LUCENE-9189
 Project: Lucene - Core
  Issue Type: Task
Reporter: Robert Muir


I thought it was just the testUpdatesOnDiskFull, but looks like this one needs 
to be nightly too.

Should look more into the test, but I know something causes it to make such an
insane number of files that sorting them becomes a bottleneck.

I guess also related is that it would be great if MockDirectoryWrapper's disk
full check didn't trigger a sort of the files (via listAll): it does this check
on essentially every I/O; it would be nice for it to be less absurd. Maybe instead
the test could check for disk full not on every I/O but on some random sample of them?
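Something with this shape is what I mean by a sampled check (a rough sketch with
made-up names, not the real MockIndexOutputWrapper code):
{code:java}
import java.util.Random;

/** Illustrative only: run the expensive disk-full check on a sample of writes, not all of them. */
public class SampledDiskFullCheckSketch {
  private final Random random;
  private final Runnable expensiveCheck; // stands in for the real listAll()+sum logic

  SampledDiskFullCheckSketch(Random random, Runnable expensiveCheck) {
    this.random = random;
    this.expensiveCheck = expensiveCheck;
  }

  /** Called from writeByte/writeBytes: always check big writes, sample ~1% of the rest. */
  void maybeCheck(int len) {
    if (len > 8192 || random.nextInt(100) == 0) {
      expensiveCheck.run();
    }
  }

  public static void main(String[] args) {
    SampledDiskFullCheckSketch sketch =
        new SampledDiskFullCheckSketch(new Random(42), () -> System.out.println("disk full check"));
    for (int i = 0; i < 1000; i++) {
      sketch.maybeCheck(1); // most calls skip the expensive check entirely
    }
  }
}
{code}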

Temporarily let's make it nightly...

{noformat}
PROFILE SUMMARY from 182501 samples
  tests.profile.count=10
  tests.profile.stacksize=1
  tests.profile.linenumbers=false
PERCENT SAMPLES STACK
15.89%  28995   java.lang.StringLatin1#compareTo()
6.61%   12069   java.util.TimSort#mergeHi()
5.96%   10878   java.util.TimSort#binarySort()
3.41%   6231java.util.concurrent.ConcurrentHashMap#tabAt()
2.98%   5433java.util.Comparators$NaturalOrderComparator#compare()
2.12%   3876org.apache.lucene.store.DataOutput#copyBytes()
2.03%   3712java.lang.String#compareTo()
1.84%   3350java.util.concurrent.ConcurrentHashMap#get()
1.83%   3337java.util.TimSort#mergeLo()
1.67%   3047java.util.ArrayList#add()
{noformat}

All the file sorting is called from stacks like this, so it's literally
happening on every writeByte() and so on:

{noformat}
0.73%   1329java.util.TimSort#binarySort()
  at java.util.TimSort#sort()
  at java.util.Arrays#sort()
  at java.util.ArrayList#sort()
  at java.util.stream.SortedOps$RefSortingSink#end()
  at java.util.stream.AbstractPipeline#copyInto()
  at java.util.stream.AbstractPipeline#wrapAndCopyInto()
  at java.util.stream.AbstractPipeline#evaluate()
  at 
java.util.stream.AbstractPipeline#evaluateToArrayNode()
  at java.util.stream.ReferencePipeline#toArray()
  at 
org.apache.lucene.store.ByteBuffersDirectory#listAll()
  at 
org.apache.lucene.store.MockDirectoryWrapper#sizeInBytes()
  at 
org.apache.lucene.store.MockIndexOutputWrapper#checkDiskFull()
  at 
org.apache.lucene.store.MockIndexOutputWrapper#writeBytes()
  at 
org.apache.lucene.store.MockIndexOutputWrapper#writeByte()
  at org.apache.lucene.store.DataOutput#writeInt()
  at org.apache.lucene.codecs.CodecUtil#writeFooter()
  at 
org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat#writeLiveDocs()
  at 
org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat#writeLiveDocs()
  at 
org.apache.lucene.index.PendingDeletes#writeLiveDocs()
{noformat}






[jira] [Commented] (SOLR-13756) ivy cannot download org.restlet.ext.servlet jar

2020-01-28 Thread Zsolt Gyulavari (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025259#comment-17025259
 ] 

Zsolt Gyulavari commented on SOLR-13756:


I've rebased and addressed the gradle build as well; however, I think the
cloudera repo is no longer needed, except perhaps for backup purposes. Otherwise
we can remove it altogether. What do you think?

> ivy cannot download org.restlet.ext.servlet jar
> ---
>
> Key: SOLR-13756
> URL: https://issues.apache.org/jira/browse/SOLR-13756
> Project: Solr
>  Issue Type: Bug
>Reporter: Chongchen Chen
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> I checkout the project and run `ant idea`, it will try to download jars. But  
> https://repo1.maven.org/maven2/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar
>  will return 404 now.  
> [ivy:retrieve] public: tried
> [ivy:retrieve]  
> https://repo1.maven.org/maven2/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar
> [ivy:retrieve]::
> [ivy:retrieve]::  FAILED DOWNLOADS::
> [ivy:retrieve]:: ^ see resolution messages for details  ^ ::
> [ivy:retrieve]::
> [ivy:retrieve]:: 
> org.restlet.jee#org.restlet;2.3.0!org.restlet.jar
> [ivy:retrieve]:: 
> org.restlet.jee#org.restlet.ext.servlet;2.3.0!org.restlet.ext.servlet.jar
> [ivy:retrieve]::






[jira] [Resolved] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-9185.
-
Fix Version/s: master (9.0)
   Resolution: Fixed

> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9185.patch, LUCENE-9185.patch, LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
>

[jira] [Commented] (LUCENE-9185) add "tests.profile" to gradle build to aid fixing slow tests

2020-01-28 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025263#comment-17025263
 ] 

ASF subversion and git services commented on LUCENE-9185:
-

Commit e504798a44e5f1577d87ef3a43d9d1e3a859d68a in lucene-solr's branch 
refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e504798 ]

LUCENE-9185: add "tests.profile" to gradle build to aid fixing slow tests

Run test(s) with -Ptests.profile=true to print a histogram at the end of
the build.


> add "tests.profile" to gradle build to aid fixing slow tests
> 
>
> Key: LUCENE-9185
> URL: https://issues.apache.org/jira/browse/LUCENE-9185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9185.patch, LUCENE-9185.patch, LUCENE-9185.patch
>
>
> It is kind of a hassle to profile slow tests to fix the bottlenecks
> The idea here is to make it dead easy to profile (just) the tests, capturing 
> samples at a very low granularity, reducing noise as much as possible (e.g. 
> not profiling entire gradle build or anything) and print a simple report for 
> quick iterating.
> Here's a prototype of what I hacked together:
> All of lucene core: {{./gradlew -p lucene/core test -Dtests.profile=true}}
> {noformat}
> ...
> PROFILE SUMMARY from 122464 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT SAMPLES STACK
> 2.59%   3170
> org.apache.lucene.util.compress.LZ4$HighCompressionHashTable#assertReset()
> 2.26%   2762java.util.Arrays#fill()
> 1.59%   1953com.carrotsearch.randomizedtesting.RandomizedContext#context()
> 1.24%   1523java.util.Random#nextInt()
> 1.19%   1456java.lang.StringUTF16#compress()
> 1.08%   1319java.lang.StringLatin1#inflate()
> 1.00%   1228java.lang.Integer#getChars()
> 0.99%   1214java.util.Arrays#compareUnsigned()
> 0.96%   1179java.util.zip.Inflater#inflateBytesBytes()
> 0.91%   1114java.util.concurrent.atomic.AtomicLong#compareAndSet()
> BUILD SUCCESSFUL in 3m 59s
> {noformat}
> If you look at this LZ4 assertReset method, you can see its indeed way too 
> expensive, checking 64K items every time.
> To dig deeper into potential problems you can pass additional parameters (all 
> of them used here for demonstration):
> {{./gradlew -p solr/core test --tests TestLRUStatsCache -Dtests.profile=true 
> -Dtests.profile.count=8 -Dtests.profile.stacksize=20 
> -Dtests.profile.linenumbers=true}}
> This clearly finds SOLR-14223 (expensive RSA key generation in CryptoKeys) ...
> {noformat}
> ...
> PROFILE SUMMARY from 21355 samples
>   tests.profile.count=8
>   tests.profile.stacksize=20
>   tests.profile.linenumbers=true
> PERCENT SAMPLES STACK
> 26.30%  5617sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#tryProduce():171
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produce():135
>   at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$235.1914126144#run():(Interpreted
>  code)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool#runJob():806
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner#run():938
>   at java.lang.Thread#run():830
> 16.19%  3458sun.nio.ch.EPoll#wait():(Native code)
>   at sun.nio.ch.EPollSelectorImpl#doSelect():120
>   at sun.nio.ch.SelectorImpl#lockAndDoSelect():124
>   at sun.nio.ch.SelectorImpl#select():141
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#select():472
>   at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer#produce():409
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#produceTask():360
>   at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill#doProduce():184
>   

[jira] [Commented] (LUCENE-9189) TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025270#comment-17025270
 ] 

Robert Muir commented on LUCENE-9189:
-

I'm guessing there is something such as a copyBytes that goes one byte at a
time, or similar, causing it to be truly pathological.

> TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
> ---
>
> Key: LUCENE-9189
> URL: https://issues.apache.org/jira/browse/LUCENE-9189
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
>
> I thought it was just the testUpdatesOnDiskFull, but looks like this one 
> needs to be nightly too.
> Should look more into the test, but I know something causes it to make such 
> an insane amount of files, that sorting them becomes a bottleneck.
> I guess also related is that it would be great if MockDirectoryWrapper's disk 
> full check didn't trigger a sort of the files (via listAll): it does this 
> check on like every i/o, would be nice for it to be less absurd. Maybe 
> instead the test could check for disk full on not every i/o but some random 
> sample of them?
> Temporarily lets make it nightly...
> {noformat}
> PROFILE SUMMARY from 182501 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT   SAMPLES STACK
> 15.89%28995   java.lang.StringLatin1#compareTo()
> 6.61% 12069   java.util.TimSort#mergeHi()
> 5.96% 10878   java.util.TimSort#binarySort()
> 3.41% 6231java.util.concurrent.ConcurrentHashMap#tabAt()
> 2.98% 5433java.util.Comparators$NaturalOrderComparator#compare()
> 2.12% 3876org.apache.lucene.store.DataOutput#copyBytes()
> 2.03% 3712java.lang.String#compareTo()
> 1.84% 3350java.util.concurrent.ConcurrentHashMap#get()
> 1.83% 3337java.util.TimSort#mergeLo()
> 1.67% 3047java.util.ArrayList#add()
> {noformat}
> All the file sorting is called from stacks like this, so its literally 
> happening every writeByte() and so on
> {noformat}
> 0.73% 1329java.util.TimSort#binarySort()
> at java.util.TimSort#sort()
> at java.util.Arrays#sort()
> at java.util.ArrayList#sort()
> at java.util.stream.SortedOps$RefSortingSink#end()
> at java.util.stream.AbstractPipeline#copyInto()
> at java.util.stream.AbstractPipeline#wrapAndCopyInto()
> at java.util.stream.AbstractPipeline#evaluate()
> at 
> java.util.stream.AbstractPipeline#evaluateToArrayNode()
> at java.util.stream.ReferencePipeline#toArray()
> at 
> org.apache.lucene.store.ByteBuffersDirectory#listAll()
> at 
> org.apache.lucene.store.MockDirectoryWrapper#sizeInBytes()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#checkDiskFull()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#writeBytes()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#writeByte()
> at org.apache.lucene.store.DataOutput#writeInt()
> at org.apache.lucene.codecs.CodecUtil#writeFooter()
> at 
> org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat#writeLiveDocs()
> at 
> org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat#writeLiveDocs()
> at 
> org.apache.lucene.index.PendingDeletes#writeLiveDocs()
> {noformat}






[jira] [Created] (LUCENE-9190) add dedicated test to assert internals of LZ4 hashtable

2020-01-28 Thread Robert Muir (Jira)
Robert Muir created LUCENE-9190:
---

 Summary: add dedicated test to assert internals of LZ4 hashtable
 Key: LUCENE-9190
 URL: https://issues.apache.org/jira/browse/LUCENE-9190
 Project: Lucene - Core
  Issue Type: Task
Reporter: Robert Muir


This assert was called all the time by all tests, causing a bottleneck. I
disabled it in LUCENE-9187, but it would be nice to add a subclass or
package-private method or something to still test it (without taking up tons of
CPU).
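The pattern I have in mind is roughly the following sketch; the names are
invented here and are not the real LZ4.HighCompressionHashTable internals:
{code:java}
import java.util.Arrays;

/** Sketch of the idea: keep the 64K invariant check, but run it once from a dedicated
 *  test instead of asserting it on every use. Names are made up, not the real internals. */
public class DedicatedInvariantTestSketch {

  static class Table {
    final int[] slots = new int[1 << 16];

    void reset() {
      Arrays.fill(slots, -1);
    }

    // package-private so a dedicated unit test can call it directly
    boolean isReset() {
      for (int v : slots) {
        if (v != -1) {
          return false;
        }
      }
      return true;
    }
  }

  public static void main(String[] args) {
    Table t = new Table();
    t.reset();
    // the expensive 64K scan now runs once, here, rather than inside every compress call
    if (!t.isReset()) {
      throw new AssertionError("table not reset");
    }
    System.out.println("invariant holds");
  }
}
{code}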






[jira] [Commented] (LUCENE-9187) remove too-expensive assert from LZ4 HighCompressionHashTable

2020-01-28 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025276#comment-17025276
 ] 

ASF subversion and git services commented on LUCENE-9187:
-

Commit 4350efa932a4c6aaad1943857c935bafce98fe56 in lucene-solr's branch 
refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4350efa ]

LUCENE-9187: remove too-expensive assert from LZ4 HighCompressionHashTable


> remove too-expensive assert from LZ4 HighCompressionHashTable
> -
>
> Key: LUCENE-9187
> URL: https://issues.apache.org/jira/browse/LUCENE-9187
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9187.patch
>
>
> This is the slowest method in the lucene tests. See LUCENE-9185 for what I 
> mean.
> If you look at it, its checking 64k values every time the assert is called.






[jira] [Resolved] (LUCENE-9187) remove too-expensive assert from LZ4 HighCompressionHashTable

2020-01-28 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-9187.
-
Fix Version/s: master (9.0)
   Resolution: Fixed

I opened LUCENE-9190 as a followup for the dedicated test idea so we don't lose 
it.

> remove too-expensive assert from LZ4 HighCompressionHashTable
> -
>
> Key: LUCENE-9187
> URL: https://issues.apache.org/jira/browse/LUCENE-9187
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9187.patch
>
>
> This is the slowest method in the lucene tests. See LUCENE-9185 for what I 
> mean.
> If you look at it, its checking 64k values every time the assert is called.






[jira] [Created] (LUCENE-9191) Fix linefiledocs compression or replace in tests

2020-01-28 Thread Robert Muir (Jira)
Robert Muir created LUCENE-9191:
---

 Summary: Fix linefiledocs compression or replace in tests
 Key: LUCENE-9191
 URL: https://issues.apache.org/jira/browse/LUCENE-9191
 Project: Lucene - Core
  Issue Type: Task
Reporter: Robert Muir


LineFileDocs(random) is very slow, even to open. It does a very slow "random 
skip" through a gzip compressed file.

For the analyzers tests, in LUCENE-9186 I simply removed its usage, since 
TestUtil.randomAnalysisString is superior, and fast. But we should address 
other tests using it, since LineFileDocs(random) is slow!

I think it is also the case that every Lucene test has probably tested every
LineFileDocs line many times now, whereas randomAnalysisString will invent new
ones.

Alternatively, we could "fix" LineFileDocs(random), e.g. special compression 
options (in blocks)... deflate supports such stuff. But it would make it even 
hairier than it is now.
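To make the "compress in blocks" idea concrete, here is a rough standalone sketch
(not LineFileDocs itself; the file names are placeholders): write independently
deflated blocks plus their offsets, so a reader can seek to a random block without
inflating everything before it.
{code:java}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DeflaterOutputStream;

/** Illustrative only: lines stored as independently-deflated ~64KB blocks plus a block index. */
public class BlockCompressedLinesSketch {
  public static void main(String[] args) throws IOException {
    List<String> lines = Files.readAllLines(Paths.get("lines.txt")); // placeholder input
    List<Long> blockOffsets = new ArrayList<>();
    try (RandomAccessFile out = new RandomAccessFile("lines.blocks", "rw")) {
      StringBuilder block = new StringBuilder();
      for (int i = 0; i < lines.size(); i++) {
        block.append(lines.get(i)).append('\n');
        if (block.length() > 64 * 1024 || i == lines.size() - 1) {
          byte[] raw = block.toString().getBytes(StandardCharsets.UTF_8);
          ByteArrayOutputStream bos = new ByteArrayOutputStream();
          try (DeflaterOutputStream dos = new DeflaterOutputStream(bos)) {
            dos.write(raw); // each block is a complete, independent deflate stream
          }
          byte[] compressed = bos.toByteArray();
          blockOffsets.add(out.getFilePointer());
          out.writeInt(compressed.length); // compressed length
          out.writeInt(raw.length);        // uncompressed length
          out.write(compressed);
          block.setLength(0);
        }
      }
    }
    // A "random skip" is now: pick a random entry in blockOffsets, seek there,
    // read the two length ints, and inflate just that one block.
    System.out.println("blocks written: " + blockOffsets.size());
  }
}
{code}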






[GitHub] [lucene-solr] balaji-s opened a new pull request #1221: SOLR-14193 Update tutorial.adoc(line no:664) so that command executes…

2020-01-28 Thread GitBox
balaji-s opened a new pull request #1221: SOLR-14193 Update tutorial.adoc(line 
no:664) so that command executes…
URL: https://github.com/apache/lucene-solr/pull/1221
 
 
   … in windows enviroment
   
   
   
   
   # Description
   
   Please provide a short description of the changes you're making with this 
pull request.
   
   # Solution
   
   Please provide a short description of the approach taken to implement your 
solution.
   
   # Tests
   
   Please describe the tests you've developed or run to confirm this patch 
implements the feature or solves the problem.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [ ] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [ ] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [ ] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [ ] I have developed this patch against the `master` branch.
   - [ ] I have run `ant precommit` and the appropriate test suite.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   





[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-01-28 Thread Gregg Donovan (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025286#comment-17025286
 ] 

Gregg Donovan commented on SOLR-13289:
--

{quote}This feature currently doesn't work in case of faceting(this is 
expected), grouping.{quote}

Will WAND cause faceting to break entirely? Or will the counts for facets just 
be inexact?

{quote}as same minExactHits is shared across shard. so, actual minExactHits is 
shardCount*minExactHits{quote}
Perhaps it would be worth having an additional parameter for a 
perShardExactHits? E.g. if we're requesting the top 1000 hits across 64 shards, 
we'd likely be fine with WAND getting the top, say, 150 per shard.
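For context, at the Lucene level the exactness trade-off is driven by the
collector's totalHitsThreshold; a minimal sketch of that behavior (Lucene 8.x
API; the index path, field and numbers are placeholders):
{code:java}
import java.nio.file.Paths;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.FSDirectory;

public class WandThresholdSketch {
  public static void main(String[] args) throws Exception {
    try (FSDirectory dir = FSDirectory.open(Paths.get("/path/to/index"));
         DirectoryReader reader = DirectoryReader.open(dir)) {
      IndexSearcher searcher = new IndexSearcher(reader);
      Query query = new TermQuery(new Term("title", "solr")); // placeholder field/term
      // Keep the top 10 hits; count hits exactly only up to 1000, after which the
      // scorer may skip non-competitive documents (BlockMax WAND).
      TopScoreDocCollector collector = TopScoreDocCollector.create(10, 1000);
      searcher.search(query, collector);
      TopDocs topDocs = collector.topDocs();
      // If counting stopped early, relation is GREATER_THAN_OR_EQUAL_TO and value is a lower bound.
      System.out.println(topDocs.totalHits.value + " (" + topDocs.totalHits.relation + ")");
    }
  }
}
{code}
Presumably a per-shard knob would just change the threshold each shard passes
down to its collector.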


> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (LUCENE-9186) remove linefiledocs usage from basetokenstreamtestcase

2020-01-28 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025289#comment-17025289
 ] 

ASF subversion and git services commented on LUCENE-9186:
-

Commit 3bcc97c8eb70f4a3a309d4cdab290363b524b0a2 in lucene-solr's branch 
refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3bcc97c ]

LUCENE-9186: remove linefiledocs usage from BaseTokenStreamTestCase


> remove linefiledocs usage from basetokenstreamtestcase
> --
>
> Key: LUCENE-9186
> URL: https://issues.apache.org/jira/browse/LUCENE-9186
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/test
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9186.patch
>
>
> LineFileDocs is slow, even to open. That's because it (very slowly) "skips" 
> to a pseudorandom position into a 5MB gzip stream when you open it.
> There was a time when we didn't have a nice string generator for tests 
> (TestUtil.randomAnalysisString), but now we do. And when it was introduced it 
> found interesting new things that linefiledocs never found.
> This speeds up all the analyzer tests.






[jira] [Resolved] (LUCENE-9186) remove linefiledocs usage from basetokenstreamtestcase

2020-01-28 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-9186.
-
Fix Version/s: master (9.0)
   Resolution: Fixed

I opened LUCENE-9191 as a followup for other tests using LineFileDocs in a 
similar way. But fixing the analyzers tests was an easy win.

> remove linefiledocs usage from basetokenstreamtestcase
> --
>
> Key: LUCENE-9186
> URL: https://issues.apache.org/jira/browse/LUCENE-9186
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/test
>Reporter: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9186.patch
>
>
> LineFileDocs is slow, even to open. That's because it (very slowly) "skips" 
> to a pseudorandom position into a 5MB gzip stream when you open it.
> There was a time when we didn't have a nice string generator for tests 
> (TestUtil.randomAnalysisString), but now we do. And when it was introduced it 
> found interesting new things that linefiledocs never found.
> This speeds up all the analyzer tests.






[jira] [Updated] (SOLR-14193) Update tutorial.adoc(line no:664) so that command executes in windows enviroment

2020-01-28 Thread balaji sundaram (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

balaji sundaram updated SOLR-14193:
---
Attachment: solr-tutorial.adoc
Status: Open  (was: Open)

> Update tutorial.adoc(line no:664) so that command executes in windows 
> enviroment
> 
>
> Key: SOLR-14193
> URL: https://issues.apache.org/jira/browse/SOLR-14193
> Project: Solr
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 8.4
>Reporter: balaji sundaram
>Priority: Minor
> Attachments: solr-tutorial.adoc
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
> {{When executing the following command in windows 10 "java -jar -Dc=films 
> -Dparams=f.genre.split=true&f.directed_by.split=true&f.genre.separator=|&f.directed_by.separator=|
>  -Dauto example\exampledocs\post.jar example\films\*.csv", it throws error "& 
> was unexpected at this time."}}
> Fix: the command should escape "&" and "|" symbol{{}}
> {{}}






[GitHub] [lucene-solr] balaji-s commented on issue #1221: SOLR-14193 Update tutorial.adoc(line no:664) so that command executes…

2020-01-28 Thread GitBox
balaji-s commented on issue #1221: SOLR-14193 Update tutorial.adoc(line no:664) 
so that command executes…
URL: https://github.com/apache/lucene-solr/pull/1221#issuecomment-579353263
 
 
   Updated line no. 664 in solr-tutorial.adoc. Added escape characters for the ^
and | symbols in the Windows environment.





[jira] [Commented] (SOLR-13817) Deprecate and remove legacy SolrCache implementations

2020-01-28 Thread Andy Webb (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025295#comment-17025295
 ] 

Andy Webb commented on SOLR-13817:
--

Could I put in a request that we get to use the final version of CaffeineCache 
in 8.5.0+ before the legacy cache implementations are removed in 9.0.0 please?

Currently 
https://github.com/apache/lucene-solr/commit/b4fe911cc8e4bddff18226bc8c98a2deb735a8fc#diff-fc056ba10fcf92dc69fe32991cdad5f0
 (in master) both updates CaffeineCache.java and removes FastLRUCache etc.

thanks,
Andy

> Deprecate and remove legacy SolrCache implementations
> -
>
> Key: SOLR-13817
> URL: https://issues.apache.org/jira/browse/SOLR-13817
> Project: Solr
>  Issue Type: Improvement
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: SOLR-13817-8x.patch, SOLR-13817-master.patch
>
>
> Now that SOLR-8241 has been committed I propose to deprecate other cache 
> implementations in 8x and remove them altogether from 9.0, in order to reduce 
> confusion and maintenance costs.






[jira] [Commented] (LUCENE-9189) TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025297#comment-17025297
 ] 

Robert Muir commented on LUCENE-9189:
-

There are definitely test bugs here too. MockDirectoryWrapper shouldn't even be
checking disk full here; it wasn't told to do so! So its copyBytes is bad, as
it unconditionally does the expensive disk full check on every invocation (even
if setTrackDiskUsage was never called, as in this test).

So we definitely need to fix it to only check for disk full if the test asked
for it, and then fix tests that want to test disk full to call
.setTrackDiskUsage(true).

I'm looking into it.

> TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
> ---
>
> Key: LUCENE-9189
> URL: https://issues.apache.org/jira/browse/LUCENE-9189
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
>
> I thought it was just the testUpdatesOnDiskFull, but looks like this one 
> needs to be nightly too.
> Should look more into the test, but I know something causes it to make such 
> an insane amount of files, that sorting them becomes a bottleneck.
> I guess also related is that it would be great if MockDirectoryWrapper's disk 
> full check didn't trigger a sort of the files (via listAll): it does this 
> check on like every i/o, would be nice for it to be less absurd. Maybe 
> instead the test could check for disk full on not every i/o but some random 
> sample of them?
> Temporarily lets make it nightly...
> {noformat}
> PROFILE SUMMARY from 182501 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT   SAMPLES STACK
> 15.89%28995   java.lang.StringLatin1#compareTo()
> 6.61% 12069   java.util.TimSort#mergeHi()
> 5.96% 10878   java.util.TimSort#binarySort()
> 3.41% 6231java.util.concurrent.ConcurrentHashMap#tabAt()
> 2.98% 5433java.util.Comparators$NaturalOrderComparator#compare()
> 2.12% 3876org.apache.lucene.store.DataOutput#copyBytes()
> 2.03% 3712java.lang.String#compareTo()
> 1.84% 3350java.util.concurrent.ConcurrentHashMap#get()
> 1.83% 3337java.util.TimSort#mergeLo()
> 1.67% 3047java.util.ArrayList#add()
> {noformat}
> All the file sorting is called from stacks like this, so its literally 
> happening every writeByte() and so on
> {noformat}
> 0.73% 1329java.util.TimSort#binarySort()
> at java.util.TimSort#sort()
> at java.util.Arrays#sort()
> at java.util.ArrayList#sort()
> at java.util.stream.SortedOps$RefSortingSink#end()
> at java.util.stream.AbstractPipeline#copyInto()
> at java.util.stream.AbstractPipeline#wrapAndCopyInto()
> at java.util.stream.AbstractPipeline#evaluate()
> at 
> java.util.stream.AbstractPipeline#evaluateToArrayNode()
> at java.util.stream.ReferencePipeline#toArray()
> at 
> org.apache.lucene.store.ByteBuffersDirectory#listAll()
> at 
> org.apache.lucene.store.MockDirectoryWrapper#sizeInBytes()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#checkDiskFull()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#writeBytes()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#writeByte()
> at org.apache.lucene.store.DataOutput#writeInt()
> at org.apache.lucene.codecs.CodecUtil#writeFooter()
> at 
> org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat#writeLiveDocs()
> at 
> org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat#writeLiveDocs()
> at 
> org.apache.lucene.index.PendingDeletes#writeLiveDocs()
> {noformat}






[jira] [Commented] (LUCENE-9189) TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes

2020-01-28 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025303#comment-17025303
 ] 

Robert Muir commented on LUCENE-9189:
-

OK, I see the issue. It also "tracks" the disk usage if you call setMaxSizeInBytes
(by "track" we mean it recomputes it, calling listAll and then summing fileLength
for every file... on every writeByte etc).

So it only impacts these disk-full tests. The tracking should get more
efficient, but the scope is limited and I don't want to wrestle with this logic
right now. Going with Nightly until we fix the efficiency of this thing.
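For whoever picks that up, the obvious direction is to keep a running total
instead of recomputing it. An illustrative sketch only, not MockDirectoryWrapper's
actual fields:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: maintain a running total instead of listAll()+fileLength() per write. */
public class IncrementalSizeTrackerSketch {
  private final Map<String, Long> fileLengths = new ConcurrentHashMap<>();
  private final AtomicLong totalBytes = new AtomicLong();

  void onWrite(String file, long newLength) {
    Long previous = fileLengths.put(file, newLength);
    totalBytes.addAndGet(newLength - (previous == null ? 0L : previous));
  }

  void onDelete(String file) {
    Long previous = fileLengths.remove(file);
    if (previous != null) {
      totalBytes.addAndGet(-previous);
    }
  }

  long sizeInBytes() {
    return totalBytes.get(); // O(1): no directory listing, no sort, no per-file length calls
  }

  public static void main(String[] args) {
    IncrementalSizeTrackerSketch tracker = new IncrementalSizeTrackerSketch();
    tracker.onWrite("_0.cfs", 1024);
    tracker.onWrite("_0.cfs", 2048); // file grew
    tracker.onDelete("_0.cfs");
    System.out.println(tracker.sizeInBytes()); // 0
  }
}
{code}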

> TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
> ---
>
> Key: LUCENE-9189
> URL: https://issues.apache.org/jira/browse/LUCENE-9189
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
>
> I thought it was just the testUpdatesOnDiskFull, but looks like this one 
> needs to be nightly too.
> Should look more into the test, but I know something causes it to make such 
> an insane amount of files, that sorting them becomes a bottleneck.
> I guess also related is that it would be great if MockDirectoryWrapper's disk 
> full check didn't trigger a sort of the files (via listAll): it does this 
> check on like every i/o, would be nice for it to be less absurd. Maybe 
> instead the test could check for disk full on not every i/o but some random 
> sample of them?
> Temporarily lets make it nightly...
> {noformat}
> PROFILE SUMMARY from 182501 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT   SAMPLES STACK
> 15.89%28995   java.lang.StringLatin1#compareTo()
> 6.61% 12069   java.util.TimSort#mergeHi()
> 5.96% 10878   java.util.TimSort#binarySort()
> 3.41% 6231java.util.concurrent.ConcurrentHashMap#tabAt()
> 2.98% 5433java.util.Comparators$NaturalOrderComparator#compare()
> 2.12% 3876org.apache.lucene.store.DataOutput#copyBytes()
> 2.03% 3712java.lang.String#compareTo()
> 1.84% 3350java.util.concurrent.ConcurrentHashMap#get()
> 1.83% 3337java.util.TimSort#mergeLo()
> 1.67% 3047java.util.ArrayList#add()
> {noformat}
> All the file sorting is called from stacks like this, so its literally 
> happening every writeByte() and so on
> {noformat}
> 0.73% 1329java.util.TimSort#binarySort()
> at java.util.TimSort#sort()
> at java.util.Arrays#sort()
> at java.util.ArrayList#sort()
> at java.util.stream.SortedOps$RefSortingSink#end()
> at java.util.stream.AbstractPipeline#copyInto()
> at java.util.stream.AbstractPipeline#wrapAndCopyInto()
> at java.util.stream.AbstractPipeline#evaluate()
> at 
> java.util.stream.AbstractPipeline#evaluateToArrayNode()
> at java.util.stream.ReferencePipeline#toArray()
> at 
> org.apache.lucene.store.ByteBuffersDirectory#listAll()
> at 
> org.apache.lucene.store.MockDirectoryWrapper#sizeInBytes()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#checkDiskFull()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#writeBytes()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#writeByte()
> at org.apache.lucene.store.DataOutput#writeInt()
> at org.apache.lucene.codecs.CodecUtil#writeFooter()
> at 
> org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat#writeLiveDocs()
> at 
> org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat#writeLiveDocs()
> at 
> org.apache.lucene.index.PendingDeletes#writeLiveDocs()
> {noformat}






[jira] [Commented] (LUCENE-9189) TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes

2020-01-28 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025307#comment-17025307
 ] 

ASF subversion and git services commented on LUCENE-9189:
-

Commit 4773574578f089802fe3f36bff6951c4a29a3628 in lucene-solr's branch 
refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4773574 ]

LUCENE-9189: TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes

The issue is that MockDirectoryWrapper's disk full check is horribly
inefficient. On every writeByte/etc, it totally recomputes disk space
across all files. This means it calls listAll() on the underlying
Directory (which sorts all the underlying files), then sums up fileLength()
for each of those files.

This leads to many pathological cases in the disk full tests... but the
number of tests impacted by this is minimal, and the logic is scary.


> TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
> ---
>
> Key: LUCENE-9189
> URL: https://issues.apache.org/jira/browse/LUCENE-9189
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
>
> I thought it was just the testUpdatesOnDiskFull, but looks like this one 
> needs to be nightly too.
> Should look more into the test, but I know something causes it to make such 
> an insane amount of files, that sorting them becomes a bottleneck.
> I guess also related is that it would be great if MockDirectoryWrapper's disk 
> full check didn't trigger a sort of the files (via listAll): it does this 
> check on like every i/o, would be nice for it to be less absurd. Maybe 
> instead the test could check for disk full on not every i/o but some random 
> sample of them?
> Temporarily lets make it nightly...
> {noformat}
> PROFILE SUMMARY from 182501 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT   SAMPLES STACK
> 15.89%28995   java.lang.StringLatin1#compareTo()
> 6.61% 12069   java.util.TimSort#mergeHi()
> 5.96% 10878   java.util.TimSort#binarySort()
> 3.41% 6231java.util.concurrent.ConcurrentHashMap#tabAt()
> 2.98% 5433java.util.Comparators$NaturalOrderComparator#compare()
> 2.12% 3876org.apache.lucene.store.DataOutput#copyBytes()
> 2.03% 3712java.lang.String#compareTo()
> 1.84% 3350java.util.concurrent.ConcurrentHashMap#get()
> 1.83% 3337java.util.TimSort#mergeLo()
> 1.67% 3047java.util.ArrayList#add()
> {noformat}
> All the file sorting is called from stacks like this, so its literally 
> happening every writeByte() and so on
> {noformat}
> 0.73% 1329java.util.TimSort#binarySort()
> at java.util.TimSort#sort()
> at java.util.Arrays#sort()
> at java.util.ArrayList#sort()
> at java.util.stream.SortedOps$RefSortingSink#end()
> at java.util.stream.AbstractPipeline#copyInto()
> at java.util.stream.AbstractPipeline#wrapAndCopyInto()
> at java.util.stream.AbstractPipeline#evaluate()
> at 
> java.util.stream.AbstractPipeline#evaluateToArrayNode()
> at java.util.stream.ReferencePipeline#toArray()
> at 
> org.apache.lucene.store.ByteBuffersDirectory#listAll()
> at 
> org.apache.lucene.store.MockDirectoryWrapper#sizeInBytes()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#checkDiskFull()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#writeBytes()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#writeByte()
> at org.apache.lucene.store.DataOutput#writeInt()
> at org.apache.lucene.codecs.CodecUtil#writeFooter()
> at 
> org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat#writeLiveDocs()
> at 
> org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat#writeLiveDocs()
> at 
> org.apache.lucene.index.PendingDeletes#writeLiveDocs()
> {noformat}






[jira] [Resolved] (LUCENE-9189) TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes

2020-01-28 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-9189.
-
Fix Version/s: master (9.0)
   Resolution: Fixed

As mentioned above, I marked it Nightly for now. I need to go to the beer store if
I'm gonna touch MockDirectoryWrapper...

> TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
> ---
>
> Key: LUCENE-9189
> URL: https://issues.apache.org/jira/browse/LUCENE-9189
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
>
> I thought it was just the testUpdatesOnDiskFull, but looks like this one 
> needs to be nightly too.
> Should look more into the test, but I know something causes it to make such 
> an insane amount of files, that sorting them becomes a bottleneck.
> I guess also related is that it would be great if MockDirectoryWrapper's disk 
> full check didn't trigger a sort of the files (via listAll): it does this 
> check on like every i/o, would be nice for it to be less absurd. Maybe 
> instead the test could check for disk full on not every i/o but some random 
> sample of them?
> Temporarily lets make it nightly...
> {noformat}
> PROFILE SUMMARY from 182501 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT   SAMPLES STACK
> 15.89%28995   java.lang.StringLatin1#compareTo()
> 6.61% 12069   java.util.TimSort#mergeHi()
> 5.96% 10878   java.util.TimSort#binarySort()
> 3.41% 6231java.util.concurrent.ConcurrentHashMap#tabAt()
> 2.98% 5433java.util.Comparators$NaturalOrderComparator#compare()
> 2.12% 3876org.apache.lucene.store.DataOutput#copyBytes()
> 2.03% 3712java.lang.String#compareTo()
> 1.84% 3350java.util.concurrent.ConcurrentHashMap#get()
> 1.83% 3337java.util.TimSort#mergeLo()
> 1.67% 3047java.util.ArrayList#add()
> {noformat}
> All the file sorting is called from stacks like this, so its literally 
> happening every writeByte() and so on
> {noformat}
> 0.73% 1329java.util.TimSort#binarySort()
> at java.util.TimSort#sort()
> at java.util.Arrays#sort()
> at java.util.ArrayList#sort()
> at java.util.stream.SortedOps$RefSortingSink#end()
> at java.util.stream.AbstractPipeline#copyInto()
> at java.util.stream.AbstractPipeline#wrapAndCopyInto()
> at java.util.stream.AbstractPipeline#evaluate()
> at 
> java.util.stream.AbstractPipeline#evaluateToArrayNode()
> at java.util.stream.ReferencePipeline#toArray()
> at 
> org.apache.lucene.store.ByteBuffersDirectory#listAll()
> at 
> org.apache.lucene.store.MockDirectoryWrapper#sizeInBytes()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#checkDiskFull()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#writeBytes()
> at 
> org.apache.lucene.store.MockIndexOutputWrapper#writeByte()
> at org.apache.lucene.store.DataOutput#writeInt()
> at org.apache.lucene.codecs.CodecUtil#writeFooter()
> at 
> org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat#writeLiveDocs()
> at 
> org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat#writeLiveDocs()
> at 
> org.apache.lucene.index.PendingDeletes#writeLiveDocs()
> {noformat}






[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?

2020-01-28 Thread Michael Froh (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025313#comment-17025313
 ] 

Michael Froh commented on LUCENE-8962:
--

Thanks [~msoko...@gmail.com] for the feedback on the PR! I've updated it to 
incorporate your suggestions.

> Can we merge small segments during refresh, for faster searching?
> -
>
> Key: LUCENE-8962
> URL: https://issues.apache.org/jira/browse/LUCENE-8962
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Michael McCandless
>Priority: Major
> Attachments: LUCENE-8962_demo.png
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> With near-real-time search we ask {{IndexWriter}} to write all in-memory 
> segments to disk and open an {{IndexReader}} to search them, and this is 
> typically a quick operation.
> However, when you use many threads for concurrent indexing, {{IndexWriter}} 
> will accumulate many small segments during {{refresh}}, and this then 
> adds search-time cost as searching must visit all of these tiny segments.
> The merge policy would normally quickly coalesce these small segments if 
> given a little time ... so, could we somehow improve {{IndexWriter}}'s 
> refresh to optionally kick off the merge policy to merge segments below some 
> threshold before opening the near-real-time reader?  It'd be a bit tricky 
> because while we are waiting for merges, indexing may continue, and new 
> segments may be flushed, but those new segments shouldn't be included in the 
> point-in-time segments returned by refresh ...
> One could almost do this on top of Lucene today, with a custom merge policy, 
> and some hackity logic to have the merge policy target small segments just 
> written by refresh, but it's tricky to then open a near-real-time reader, 
> excluding newly flushed but including newly merged segments since the refresh 
> originally finished ...
> I'm not yet sure how best to solve this, so I wanted to open an issue for 
> discussion!
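
As a toy illustration of the "merge segments below some threshold" idea in the quoted description (this does not use the real MergePolicy API, and the actual work on this issue lives in the linked pull request), the selection step only needs to group freshly flushed tiny segments into candidate merges:

{code:java}
import java.util.ArrayList;
import java.util.List;

/** Toy sketch: group freshly flushed segments below a size threshold into candidate merges. */
class SmallSegmentGrouper {
  static List<List<Long>> groupSmallSegments(List<Long> segmentSizesBytes,
                                             long smallThresholdBytes,
                                             int maxSegmentsPerMerge) {
    List<List<Long>> merges = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    for (long size : segmentSizesBytes) {
      if (size >= smallThresholdBytes) {
        continue; // leave normal-sized segments to the regular merge policy
      }
      current.add(size);
      if (current.size() == maxSegmentsPerMerge) {
        merges.add(current);
        current = new ArrayList<>();
      }
    }
    if (current.size() > 1) {
      merges.add(current); // merging a single tiny segment buys nothing
    }
    return merges;
  }
}
{code}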



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] msokolov commented on a change in pull request #1155: LUCENE-8962: Add ability to selectively merge on commit

2020-01-28 Thread GitBox
msokolov commented on a change in pull request #1155: LUCENE-8962: Add ability 
to selectively merge on commit
URL: https://github.com/apache/lucene-solr/pull/1155#discussion_r371953236
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
 ##
 @@ -3223,15 +3259,44 @@ private long prepareCommitInternal() throws IOException {
       // sneak into the commit point:
       toCommit = segmentInfos.clone();
 
+      if (anyChanges) {
+        mergeAwaitLatchRef = new AtomicReference<>();
+        MergePolicy mergeOnCommitPolicy = waitForMergeOnCommitPolicy(config.getMergePolicy(), toCommit, mergeAwaitLatchRef);
+
+        // Find any merges that can execute on commit (per MergePolicy).
+        commitMerges = mergeOnCommitPolicy.findCommitMerges(segmentInfos, this);
+        if (commitMerges != null && commitMerges.merges.size() > 0) {
+          int mergeCount = 0;
+          for (MergePolicy.OneMerge oneMerge : commitMerges.merges) {
+            if (registerMerge(oneMerge)) {
+              mergeCount++;
+            } else {
+              throw new IllegalStateException("MergePolicy " + config.getMergePolicy().getClass() +
 
 Review comment:
   I see, thanks for explaining!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4702) Terms dictionary compression

2020-01-28 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025321#comment-17025321
 ] 

ASF subversion and git services commented on LUCENE-4702:
-

Commit 6eb8834a57fa176c6c2e995480b69ecea1b6bd07 in lucene-solr's branch 
refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6eb8834 ]

LUCENE-4702: Reduce terms dictionary compression overhead. (#1216)

Changes include:
 - Removed LZ4 compression of suffix lengths which didn't save much space
   anyway.
 - For stats, LZ4 was only really used for run-length compression of terms whose
   docFreq is 1. This has been replaced by explicit run-length compression.
 - Since we only use LZ4 for suffix bytes if the compression ratio is < 75%, we
   now only try LZ4 out if the average suffix length is greater than 6, in order
   to reduce index-time overhead.
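
A minimal sketch of the heuristic described in the last two bullets; the class, method names, and wiring below are illustrative rather than the actual BlockTreeTermsWriter code:

{code:java}
/** Sketch of the index-time decision: only attempt LZ4 when it can plausibly pay off. */
class SuffixCompressionHeuristic {

  // Skip LZ4 entirely when suffixes are short on average: the compression
  // overhead would likely outweigh any space savings.
  static boolean worthTryingLZ4(long totalSuffixBytes, int numTerms) {
    return numTerms > 0 && (double) totalSuffixBytes / numTerms > 6;
  }

  // Keep the LZ4 output only if it saved at least 25% (i.e. compression ratio < 75%).
  static boolean keepCompressed(int compressedLength, int originalLength) {
    return compressedLength < 0.75 * originalLength;
  }
}
{code}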

> Terms dictionary compression
> 
>
> Key: LUCENE-4702
> URL: https://issues.apache.org/jira/browse/LUCENE-4702
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Trivial
> Attachments: LUCENE-4702.patch, LUCENE-4702.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> I've done a quick test with the block tree terms dictionary by replacing a 
> call to IndexOutput.writeBytes to write suffix bytes with a call to 
> LZ4.compressHC to test the performance hit. Interestingly, search performance 
> was very good (see comparison table below) and the tim files were 14% smaller 
> (from 150432 bytes overall to 129516).
> {noformat}
> Task                 QPS baseline  StdDev  QPS compressed  StdDev               Pct diff
>            Fuzzy1        111.50   (2.0%)          78.78   (1.5%)  -29.4% ( -32% -  -26%)
>            Fuzzy2         36.99   (2.7%)          28.59   (1.5%)  -22.7% ( -26% -  -18%)
>           Respell        122.86   (2.1%)         103.89   (1.7%)  -15.4% ( -18% -  -11%)
>          Wildcard        100.58   (4.3%)          94.42   (3.2%)   -6.1% ( -13% -    1%)
>           Prefix3        124.90   (5.7%)         122.67   (4.7%)   -1.8% ( -11% -    9%)
>         OrHighLow        169.87   (6.8%)         167.77   (8.0%)   -1.2% ( -15% -   14%)
>           LowTerm       1949.85   (4.5%)        1929.02   (3.4%)   -1.1% (  -8% -    7%)
>        AndHighLow       2011.95   (3.5%)        1991.85   (3.3%)   -1.0% (  -7% -    5%)
>        OrHighHigh        155.63   (6.7%)         154.12   (7.9%)   -1.0% ( -14% -   14%)
>       AndHighHigh        341.82   (1.2%)         339.49   (1.7%)   -0.7% (  -3% -    2%)
>         OrHighMed        217.55   (6.3%)         216.16   (7.1%)   -0.6% ( -13% -   13%)
>            IntNRQ         53.10  (10.9%)          52.90   (8.6%)   -0.4% ( -17% -   21%)
>           MedTerm        998.11   (3.8%)         994.82   (5.6%)   -0.3% (  -9% -    9%)
>       MedSpanNear         60.50   (3.7%)          60.36   (4.8%)   -0.2% (  -8% -    8%)
>      HighSpanNear         19.74   (4.5%)          19.72   (5.1%)   -0.1% (  -9% -    9%)
>       LowSpanNear        101.93   (3.2%)         101.82   (4.4%)   -0.1% (  -7% -    7%)
>        AndHighMed        366.18   (1.7%)         366.93   (1.7%)    0.2% (  -3% -    3%)
>          PKLookup        237.28   (4.0%)         237.96   (4.2%)    0.3% (  -7% -    8%)
>         MedPhrase        173.17   (4.7%)         174.69   (4.7%)    0.9% (  -8% -   10%)
>   LowSloppyPhrase        180.91   (2.6%)         182.79   (2.7%)    1.0% (  -4% -    6%)
>         LowPhrase        374.64   (5.5%)         379.11   (5.8%)    1.2% (  -9% -   13%)
>          HighTerm        253.14   (7.9%)         256.97  (11.4%)    1.5% ( -16% -   22%)
>        HighPhrase         19.52  (10.6%)          19.83  (11.0%)    1.6% ( -18% -   25%)
>   MedSloppyPhrase        141.90   (2.6%)         144.11   (2.5%)    1.6% (  -3% -    6%)
>  HighSloppyPhrase         25.26   (4.8%)          25.97   (5.0%)    2.8% (  -6% -   13%)
> {noformat}
> Only queries which are very terms-dictionary-intensive got a performance hit 
> (Fuzzy, Fuzzy2, Respell, Wildcard), other queries including Prefix3 behaved 
> (surprisingly) well.
> Do you think of it as something worth exploring?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz merged pull request #1216: LUCENE-4702: Reduce terms dictionary compression overhead.

2020-01-28 Thread GitBox
jpountz merged pull request #1216: LUCENE-4702: Reduce terms dictionary 
compression overhead.
URL: https://github.com/apache/lucene-solr/pull/1216
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz merged pull request #1197: LUCENE-9161: DirectMonotonicWriter checks for overflows.

2020-01-28 Thread GitBox
jpountz merged pull request #1197: LUCENE-9161: DirectMonotonicWriter checks 
for overflows.
URL: https://github.com/apache/lucene-solr/pull/1197
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9161) DirectMonotonicWriter should check for overflows

2020-01-28 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025344#comment-17025344
 ] 

ASF subversion and git services commented on LUCENE-9161:
-

Commit 92b684c647876c886ba71dab51edf6f1f3c59d82 in lucene-solr's branch 
refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=92b684c ]

LUCENE-9161: DirectMonotonicWriter checks for overflows. (#1197)



> DirectMonotonicWriter should check for overflows
> 
>
> Key: LUCENE-9161
> URL: https://issues.apache.org/jira/browse/LUCENE-9161
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> DirectMonotonicWriter doesn't verify that the provided blockShift is 
> compatible with the number of written values.
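
For illustration only, the missing validation is roughly of this shape; the helper below is hypothetical, and the exact bound used by the real check added to DirectMonotonicWriter may differ:

{code:java}
/** Illustrative only: fail fast when blockShift cannot sensibly address the requested number of values. */
final class BlockShiftCheck {
  static void check(long numValues, int blockShift) {
    if (numValues < 0) {
      throw new IllegalArgumentException("numValues must be >= 0, got " + numValues);
    }
    // Number of blocks needed to hold numValues values at 2^blockShift values per block.
    long numBlocks = numValues == 0 ? 0 : ((numValues - 1) >>> blockShift) + 1;
    if (numBlocks > Integer.MAX_VALUE) {
      throw new IllegalArgumentException(
          "blockShift=" + blockShift + " is too small for numValues=" + numValues);
    }
  }
}
{code}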



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14226) SolrStream reports AuthN/AuthZ failures (401|403) as IOException w/o details

2020-01-28 Thread Chris M. Hostetter (Jira)
Chris M. Hostetter created SOLR-14226:
-

 Summary: SolrStream reports AuthN/AuthZ failures (401|403) as 
IOException w/o details
 Key: SOLR-14226
 URL: https://issues.apache.org/jira/browse/SOLR-14226
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrJ, streaming expressions
Reporter: Chris M. Hostetter


If you try to use the SolrJ {{SolrStream}} class to make a streaming 
expression request to a Solr node, any authentication or authorization failures 
will be swallowed and a generic "IOException" will be thrown.

(evidently due to a parse error trying to read the body of the response w/o 
consulting the HTTP status?)
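
A sketch of the kind of fail-fast status check that would avoid this; illustrative only, not SolrStream's actual code path, and the helper name is made up:

{code:java}
import java.io.IOException;

/** Illustrative only: surface auth failures before handing the response body to the JSON parser. */
final class AuthStatusCheck {
  static void failFastOnAuthErrors(int httpStatus, String url) throws IOException {
    if (httpStatus == 401 || httpStatus == 403) {
      // Report the real cause instead of letting the HTML error page hit the tuple parser.
      throw new IOException("Request to " + url + " failed with HTTP " + httpStatus
          + (httpStatus == 401 ? " (authentication required)" : " (not authorized)"));
    }
  }
}
{code}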



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14226) SolrStream reports AuthN/AuthZ failures (401|403) as IOException w/o details

2020-01-28 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025365#comment-17025365
 ] 

Chris M. Hostetter commented on SOLR-14226:
---

From a test I'm in the process of trying to write...

{code:java}
  public void testEchoStreamFail() throws Exception {
    final SolrStream solrStream = new SolrStream(solrUrl,
                                                 params("qt", "/stream",
                                                        "expr", "echo(hello world)"));
    solrStream.setCredentials("bogus_user", "bogus_pass");
    SolrException e = expectThrows(SolrException.class, () -> {
        final List<Tuple> ignored = getTuples(solrStream);
      });
    assertEquals(401, e.code());
  }
{code}

{noformat}
   [junit4]   > Throwable #1: junit.framework.AssertionFailedError: Unexpected exception type, expected SolrException but got java.io.IOException: --> http://127.0.0.1:35337/solr/collection_x: An exception has occurred on the server, refer to server log for details.
   [junit4]   >         at __randomizedtesting.SeedInfo.seed([F7287DED4A9F66CA:B866576F16986894]:0)
   [junit4]   >         at org.apache.lucene.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2752)
   [junit4]   >         at org.apache.lucene.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2740)
   [junit4]   >         at org.apache.solr.client.solrj.io.stream.CloudAuthStreamTest.testEchoStreamFail(CloudAuthStreamTest.java:208)
   [junit4]   >         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [junit4]   >         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]   >         at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]   >         at java.base/java.lang.reflect.Method.invoke(Method.java:566)
   [junit4]   >         at java.base/java.lang.Thread.run(Thread.java:834)
   [junit4]   > Caused by: java.io.IOException: --> http://127.0.0.1:35337/solr/collection_x: An exception has occurred on the server, refer to server log for details.
   [junit4]   >         at org.apache.solr.client.solrj.io.stream.SolrStream.read(SolrStream.java:232)
   [junit4]   >         at org.apache.solr.client.solrj.io.stream.CloudAuthStreamTest.getTuples(CloudAuthStreamTest.java:221)
   [junit4]   >         at org.apache.solr.client.solrj.io.stream.CloudAuthStreamTest.lambda$testEchoStreamFail$3(CloudAuthStreamTest.java:209)
   [junit4]   >         at org.apache.lucene.util.LuceneTestCase._expectThrows(LuceneTestCase.java:2870)
   [junit4]   >         at org.apache.lucene.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2745)
   [junit4]   >         ... 41 more
   [junit4]   > Caused by: org.noggit.JSONParser$ParseException: JSON Parse Error: char=<,position=0 AFTER='<' BEFORE='html> ...'
{noformat}

> SolrStream reports AuthN/AuthZ failures (401|403) as IOException w/o details
> -
>
> Key: SOLR-14226
> URL: https://issues.apache.org/jira/browse/SOLR-14226
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public (Default Security Level. Issues are Public) 
>  Components: SolrJ, streaming expressions
>Reporter: Chris M. Hostetter
>Priority: Major
>
> If you try to use the SolrJ {{SolrStream}} class to make a streaming 
> expression request to a Solr node, any authentication or authorization 
> failures will be swallowed and a generic "IOException" will be thrown.
> (evidently due to a parse error trying to read the body of the response w/o 
> consulting the HTTP status?)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?

2020-01-28 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025367#comment-17025367
 ] 

David Smiley commented on LUCENE-8962:
--

[~msfroh] as you can see above, I already accomplished the effect here in a 
different way, without modifying Lucene.  That's not to say we shouldn't modify 
Lucene at all, but I think the changes can be limited to _implementations 
of_ MergePolicy & MergeScheduler, without needing to modify the abstractions 
themselves or core Lucene, which are already sufficient.  See LUCENE-8331 for a 
benchmark utility.  I should resume this work.

> Can we merge small segments during refresh, for faster searching?
> -
>
> Key: LUCENE-8962
> URL: https://issues.apache.org/jira/browse/LUCENE-8962
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Michael McCandless
>Priority: Major
> Attachments: LUCENE-8962_demo.png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> With near-real-time search we ask {{IndexWriter}} to write all in-memory 
> segments to disk and open an {{IndexReader}} to search them, and this is 
> typically a quick operation.
> However, when you use many threads for concurrent indexing, {{IndexWriter}} 
> will accumulate many small segments during {{refresh}}, and this then 
> adds search-time cost as searching must visit all of these tiny segments.
> The merge policy would normally quickly coalesce these small segments if 
> given a little time ... so, could we somehow improve {{IndexWriter}}'s 
> refresh to optionally kick off the merge policy to merge segments below some 
> threshold before opening the near-real-time reader?  It'd be a bit tricky 
> because while we are waiting for merges, indexing may continue, and new 
> segments may be flushed, but those new segments shouldn't be included in the 
> point-in-time segments returned by refresh ...
> One could almost do this on top of Lucene today, with a custom merge policy, 
> and some hackity logic to have the merge policy target small segments just 
> written by refresh, but it's tricky to then open a near-real-time reader, 
> excluding newly flushed but including newly merged segments since the refresh 
> originally finished ...
> I'm not yet sure how best to solve this, so I wanted to open an issue for 
> discussion!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4702) Terms dictionary compression

2020-01-28 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025370#comment-17025370
 ] 

ASF subversion and git services commented on LUCENE-4702:
-

Commit 033220e2ab31494054b26c236be4b43b777aea02 in lucene-solr's branch 
refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=033220e ]

LUCENE-4702: Reduce terms dictionary compression overhead. (#1216)

Changes include:
 - Removed LZ4 compression of suffix lengths which didn't save much space
   anyway.
 - For stats, LZ4 was only really used for run-length compression of terms whose
   docFreq is 1. This has been replaced by explicit run-length compression.
 - Since we only use LZ4 for suffix bytes if the compression ratio is < 75%, we
   now only try LZ4 out if the average suffix length is greater than 6, in order
   to reduce index-time overhead.


> Terms dictionary compression
> 
>
> Key: LUCENE-4702
> URL: https://issues.apache.org/jira/browse/LUCENE-4702
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Trivial
> Attachments: LUCENE-4702.patch, LUCENE-4702.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> I've done a quick test with the block tree terms dictionary by replacing a 
> call to IndexOutput.writeBytes to write suffix bytes with a call to 
> LZ4.compressHC to test the performance hit. Interestingly, search performance 
> was very good (see comparison table below) and the tim files were 14% smaller 
> (from 150432 bytes overall to 129516).
> {noformat}
> Task                 QPS baseline  StdDev  QPS compressed  StdDev               Pct diff
>            Fuzzy1        111.50   (2.0%)          78.78   (1.5%)  -29.4% ( -32% -  -26%)
>            Fuzzy2         36.99   (2.7%)          28.59   (1.5%)  -22.7% ( -26% -  -18%)
>           Respell        122.86   (2.1%)         103.89   (1.7%)  -15.4% ( -18% -  -11%)
>          Wildcard        100.58   (4.3%)          94.42   (3.2%)   -6.1% ( -13% -    1%)
>           Prefix3        124.90   (5.7%)         122.67   (4.7%)   -1.8% ( -11% -    9%)
>         OrHighLow        169.87   (6.8%)         167.77   (8.0%)   -1.2% ( -15% -   14%)
>           LowTerm       1949.85   (4.5%)        1929.02   (3.4%)   -1.1% (  -8% -    7%)
>        AndHighLow       2011.95   (3.5%)        1991.85   (3.3%)   -1.0% (  -7% -    5%)
>        OrHighHigh        155.63   (6.7%)         154.12   (7.9%)   -1.0% ( -14% -   14%)
>       AndHighHigh        341.82   (1.2%)         339.49   (1.7%)   -0.7% (  -3% -    2%)
>         OrHighMed        217.55   (6.3%)         216.16   (7.1%)   -0.6% ( -13% -   13%)
>            IntNRQ         53.10  (10.9%)          52.90   (8.6%)   -0.4% ( -17% -   21%)
>           MedTerm        998.11   (3.8%)         994.82   (5.6%)   -0.3% (  -9% -    9%)
>       MedSpanNear         60.50   (3.7%)          60.36   (4.8%)   -0.2% (  -8% -    8%)
>      HighSpanNear         19.74   (4.5%)          19.72   (5.1%)   -0.1% (  -9% -    9%)
>       LowSpanNear        101.93   (3.2%)         101.82   (4.4%)   -0.1% (  -7% -    7%)
>        AndHighMed        366.18   (1.7%)         366.93   (1.7%)    0.2% (  -3% -    3%)
>          PKLookup        237.28   (4.0%)         237.96   (4.2%)    0.3% (  -7% -    8%)
>         MedPhrase        173.17   (4.7%)         174.69   (4.7%)    0.9% (  -8% -   10%)
>   LowSloppyPhrase        180.91   (2.6%)         182.79   (2.7%)    1.0% (  -4% -    6%)
>         LowPhrase        374.64   (5.5%)         379.11   (5.8%)    1.2% (  -9% -   13%)
>          HighTerm        253.14   (7.9%)         256.97  (11.4%)    1.5% ( -16% -   22%)
>        HighPhrase         19.52  (10.6%)          19.83  (11.0%)    1.6% ( -18% -   25%)
>   MedSloppyPhrase        141.90   (2.6%)         144.11   (2.5%)    1.6% (  -3% -    6%)
>  HighSloppyPhrase         25.26   (4.8%)          25.97   (5.0%)    2.8% (  -6% -   13%)
> {noformat}
> Only queries which are very terms-dictionary-intensive got a performance hit 
> (Fuzzy, Fuzzy2, Respell, Wildcard), other queries including Prefix3 behaved 
> (surprisingly) well.
> Do you think of it as something worth exploring?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org


