vsop-479 commented on PR #11888:
URL: https://github.com/apache/lucene/pull/11888#issuecomment-2033469573
Glad to know that. Thanks @mikemccand .
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to t
vigyasharma commented on issue #13226:
URL: https://github.com/apache/lucene/issues/13226#issuecomment-2033393778
For the segment `_1pa38`, can you also share its details from before
setting `max_merged_segment` to 3gb, i.e. when it was not getting picked up for
merge? The difference can h
vigyasharma commented on issue #13226:
URL: https://github.com/apache/lucene/issues/13226#issuecomment-2033392249
`TieredMergePolicy` prefers merges that have less skew across segment sizes,
a smaller total size, and a higher number of expunged deletes. Each merge here is a set of
segments that will be
benwtrent commented on PR #13258:
URL: https://github.com/apache/lucene/pull/13258#issuecomment-2033193932
@mikemccand ^ This should fix that build failure. I am guessing for every
line in `version.txt` if the appropriate BWC index isn't found, it's assumed to
be unsupported, and so it looks
benwtrent opened a new pull request, #13258:
URL: https://github.com/apache/lucene/pull/13258
Adds bwc indices for 9.10.1 for the 9x branch.
All I did was run:
```
./gradlew :lucene:backward-codecs:test -Ptests.useSecurityManager=false --tests TestGenerateBwcIndices
```
benwtrent closed pull request #13257: Add bwc indices
URL: https://github.com/apache/lucene/pull/13257
benwtrent opened a new pull request, #13257:
URL: https://github.com/apache/lucene/pull/13257
Adds bwc indices for 9.10.1 for the 9x branch.
All I did was run:
```
./gradlew :lucene:backward-codecs:test -Ptests.useSecurityManager=false --tests TestGenerateBwcIndices
```
msfroh commented on issue #13188:
URL: https://github.com/apache/lucene/issues/13188#issuecomment-2033117183
I wonder if we could think of this more broadly as a caching problem.
Basically, you could evaluate some "question" (aggregations, statistics,
etc.) for all segments and save t
benwtrent commented on issue #12627:
URL: https://github.com/apache/lucene/issues/12627#issuecomment-2033080335
@msokolov in the HNSW codec, we do something like this already when
gathering the underlying graph.
I would do something like:
```
if (FilterLeafReader.unwrap(ct
msokolov commented on issue #12627:
URL: https://github.com/apache/lucene/issues/12627#issuecomment-2033044934
I struggle to make this work though since the changes to make everything
more typesafe have also made the interesting bits inaccessible. EG I thought of
adding something like this:
benwtrent commented on PR #13200:
URL: https://github.com/apache/lucene/pull/13200#issuecomment-2032971207
@uschindler
> We can remove all old classes then and just adopt reader code for the old
codec to make it able to read the old byte values as identifier for
similarities and jus
vigyasharma commented on PR #13143:
URL: https://github.com/apache/lucene/pull/13143#issuecomment-2032786496
I was looking at this PR since it's marked as stale by our bots. I see that
we've already added the [fallback to exact
search](https://github.com/apache/lucene/blob/main/lucene/core/
vigyasharma commented on PR #13202:
URL: https://github.com/apache/lucene/pull/13202#issuecomment-2032687387
> > `TimeLimitingBulkScorer` already optimizes for timeout check frequency
outside of `QueryTimeout` impl
>
> Ahh nice catch! You mean something like:
>
> ```java
> /
benwtrent merged PR #13197:
URL: https://github.com/apache/lucene/pull/13197
benwtrent commented on code in PR #13200:
URL: https://github.com/apache/lucene/pull/13200#discussion_r1548282620
##
lucene/core/src/java/org/apache/lucene/codecs/ByteVectorProvider.java:
##
@@ -33,6 +34,29 @@ public interface ByteVectorProvider {
*/
int dimension();
+
uschindler commented on PR #13200:
URL: https://github.com/apache/lucene/pull/13200#issuecomment-2032616166
I haven't checked the old enum, do we really need all the backwards cruft,
if we make the SPI a new feature for Lucene 10? We can remove all old classes
then and just adopt reader cod
uschindler commented on PR #13200:
URL: https://github.com/apache/lucene/pull/13200#issuecomment-2032608314
The SPI interface and naming of vector similarity looks fine from the
FieldInfos and their encoding on field metadata. The code looks copypasted
(including the Holder class) from docv
Pulkitg64 commented on issue #13241:
URL: https://github.com/apache/lucene/issues/13241#issuecomment-2032572309
That's a good callout, @benwtrent. But I was wondering: how large does a
structure have to be to count as big for tracking purposes?
For example, let's say there are 1 million nodes in upper l
ChrisHegarty commented on code in PR #13200:
URL: https://github.com/apache/lucene/pull/13200#discussion_r1548185076
##
lucene/core/src/java/org/apache/lucene/codecs/ByteVectorProvider.java:
##
@@ -33,6 +34,29 @@ public interface ByteVectorProvider {
*/
int dimension();
benwtrent commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2032358113
@mikemccand oh dang, I haven't been doing that. Thanks for picking up my
slack!
cinsttool commented on PR #13254:
URL: https://github.com/apache/lucene/pull/13254#issuecomment-2032337728
> > We discovered the above container inefficiencies with our tool cinst.
>
> Could you share a pointer to this tool? I'm curious how it works... thanks.
Thank you for your
mikemccand commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2032311146
It looks like this awesome change was backported for 9.11.0? I'll add the
milestone. So hard to remember to set the milestones on our issues/PRs...
mikemccand commented on PR #11888:
URL: https://github.com/apache/lucene/pull/11888#issuecomment-2032280982
Oooh this change gave a nice pop (~5.4%, ~915 -> 964 K lookups/sec) to the
primary key lookup nightly benchy:
https://home.apache.org/~mikemccand/lucenebench/PKLookup.html
I'll
mikemccand commented on PR #13254:
URL: https://github.com/apache/lucene/pull/13254#issuecomment-2032261293
> We discovered the above container inefficiencies with our tool cinst.
Could you share a pointer to this tool? I'm curious how it works... thanks.
msokolov commented on issue #12627:
URL: https://github.com/apache/lucene/issues/12627#issuecomment-2032257493
I was thinking of a baby step: count the number of nodes that are reachable
and then use that in assertions like
https://github.com/apache/lucene/blob/bf193a712535e416edbc854fb10e7
benwtrent commented on issue #12627:
URL: https://github.com/apache/lucene/issues/12627#issuecomment-2032229702
@msokolov I think adding a "reachable" test to Lucene would be nice. The
main goal of such a test would be ensuring that every node is eventually
reachable on every layer. The tri
msokolov commented on issue #12627:
URL: https://github.com/apache/lucene/issues/12627#issuecomment-2032204931
Oh! I see we added some tooling in
https://github.com/mikemccand/luceneutil/pull/253 as part of KnnGraphTester.
Maybe we can migrate some of this to lucene's test-framework
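
The "count the reachable nodes" idea raised in this thread can be illustrated with a plain breadth-first search. This is only a minimal sketch under stated assumptions: the graph is a hypothetical `int[][]` adjacency list for a single layer, and `Reachability` and `countReachable` are illustrative names, not Lucene or luceneutil APIs.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of Lucene: counts nodes reachable from an entry point.
class Reachability {

  // neighbors[i] holds the nodes that node i links to (assumed single-layer graph).
  static int countReachable(int[][] neighbors, int entry) {
    boolean[] visited = new boolean[neighbors.length];
    Deque<Integer> queue = new ArrayDeque<>();
    visited[entry] = true;
    queue.add(entry);
    int count = 0;
    while (!queue.isEmpty()) {
      int node = queue.poll();
      count++;
      for (int nbr : neighbors[node]) {
        if (!visited[nbr]) {
          visited[nbr] = true; // mark on enqueue so each node is counted once
          queue.add(nbr);
        }
      }
    }
    return count;
  }

  public static void main(String[] args) {
    // Node 3 has no edges in or out, so it is disconnected from entry node 0.
    int[][] graph = { {1}, {0, 2}, {1}, {} };
    System.out.println(countReachable(graph, 0) + " of " + graph.length + " nodes reachable");
  }
}
```

A test could then assert that the count equals the number of nodes on the layer; any shortfall indicates disconnected nodes.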
msokolov commented on issue #12627:
URL: https://github.com/apache/lucene/issues/12627#issuecomment-2032182137
I want to revive this discussion about disconnectedness. I think the
two-pass idea is where we would have to go in order to ensure a connected
graph, and in order to implement that
mikemccand commented on PR #13149:
URL: https://github.com/apache/lucene/pull/13149#issuecomment-2032127501
OK I merged a backport to 9.11.0 -- I think that's safe: we added a new
default method to `IntersectVisitor`.
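
As a rough illustration of why adding a default method to an interface is usually a safe change to backport, here is a minimal sketch. `DocVisitor` and `CollectingVisitor` are hypothetical stand-ins invented for this example; they do not reproduce the actual `IntersectVisitor` signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a visitor interface; not Lucene's IntersectVisitor.
interface DocVisitor {
  void visit(int docID);

  // Added later as a default method: existing implementations keep compiling
  // and inherit this one-at-a-time fallback unless they override it.
  default void visit(int[] docIDs, int count) {
    for (int i = 0; i < count; i++) {
      visit(docIDs[i]);
    }
  }
}

// Written before the bulk method existed; it needs no changes.
class CollectingVisitor implements DocVisitor {
  final List<Integer> seen = new ArrayList<>();

  @Override
  public void visit(int docID) {
    seen.add(docID);
  }
}

class DefaultMethodDemo {
  public static void main(String[] args) {
    CollectingVisitor v = new CollectingVisitor();
    v.visit(new int[] {3, 5, 8, 0}, 3); // only the first 3 entries are visited
    System.out.println(v.seen); // [3, 5, 8]
  }
}
```

Implementations that can handle a batch more efficiently may override the bulk method, while older ones keep the inherited behavior.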
mikemccand merged PR #13149:
URL: https://github.com/apache/lucene/pull/13149
benwtrent commented on issue #13241:
URL: https://github.com/apache/lucene/issues/13241#issuecomment-2031971023
I am not sure about this. We eagerly load the `nodesByLevel` on heap for
every field. This means we effectively load the graph on heap. I don't think we
want to remove this.
easyice opened a new pull request, #13256:
URL: https://github.com/apache/lucene/pull/13256
### Description
Pulkitg64 opened a new pull request, #13255:
URL: https://github.com/apache/lucene/pull/13255
### Description
Closes #13241
Remove the `Accountable` interface from `KnnVectorsReader` and remove the
`ramBytesUsed` function wherever the `KnnVectorsReader` class is used or extended.
jpountz merged PR #13254:
URL: https://github.com/apache/lucene/pull/13254
kaivalnp commented on code in PR #13202:
URL: https://github.com/apache/lucene/pull/13202#discussion_r1547738734
##
lucene/join/src/java/org/apache/lucene/search/join/DiversifyingChildrenFloatKnnVectorQuery.java:
##
@@ -100,8 +102,15 @@ protected TopDocs exactSearch(LeafReaderCo
kaivalnp commented on PR #13202:
URL: https://github.com/apache/lucene/pull/13202#issuecomment-2031861001
> `TimeLimitingBulkScorer` already optimizes for timeout check frequency
outside of `QueryTimeout` impl
Ahh nice catch! You mean something like:
```java
// counter is an
benwtrent commented on PR #13187:
URL: https://github.com/apache/lucene/pull/13187#issuecomment-2031759399
I am sorry this work has stalled. But I have been iterating on
https://github.com/apache/lucene/pull/13200 for a week now. It's getting to a
palatable place.
cinsttool commented on code in PR #13254:
URL: https://github.com/apache/lucene/pull/13254#discussion_r1547605381
##
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##
@@ -165,8 +164,7 @@ public FieldInfos(FieldInfo[] infos) {
valuesTemp.add(byNumberTemp[i
jpountz commented on code in PR #13254:
URL: https://github.com/apache/lucene/pull/13254#discussion_r1547438514
##
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##
@@ -165,8 +164,7 @@ public FieldInfos(FieldInfo[] infos) {
valuesTemp.add(byNumberTemp[i])
cinsttool commented on code in PR #13254:
URL: https://github.com/apache/lucene/pull/13254#discussion_r1547361175
##
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##
@@ -165,8 +164,7 @@ public FieldInfos(FieldInfo[] infos) {
valuesTemp.add(byNumberTemp[i
cinsttool commented on code in PR #13254:
URL: https://github.com/apache/lucene/pull/13254#discussion_r1547314639
##
lucene/core/src/java/org/apache/lucene/codecs/CompetitiveImpactAccumulator.java:
##
@@ -107,26 +107,30 @@ public void copy(CompetitiveImpactAccumulator acc) {
cinsttool commented on code in PR #13254:
URL: https://github.com/apache/lucene/pull/13254#discussion_r1547314335
##
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##
@@ -165,8 +164,7 @@ public FieldInfos(FieldInfo[] infos) {
valuesTemp.add(byNumberTemp[i
jpountz commented on code in PR #13254:
URL: https://github.com/apache/lucene/pull/13254#discussion_r1547286138
##
lucene/core/src/java/org/apache/lucene/codecs/CompetitiveImpactAccumulator.java:
##
@@ -107,26 +107,30 @@ public void copy(CompetitiveImpactAccumulator acc) {