vsop-479 commented on PR #11888:
URL: https://github.com/apache/lucene/pull/11888#issuecomment-2021791458
> Was this on wikimediumall?
No, this was on `wikimedium10k`.
I will measure the performance again on `wikimediumall`.
--
This is an automated message from the Apache Git
romseygeek commented on PR #13165:
URL: https://github.com/apache/lucene/pull/13165#issuecomment-2021674549
Thanks @vletard! This looks great, I'd just like to add one more test to
ensure that inheritance works in the way we expect.
romseygeek commented on code in PR #13165:
URL: https://github.com/apache/lucene/pull/13165#discussion_r1540259892
##
lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java:
##
@@ -1130,7 +1137,7 @@ public boolean acceptField(String field) {
antonha commented on code in PR #13149:
URL: https://github.com/apache/lucene/pull/13149#discussion_r1540177437
##
lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java:
##
@@ -222,6 +230,14 @@ public void visit(DocIdSetIterator iterator) throws
IOException {
antonha commented on code in PR #13149:
URL: https://github.com/apache/lucene/pull/13149#discussion_r1540175478
##
lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java:
##
@@ -185,6 +186,13 @@ public void visit(DocIdSetIterator iterator) throws
IOException {
antonha commented on PR #13149:
URL: https://github.com/apache/lucene/pull/13149#issuecomment-2021556140
> Ideally we'd add a query that adds another implementation of an
`IntersectVisitor` to the nightly benchmarks before merging that PR so that we
can see the performance bump?
Yes
kaivalnp commented on code in PR #13202:
URL: https://github.com/apache/lucene/pull/13202#discussion_r1540166440
##
lucene/join/src/java/org/apache/lucene/search/join/DiversifyingChildrenFloatKnnVectorQuery.java:
##
@@ -100,8 +102,15 @@ protected TopDocs
kaivalnp commented on code in PR #13202:
URL: https://github.com/apache/lucene/pull/13202#discussion_r1540162745
##
lucene/core/src/test/org/apache/lucene/search/TestKnnByteVectorQuery.java:
##
@@ -102,14 +103,34 @@ public void testVectorEncodingMismatch() throws
IOException {
jpountz commented on PR #13220:
URL: https://github.com/apache/lucene/pull/13220#issuecomment-2021484078
It makes sense to me to deprecate it, `IndexSearcher#setTimeout` should be
used instead.
msfroh commented on code in PR #13201:
URL: https://github.com/apache/lucene/pull/13201#discussion_r1540109959
##
lucene/core/src/java/org/apache/lucene/search/AbstractMultiTermQueryConstantScoreWrapper.java:
##
@@ -292,7 +292,21 @@ public long cost() {
};
}
-
epotyom commented on code in PR #12862:
URL: https://github.com/apache/lucene/pull/12862#discussion_r1540090770
##
lucene/facet/src/java/org/apache/lucene/facet/Facets.java:
##
@@ -58,6 +58,9 @@ public abstract FacetResult getTopChildren(int topN, String
dim, String... path)
epotyom commented on code in PR #12862:
URL: https://github.com/apache/lucene/pull/12862#discussion_r1540089552
##
lucene/CHANGES.txt:
##
@@ -87,6 +87,10 @@ API Changes
* GITHUB#13146, GITHUB#13148: Remove ByteBufferIndexInput and only use
MemorySegment APIs
for
msfroh commented on code in PR #13201:
URL: https://github.com/apache/lucene/pull/13201#discussion_r1540080444
##
lucene/core/src/java/org/apache/lucene/search/AbstractMultiTermQueryConstantScoreWrapper.java:
##
@@ -292,7 +292,21 @@ public long cost() {
};
}
-
kaivalnp commented on code in PR #13202:
URL: https://github.com/apache/lucene/pull/13202#discussion_r1540033526
##
lucene/core/src/java/org/apache/lucene/search/TimeLimitingKnnCollectorManager.java:
##
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
vigyasharma commented on code in PR #13202:
URL: https://github.com/apache/lucene/pull/13202#discussion_r1539995871
##
lucene/core/src/test/org/apache/lucene/search/TestKnnByteVectorQuery.java:
##
@@ -102,14 +103,34 @@ public void testVectorEncodingMismatch() throws
vigyasharma commented on PR #13220:
URL: https://github.com/apache/lucene/pull/13220#issuecomment-2021304317
As an alternative to deprecating this entirely, we could also change it to
start using `QueryTimeout`, instead of its custom time limiting counters. I'd
like to get more opinions from
benwtrent commented on PR #13197:
URL: https://github.com/apache/lucene/pull/13197#issuecomment-2021292348
I did a bunch of local benchmarking on this. I am adding a parameter to
allow optional compression as the numbers without compressing are compelling
enough on ARM to justify it IMO.
jpountz commented on issue #13218:
URL: https://github.com/apache/lucene/issues/13218#issuecomment-2020945911
FWIW I plan on treating all issues that have their milestone set to 10.0 as
needing discussion if we want to exclude them from 10.0.
jpountz commented on PR #13219:
URL: https://github.com/apache/lucene/pull/13219#issuecomment-2020943095
> P.S.: Are we using RANDOM at the moment?
FYI I tried to start switching some files to it at #13222 and discussed some
limitations.
jpountz commented on PR #13219:
URL: https://github.com/apache/lucene/pull/13219#issuecomment-2020941392
> The question that I have about this: How to handle merging then?
This is a big question to me too. With reader pooling, if you open a reader
and then it gets included in a
jpountz opened a new pull request, #13222:
URL: https://github.com/apache/lucene/pull/13222
This switches the following files to `IOContext.RANDOM`:
- Stored fields data file.
- Term vectors data file.
- HNSW graph.
- Temporary file storing vectors at merge time that we use
uschindler commented on PR #13219:
URL: https://github.com/apache/lucene/pull/13219#issuecomment-2020886166
> > P.S.: Are we using RANDOM at the moment?
>
> Not yet, we'd need to start using it where it makes sense like we do for
(PRE)LOAD.
>
> > I also found
gf2121 commented on code in PR #13221:
URL: https://github.com/apache/lucene/pull/13221#discussion_r1539558105
##
lucene/core/src/java/org/apache/lucene/index/PointValues.java:
##
@@ -383,25 +383,18 @@ public final long estimatePointCount(IntersectVisitor
visitor) {
}
gf2121 opened a new pull request, #13221:
URL: https://github.com/apache/lucene/pull/13221
This PR proposes a new way to do numeric dynamic pruning with the following
changes:
* Instead of complex sampling and estimating point count to judge whether to
build the competitive iterator,
jpountz commented on PR #13219:
URL: https://github.com/apache/lucene/pull/13219#issuecomment-2020818027
> P.S.: Are we using RANDOM at the moment?
Not yet, we'd need to start using it where it makes sense like we do for
(PRE)LOAD.
> I also found
gf2121 closed pull request #13217: New structure for numeric dynamic pruning
URL: https://github.com/apache/lucene/pull/13217
uschindler commented on PR #13216:
URL: https://github.com/apache/lucene/pull/13216#issuecomment-2020751491
> > we now use ChecksumIndexInput as the file is fully readonce without
seeking. Because we removed the IOContext from directory's open method (as it
is hardcoded to readOnce), we do
rquesada-tibco commented on code in PR #13201:
URL: https://github.com/apache/lucene/pull/13201#discussion_r1539468242
##
lucene/core/src/java/org/apache/lucene/search/AbstractMultiTermQueryConstantScoreWrapper.java:
##
@@ -292,7 +292,21 @@ public long cost() {
};
}
uschindler commented on code in PR #13219:
URL: https://github.com/apache/lucene/pull/13219#discussion_r1539457487
##
lucene/core/src/java/org/apache/lucene/store/IOContext.java:
##
@@ -54,58 +42,50 @@ public enum Context {
DEFAULT
};
- public static final IOContext
uschindler commented on code in PR #13219:
URL: https://github.com/apache/lucene/pull/13219#discussion_r1539448998
##
lucene/core/src/java21/org/apache/lucene/store/PosixNativeAccess.java:
##
@@ -135,12 +135,12 @@ public void madvise(MemorySegment segment, IOContext
context)
uschindler commented on PR #13219:
URL: https://github.com/apache/lucene/pull/13219#issuecomment-2020687878
P.S.: Are we using RANDOM at the moment?
I also found https://github.com/elastic/elasticsearch/issues/27748; this
person suggests passing RANDOM for everything.
jpountz commented on code in PR #13219:
URL: https://github.com/apache/lucene/pull/13219#discussion_r1539431077
##
lucene/core/src/java/org/apache/lucene/store/IOContext.java:
##
@@ -54,58 +43,74 @@ public enum Context {
DEFAULT
};
- public static final IOContext
uschindler commented on code in PR #13219:
URL: https://github.com/apache/lucene/pull/13219#discussion_r1539407232
##
lucene/core/src/java/org/apache/lucene/store/IOContext.java:
##
@@ -54,58 +43,74 @@ public enum Context {
DEFAULT
};
- public static final IOContext
easyice merged PR #13203:
URL: https://github.com/apache/lucene/pull/13203
uschindler commented on code in PR #13219:
URL: https://github.com/apache/lucene/pull/13219#discussion_r1539364622
##
lucene/core/src/java21/org/apache/lucene/store/PosixNativeAccess.java:
##
@@ -137,17 +136,11 @@ public void madvise(MemorySegment segment, IOContext
context)
rquesada-tibco commented on code in PR #13201:
URL: https://github.com/apache/lucene/pull/13201#discussion_r1539388204
##
lucene/core/src/java/org/apache/lucene/search/AbstractMultiTermQueryConstantScoreWrapper.java:
##
@@ -292,7 +292,21 @@ public long cost() {
};
}
easyice closed pull request #13171: Reduce some unnecessary ArrayUtil#grow calls
URL: https://github.com/apache/lucene/pull/13171
gsmiller commented on code in PR #12862:
URL: https://github.com/apache/lucene/pull/12862#discussion_r1539340494
##
lucene/facet/src/java/org/apache/lucene/facet/Facets.java:
##
@@ -58,6 +58,9 @@ public abstract FacetResult getTopChildren(int topN, String
dim, String... path)
jpountz commented on code in PR #13219:
URL: https://github.com/apache/lucene/pull/13219#discussion_r1539310893
##
lucene/codecs/src/java/org/apache/lucene/codecs/blockterms/VariableGapTermsIndexReader.java:
##
@@ -53,7 +54,7 @@ public
jfboeuf commented on code in PR #13206:
URL: https://github.com/apache/lucene/pull/13206#discussion_r1539336230
##
lucene/core/src/java/org/apache/lucene/store/NRTCachingDirectory.java:
##
@@ -73,6 +73,7 @@ public class NRTCachingDirectory extends FilterDirectory
implements
jfboeuf commented on code in PR #13206:
URL: https://github.com/apache/lucene/pull/13206#discussion_r1539337432
##
lucene/core/src/java/org/apache/lucene/store/NRTCachingDirectory.java:
##
@@ -118,7 +119,9 @@ public synchronized void deleteFile(String name) throws
IOException
gsmiller commented on code in PR #12862:
URL: https://github.com/apache/lucene/pull/12862#discussion_r1539336428
##
lucene/facet/src/java/org/apache/lucene/facet/LongValueFacetCounts.java:
##
@@ -568,6 +568,12 @@ public Number getSpecificValue(String dim, String... path)
{
kaivalnp commented on PR #13202:
URL: https://github.com/apache/lucene/pull/13202#issuecomment-2020534763
> Separately, should we deprecate `TimeLimitingCollector` ? It doesn't use
`QueryTimeout` and I don't think we're using it anywhere.
Created #13220 to discuss this
kaivalnp opened a new pull request, #13220:
URL: https://github.com/apache/lucene/pull/13220
### Description
Follow-up to
https://github.com/apache/lucene/pull/13202#issuecomment-2016947960
iamsanjay commented on PR #13198:
URL: https://github.com/apache/lucene/pull/13198#issuecomment-2020515624
@dweiss Thanks for the clarification. It does change the seed, and hence I was
not able to reproduce the failure case. To increase the likelihood I switched to
choosing only from two
jpountz opened a new pull request, #13219:
URL: https://github.com/apache/lucene/pull/13219
This replaces the `load`, `randomAccess` and `readOnce` flags with a
`ReadAdvice` enum, whose values are aligned with the allowed values to
(f|m)advise.
Closes #13211
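As a rough illustration of the idea described above (not the code from the PR), a single enum can stand in for several independent boolean flags, with each constant carrying the advice value handed to the native call. The names `ReadAdviceSketch` and `posixValue` are hypothetical, and the integers are the `posix_madvise` constants as defined on Linux/glibc; other platforms may use different values.

```java
/**
 * Hypothetical sketch: one enum instead of separate load / randomAccess /
 * readOnce booleans. Not the actual Lucene ReadAdvice type.
 */
enum ReadAdviceSketch {
  NORMAL(0),     // POSIX_MADV_NORMAL on Linux/glibc (assumed platform)
  RANDOM(1),     // POSIX_MADV_RANDOM
  SEQUENTIAL(2), // POSIX_MADV_SEQUENTIAL
  WILLNEED(3);   // POSIX_MADV_WILLNEED, roughly a "preload" hint

  /** Value that would be passed to the native advise call. */
  final int posixValue;

  ReadAdviceSketch(int posixValue) {
    this.posixValue = posixValue;
  }
}
```

With booleans, contradictory combinations (for example, both "random access" and "read once") have to be rejected at runtime; with an enum, exactly one advice is selected by construction.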
jpountz commented on PR #13216:
URL: https://github.com/apache/lucene/pull/13216#issuecomment-2020509119
> we now use ChecksumIndexInput as the file is fully readonce without
seeking. Because we removed the IOContext from directory's open method (as it
is hardcoded to readOnce), we do not
mikemccand commented on issue #13218:
URL: https://github.com/apache/lucene/issues/13218#issuecomment-2020482796
Hmm is there some way to mark issues as blocker for release? I want to make
sure we address this for 10.0.0.
mikemccand commented on code in PR #13206:
URL: https://github.com/apache/lucene/pull/13206#discussion_r1539246356
##
lucene/core/src/java/org/apache/lucene/store/NRTCachingDirectory.java:
##
@@ -73,6 +73,7 @@ public class NRTCachingDirectory extends FilterDirectory
implements
mikemccand commented on code in PR #13206:
URL: https://github.com/apache/lucene/pull/13206#discussion_r1539244396
##
lucene/core/src/java/org/apache/lucene/store/NRTCachingDirectory.java:
##
@@ -118,7 +119,9 @@ public synchronized void deleteFile(String name) throws
mikemccand commented on PR #13206:
URL: https://github.com/apache/lucene/pull/13206#issuecomment-2020444316
> > The bigger problem is when the file is still open in an NRTReader and
gets deleted.
>
> Hmm does it actually happen? I thought index files were ref-counted so
that files
mikemccand commented on code in PR #13206:
URL: https://github.com/apache/lucene/pull/13206#discussion_r1539235911
##
lucene/core/src/java/org/apache/lucene/store/NRTCachingDirectory.java:
##
@@ -73,6 +73,7 @@ public class NRTCachingDirectory extends FilterDirectory
implements
gf2121 opened a new pull request, #13217:
URL: https://github.com/apache/lucene/pull/13217
This PR proposes a new way to do numeric dynamic pruning with the following
changes:
* Instead of sampling and estimating point count to judge whether to build the
competitive iterator, this patch
mikemccand commented on code in PR #11888:
URL: https://github.com/apache/lucene/pull/11888#discussion_r1539192140
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnumFrame.java:
##
@@ -523,7 +526,9 @@ public void scanToSubBlock(long subFP) {
mikemccand commented on PR #11888:
URL: https://github.com/apache/lucene/pull/11888#issuecomment-2020383140
I like this idea! It seems like it'd especially help primary key lookup
against fixed length IDs like UUID?
Hmm, the QPS in the `luceneutil` runs are way too high (1000s of
jpountz commented on code in PR #13206:
URL: https://github.com/apache/lucene/pull/13206#discussion_r1539187790
##
lucene/core/src/java/org/apache/lucene/store/NRTCachingDirectory.java:
##
@@ -73,6 +73,7 @@ public class NRTCachingDirectory extends FilterDirectory
implements
jpountz opened a new pull request, #13216:
URL: https://github.com/apache/lucene/pull/13216
The variable-gaps terms format uses the legacy storage layout of storing
metadata at the end of the index file, and storing the start pointer of the
metadata as the last 8 bytes of the index files
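The layout described in this PR summary (metadata at the end of the file, with its start offset in the last 8 bytes) can be sketched with plain byte buffers. This is a minimal illustration only, assuming a big-endian 8-byte pointer and no header, footer, or checksum; the class and method names are hypothetical and this is not the actual terms index format.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of a "metadata last, pointer in the final 8 bytes" layout. */
final class TrailerPointerSketch {

  /** Builds [data][metadata][8-byte big-endian offset of metadata]. */
  static byte[] write(byte[] data, byte[] metadata) {
    ByteBuffer buf = ByteBuffer.allocate(data.length + metadata.length + Long.BYTES);
    buf.put(data);
    long metadataStart = buf.position(); // offset where the metadata begins
    buf.put(metadata);
    buf.putLong(metadataStart);          // trailer: pointer back to the metadata
    return buf.array();
  }

  /** Reads the pointer from the last 8 bytes, then jumps back to the metadata. */
  static byte[] readMetadata(byte[] file) {
    ByteBuffer buf = ByteBuffer.wrap(file);
    int pointerPos = file.length - Long.BYTES;
    long metadataStart = buf.getLong(pointerPos); // first jump: end of file
    byte[] metadata = new byte[(int) (pointerPos - metadataStart)];
    buf.position((int) metadataStart);            // second jump: back to metadata
    buf.get(metadata);
    return metadata;
  }
}
```

A reader of this layout has to jump to the end of the file and then back again before it can interpret anything, so the file cannot be consumed in one sequential, read-once pass.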
uschindler merged PR #13215:
URL: https://github.com/apache/lucene/pull/13215
mikemccand commented on code in PR #13206:
URL: https://github.com/apache/lucene/pull/13206#discussion_r1539130332
##
lucene/core/src/java/org/apache/lucene/store/NRTCachingDirectory.java:
##
@@ -73,6 +73,7 @@ public class NRTCachingDirectory extends FilterDirectory
implements
reschke commented on issue #11537:
URL: https://github.com/apache/lucene/issues/11537#issuecomment-2020222631
FWIW: Apache Jackrabbit Oak is stuck with Lucene 4.7.x for the time being.
That said, backporting the change to the 4.7.2 source seems to be trivial; see
uschindler opened a new pull request, #13215:
URL: https://github.com/apache/lucene/pull/13215
Our `MIGRATE.md` file in the documentation has many GitHub links now, but
they are not hotlinked like JIRA issue numbers. This adds support for
`GITHUB#xxx` and `GH-xxx` numbers like in
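One way such hotlinking could work is a regular-expression pass over the text. The sketch below is only an assumption about the approach, not the code from the PR: the class name `GithubLinkSketch`, the Markdown link output, and the choice of the `/issues/` URL form are all illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: turn GITHUB#123 / GH-123 references into Markdown links. */
final class GithubLinkSketch {

  // Matches "GITHUB#123" or "GH-123"; group 1 captures the number.
  private static final Pattern REF = Pattern.compile("\\b(?:GITHUB#|GH-)(\\d+)");

  static String linkify(String text) {
    Matcher m = REF.matcher(text);
    // In the replacement, $0 is the whole reference and $1 the captured number.
    return m.replaceAll("[$0](https://github.com/apache/lucene/issues/$1)");
  }
}
```

For example, `GithubLinkSketch.linkify("see GITHUB#13146")` wraps the reference in a link and leaves the surrounding text unchanged.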
gf2121 commented on PR #13199:
URL: https://github.com/apache/lucene/pull/13199#issuecomment-2020191817
Thanks for review @jpountz !
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific
gf2121 merged PR #13199:
URL: https://github.com/apache/lucene/pull/13199
uschindler merged PR #13205:
URL: https://github.com/apache/lucene/pull/13205
uschindler commented on PR #13206:
URL: https://github.com/apache/lucene/pull/13206#issuecomment-2020165033
So I think we are fine then. +1 to merge
uschindler commented on PR #13206:
URL: https://github.com/apache/lucene/pull/13206#issuecomment-2020164536
> > The bigger problem is when the file is still open in an NRTReader and
gets deleted.
>
> Hmm does it actually happen? I thought index files were ref-counted so
that files
ChrisHegarty commented on PR #13205:
URL: https://github.com/apache/lucene/pull/13205#issuecomment-2020109140
> Hi @ChrisHegarty,
> can you have another look at the crazy randomAccess flag after I merged
main into this. Especially the checks in the record's constructor should be
checked
jpountz commented on code in PR #13199:
URL: https://github.com/apache/lucene/pull/13199#discussion_r1538951789
##
lucene/core/src/java/org/apache/lucene/index/PointValues.java:
##
@@ -375,16 +375,23 @@ private void intersect(IntersectVisitor visitor,
PointTree pointTree)
ChrisHegarty merged PR #13214:
URL: https://github.com/apache/lucene/pull/13214
uschindler commented on PR #13196:
URL: https://github.com/apache/lucene/pull/13196#issuecomment-2020051067
Anyways we can open an issue to track what's going on on the JDK (listing
all relevant issue numbers like the above one).
uschindler commented on PR #13196:
URL: https://github.com/apache/lucene/pull/13196#issuecomment-2020039697
As discussed before, for implementing fadvise for reading/writing files, we
would need to write a full stack of IO layer natively (OutputStream for writing
and FileChannel for
uschindler commented on PR #13196:
URL: https://github.com/apache/lucene/pull/13196#issuecomment-2020036092
Unfortunately fadvise is at the moment close to impossible. Reason: we have no
file handle!
Chances are good that we also get a Java-based fadvise some time in the
future (e.g.,
jpountz commented on issue #13194:
URL: https://github.com/apache/lucene/issues/13194#issuecomment-2020035633
Closing this issue as won't fix, I imagine that madvise/fadvise is
considered a superior solution to direct I/O.
jpountz closed issue #13194: Should we fold DirectIODirectory into FSDirectory?
URL: https://github.com/apache/lucene/issues/13194
jpountz commented on PR #13196:
URL: https://github.com/apache/lucene/pull/13196#issuecomment-2020031076
@uschindler Should we open a separate issue for adding `fadvise` support to
`NIOFSDirectory`?
ChrisHegarty commented on PR #13196:
URL: https://github.com/apache/lucene/pull/13196#issuecomment-2020022627
I dunno what I was thinking, this is clearly not correct. I opened #13214 to
fix the test. ( apologies for the stupid test issues! )
uschindler commented on PR #13196:
URL: https://github.com/apache/lucene/pull/13196#issuecomment-2019973001
> > The test added by @ChrisHegarty sometimes fails on windows: It does not
close the file it opened for random access testing, so the directory can't be
deleted. Will fix this in a
ChrisHegarty commented on PR #13196:
URL: https://github.com/apache/lucene/pull/13196#issuecomment-2019923997
> The test added by @ChrisHegarty sometimes fails on windows: It does not
close the file it opened for random access testing, so the directory can't be
deleted. Will fix this in a
uschindler commented on PR #13196:
URL: https://github.com/apache/lucene/pull/13196#issuecomment-2019919116
I also removed the extra logging included during development from the main
branch. In 9.x the log message was adapted to list both features (together with
the sysprop to disable).
vsop-479 commented on PR #13183:
URL: https://github.com/apache/lucene/pull/13183#issuecomment-2019860075
You are right @vigyasharma. I will dig into them.
vsop-479 commented on PR #13192:
URL: https://github.com/apache/lucene/pull/13192#issuecomment-2019849096
@jpountz @mikemccand
Please take a look when you get a chance.
vsop-479 commented on PR #13192:
URL: https://github.com/apache/lucene/pull/13192#issuecomment-2019536235
Implemented binary search term non leaf (`allEqual` and `unEqual`).