CaptainDredge commented on issue #13079:
URL: https://github.com/apache/lucene/issues/13079#issuecomment-1928949795
cc: @mgodwan, @backslasht
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the s
CaptainDredge opened a new issue, #13079:
URL: https://github.com/apache/lucene/issues/13079
### Description
`numQueuedFlushes()` is a blocking function which gets called in
`DocumentWriter#preUpdate()` to check if there are any queued flushes. If
`checkPendingFlushOnUpdate` is disab
bjhexn commented on issue #13078:
URL: https://github.com/apache/lucene/issues/13078#issuecomment-1928929465
Android gradle
compileOptions {
sourceCompatibility = JavaVersion.VERSION_17
targetCompatibility = JavaVersion.VERSION_17
}
kotlinOptions {
bjhexn opened a new issue, #13078:
URL: https://github.com/apache/lucene/issues/13078
### Description
private fun createIndex(): IndexWriter {
val path = Path(getDir("lunece", 0).path)
val fsDirectory = FSDirectory.open(path)
val analyzer = Standar
AndreyBozhko commented on PR #13077:
URL: https://github.com/apache/lucene/pull/13077#issuecomment-1928773466
Thanks for the review @dungba88 - I added the javadoc as well (tried to
match the style of other javadocs in the file).
dungba88 commented on PR #12831:
URL: https://github.com/apache/lucene/pull/12831#issuecomment-1928655191
Thank you for merging @mikemccand !
risdenk commented on PR #2682:
URL: https://github.com/apache/lucene-solr/pull/2682#issuecomment-1928651515
Sorry for the delayed response @HoustonPutman, no need to wait for me.
AndreyBozhko opened a new pull request, #13077:
URL: https://github.com/apache/lucene/pull/13077
### Description
Since all the query terms must have the same field, the value is exposed
anyway via
```java
synonymQuery.getTerms().get(0).field()
```
but it's cleaner if o
uschindler commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1928027541
This is my final code:
```java
@Override
public int binaryHammingDistance(byte[] a, byte[] b) {
int distance = 0, i = 0;
for (final int upperBound = a.length
uschindler commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1927995019
I figured that the `& 0x` is useless. You only need it when widening
into int. Will update my branch and paste code here.
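One reading of the remark about masking, as a minimal sketch: when two `byte` values are XORed in Java they are first widened to `int` with sign extension, so the upper 24 bits can end up set and must be masked off before counting bits. When whole `int` or `long` words are compared, nothing is widened and no mask is needed. The class and method names below are hypothetical and are not taken from the PR.

```java
final class MaskSketch {
  // Without the mask: if the sign bits of a and b differ, the sign-extended
  // upper 24 bits of the XOR are all ones and get counted as well.
  static int unmasked(byte a, byte b) {
    return Integer.bitCount(a ^ b);
  }

  // With the mask: only the low 8 bits, i.e. the actual byte, are counted.
  static int masked(byte a, byte b) {
    return Integer.bitCount((a ^ b) & 0xFF);
  }

  public static void main(String[] args) {
    byte a = (byte) 0x80;
    byte b = 0x00;
    System.out.println(unmasked(a, b)); // 25: 24 sign-extension bits plus 1 real bit
    System.out.println(masked(a, b));   // 1
  }
}
```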
rmuir commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1927946636
Thanks @uschindler , this is the way to go: compiler does a good job. java
already has all the necessary logic here to autovectorize and use e.g.
`vpopcntdq` or AVX2 lookup-table counting
uschindler commented on PR #740:
URL: https://github.com/apache/lucene/pull/740#issuecomment-1927857893
Yes this was intentional. It breaks API.
uschindler commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1927824571
Here's my branch:
https://github.com/apache/lucene/compare/main...uschindler:lucene:binary_hamming_distance
I can merge this into this branch, but the code cleanup and removal o
uschindler commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1927790053
I removed the integer tail and have seen no difference (especially looked
also at the non-aligned sizes):
```java
@Override
public int binaryHammingDistance(byte[] a, b
mikemccand commented on PR #740:
URL: https://github.com/apache/lucene/pull/740#issuecomment-1927759464
@mocobeta just checking: it looks like this was never backported to 9.x (I
hit unexpected merge conflicts while backporting an FST change) -- was that
intentional? Were there API breaks
rmuir commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1927739974
Seems to autovectorize just fine, i took uwe's branch and dumped assembly on
my AVX2 machine and see e.g. 256-bit xor and population count logic. I checked
the logic in openjdk and it will
mikemccand commented on PR #12872:
URL: https://github.com/apache/lucene/pull/12872#issuecomment-1927677782
Thanks @gokaai -- I'll try to review soon!
If possible please try not to force-push: it removes the history of the past
commits and makes it harder to see what changed on this i
uschindler commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1927678421
I am not sure if we really need the Integer tail. Maybe only implement the
Long variant and the tail.
mikemccand commented on code in PR #12872:
URL: https://github.com/apache/lucene/pull/12872#discussion_r1478678480
##
lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java:
##
@@ -389,13 +386,25 @@ private static void parseSegmentInfos(
}
long totalDocs = 0;
dweiss closed issue #13073: java.lang.AssertionError in backward compat tests
("failed to parse ... as date")
URL: https://github.com/apache/lucene/issues/13073
dweiss merged PR #13075:
URL: https://github.com/apache/lucene/pull/13075
dweiss commented on PR #13075:
URL: https://github.com/apache/lucene/pull/13075#issuecomment-1927650998
I'll merge this in so that we can avoid jenkins failures. If there has to be
a follow-up, I'll open another issue.
uschindler commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1927626207
Hi,
I modified the scalar variant like that:
```java
@Override
public int binaryHammingDistance(byte[] a, byte[] b) {
int distance = 0, i = 0;
for (; i < a
uschindler commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1927511826
The native order PR was merged.
uschindler merged PR #888:
URL: https://github.com/apache/lucene/pull/888
mikemccand commented on code in PR #12345:
URL: https://github.com/apache/lucene/pull/12345#discussion_r1478503121
##
lucene/core/src/java/org/apache/lucene/index/ExitableIndexReader.java:
##
@@ -0,0 +1,539 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or
uschindler commented on PR #13001:
URL: https://github.com/apache/lucene/pull/13001#issuecomment-1927410583
I moved the changes entry to Lucene 10.0, as to me it makes no sense to
apply this to Lucene 9.x which is not used for active development. New code
enters main first and I had trouble
uschindler merged PR #13001:
URL: https://github.com/apache/lucene/pull/13001
uschindler closed issue #12946: Can we ban `Thread.sleep`?
URL: https://github.com/apache/lucene/issues/12946
shubhamvishu commented on PR #13001:
URL: https://github.com/apache/lucene/pull/13001#issuecomment-1927343001
@uschindler I have removed the added file. Thanks!
uschindler commented on PR #888:
URL: https://github.com/apache/lucene/pull/888#issuecomment-1927340132
I forgot about this PR, we should really apply it. #13076 is another
candidate that could make use of this.
uschindler commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1927329824
Hi,
I don't want to discuss the sense/nonsense of this distance, but the
implementation could be made very simple and then we may not even need to
have a Panama Vector variant
mikemccand merged PR #12831:
URL: https://github.com/apache/lucene/pull/12831
mikemccand commented on PR #12831:
URL: https://github.com/apache/lucene/pull/12831#issuecomment-1927303409
This is technically an API break, but `FSTCompiler` is an experimental API
and effectively an internal Lucene datastructure, so I think we can safely
backport to 9.x without deprecati
rmuir commented on PR #12753:
URL: https://github.com/apache/lucene/pull/12753#issuecomment-1927294356
@mikemccand I think it is heading in the right direction. There are a few
more tasks to do here I think though, e.g. we still need to update
`releaseWizard.py` and `smokeTestRelease.py`.
rmuir commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1927228480
even if it doesn't autovectorize, i suspect just gathering e.g. 4/8 bytes at
a time with BitUtil varhandle and using single int/long xor + popcount would
perform very well as a baseline.
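The baseline suggested above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: it uses the JDK's `MethodHandles.byteArrayViewVarHandle` as a stand-in for Lucene's `BitUtil` VarHandle, and the class name `HammingSketch` is hypothetical. It reads 8 bytes at a time as a `long`, XORs the two words, and sums `Long.bitCount`; a byte-wise loop handles the remainder.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

final class HammingSketch {
  // Views a byte[] as longs. Byte order is irrelevant when only counting bits,
  // so native order is used here.
  private static final VarHandle LONG_VIEW =
      MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.nativeOrder());

  static int binaryHammingDistance(byte[] a, byte[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("vector lengths differ");
    }
    int distance = 0;
    int i = 0;
    // Largest multiple of 8 that does not exceed a.length.
    final int upperBound = a.length & -Long.BYTES;
    for (; i < upperBound; i += Long.BYTES) {
      long x = (long) LONG_VIEW.get(a, i) ^ (long) LONG_VIEW.get(b, i);
      distance += Long.bitCount(x);
    }
    // Tail: remaining 0..7 bytes, masked because bytes are widened to int.
    for (; i < a.length; i++) {
      distance += Integer.bitCount((a[i] ^ b[i]) & 0xFF);
    }
    return distance;
  }

  public static void main(String[] args) {
    byte[] a = {(byte) 0xFF, 0, 0, 0, 0, 0, 0, 0, (byte) 0x80};
    byte[] b = new byte[9];
    System.out.println(binaryHammingDistance(a, b)); // 8 bits in the long word + 1 in the tail
  }
}
```

Keeping the loop body this simple is what gives the JIT a chance to autovectorize it, which is the point made elsewhere in this thread; whether it does so depends on the JDK version and CPU.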
rmuir commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1927213508
I'm confused about the use of lookup table. naively, i'd try to just xor +
popcnt:
https://docs.oracle.com/en/java/javase/21/docs/api/jdk.incubator.vector/jdk/incubator/vector/Vecto
mikemccand commented on code in PR #12547:
URL: https://github.com/apache/lucene/pull/12547#discussion_r1478386039
##
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/FloatTaxonomyFacets.java:
##
@@ -37,33 +37,43 @@ abstract class FloatTaxonomyFacets extends TaxonomyFacets
rmuir commented on code in PR #13076:
URL: https://github.com/apache/lucene/pull/13076#discussion_r1478382295
##
lucene/core/src/java20/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java:
##
@@ -576,4 +578,114 @@ private int squareDistanceBody128(byte[] a, byt
mikemccand commented on PR #12753:
URL: https://github.com/apache/lucene/pull/12753#issuecomment-1927152977
Is this ready to go? Thank you for all the hard work here @ChrisHegarty and
@rmuir!
benwtrent commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1927115056
@pmpailis could you also push a `CHANGES.txt` update? It would be under
`New Features` for `Lucene 9.10.0`
benwtrent commented on code in PR #13076:
URL: https://github.com/apache/lucene/pull/13076#discussion_r1478274441
##
lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:
##
@@ -214,4 +214,11 @@ public static float[] checkFinite(float[] v) {
}
return v;
}
+
+
jpountz commented on code in PR #13067:
URL: https://github.com/apache/lucene/pull/13067#discussion_r1478307290
##
lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestIndexSortBackwardsCompatibility.java:
##
@@ -82,14 +84,20 @@ public void testSortedIndexAddDocB
dweiss commented on PR #13075:
URL: https://github.com/apache/lucene/pull/13075#issuecomment-1927005366
I've moved that utility function to LineFileDocs and added a basic test case
to TestLineFileDocs (good idea). I feel tempted to normalize the date field's
value in LineFileDocs but this m
benwtrent commented on code in PR #12962:
URL: https://github.com/apache/lucene/pull/12962#discussion_r1478196742
##
lucene/join/src/java/org/apache/lucene/search/join/DiversifyingChildrenByteKnnVectorQuery.java:
##
@@ -24,15 +24,8 @@
import org.apache.lucene.index.LeafReaderCo
dweiss commented on PR #13075:
URL: https://github.com/apache/lucene/pull/13075#issuecomment-1926946487
> P.S.: Maybe add a quick test for the parsing logic to check that both
formats are accepted.
Maybe it'd be better to move this logic into LineFileDocs so that the date
field's val
hurutoriya commented on PR #12882:
URL: https://github.com/apache/lucene/pull/12882#issuecomment-1926941192
@janhoy Thank you for the suggestion. I've never done the QA.
> If not, we need to QA that the script is not broken by this change.
OK, let me try the QA 🙏 .
How should I do
pmpailis opened a new pull request, #13076:
URL: https://github.com/apache/lucene/pull/13076
This PR adds support for binary Hamming distance as a similarity metric for
byte vectors. The motivation behind this is that there is an increasing interest in
applying hashing techniques for embeddi
dweiss commented on PR #13075:
URL: https://github.com/apache/lucene/pull/13075#issuecomment-1926938848
> Looks fine. Why did you create a `Function` for
parsing instead of a simple static method? I think you did this to hide the
formatter instances?
Yes, correct.
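The design choice discussed above, returning a `Function` so that the formatter instances stay hidden, might look roughly like this. It is a sketch only: the class and method names are hypothetical, and the second date pattern (`dd/MM/yyyy`) is an assumed placeholder, since the thread does not show the actual enwiki format in full.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;
import java.util.function.Function;

final class DateParserSketch {
  // Returns a parser that accepts two formats. The formatters are captured by
  // the lambda and are not reachable by callers.
  static Function<String, LocalDate> lenientDateParser() {
    final DateTimeFormatter iso = DateTimeFormatter.ISO_LOCAL_DATE; // e.g. 2004-03-30
    final DateTimeFormatter alt =
        DateTimeFormatter.ofPattern("dd/MM/yyyy", Locale.ROOT); // assumed second format
    return s -> {
      try {
        return LocalDate.parse(s, iso);
      } catch (DateTimeParseException e) {
        // Fall back to the second format; a failure here propagates to the caller.
        return LocalDate.parse(s, alt);
      }
    };
  }

  public static void main(String[] args) {
    Function<String, LocalDate> parser = lenientDateParser();
    System.out.println(parser.apply("2004-03-30"));
    System.out.println(parser.apply("30/03/2004"));
  }
}
```

A plain static method would work equally well; the `Function` form only keeps the two formatters out of the enclosing class's namespace.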
uschindler commented on PR #13075:
URL: https://github.com/apache/lucene/pull/13075#issuecomment-1926758404
P.S.: Maybe add a quick test for the parsing logic to check that both
formats are accepted.
uschindler commented on code in PR #13075:
URL: https://github.com/apache/lucene/pull/13075#discussion_r1478028235
##
lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestIndexSortBackwardsCompatibility.java:
##
@@ -147,6 +150,36 @@ public void testSortedIndex()
dweiss commented on issue #13073:
URL: https://github.com/apache/lucene/issues/13073#issuecomment-1926679038
There was actually just a single assertion that caused the problems. I've
just removed it since it seems to be duplicated anyway with a term (body:the)
that exists in both data sets.
dweiss commented on issue #13073:
URL: https://github.com/apache/lucene/issues/13073#issuecomment-1926670843
@s1monw - some of those assertions added in #13046 only hold for the
built-in europarl. I fixed date parsing but I'm not sure how to deal with the
problem that the line resource can
dweiss commented on PR #13075:
URL: https://github.com/apache/lucene/pull/13075#issuecomment-1926668465
This makes date parsing accept both europarl and enwiki. The tests still
assume the docs come from europarl though and fail on the large
enwiki.random.lines.txt:
```
gradlew -p luce
vsop-479 commented on PR #11888:
URL: https://github.com/apache/lucene/pull/11888#issuecomment-1926589435
@jpountz
Can we push on this change by checking whether our test cases cover all
the statuses that `TermsEnum.seekExact` or `TermsEnum.seekCeil` may emit?
dweiss commented on issue #13073:
URL: https://github.com/apache/lucene/issues/13073#issuecomment-1926587718
The embedded "linedocsfile" (europarl) has a different date field format
compared to the "large" enwiki used on jenkins.
```
(1) 2004-03-30 Istituzioni europee proteggerl
dweiss opened a new issue, #13073:
URL: https://github.com/apache/lucene/issues/13073
### Description
These failures do reproduce but you need the linedocsfile. I'll take a look.
### Version and environment details
_No response_