On Tue, Mar 5, 2024 at 4:50 AM Chris Hegarty
wrote:
> It appears that there is no GH activity for 2024! Clearly this is incorrect.
> I’ve yet to track down what’s going on with this. Familiar to anyone here?
>
Last time I looked at this, it appeared it is looking at the incorrect
github
On Thu, Feb 15, 2024 at 9:54 AM Uwe Schindler wrote:
>
> Hi,
>
> My Python knowledge is too limited to fix the build script to allow testing
> the smoker with arbitrary JAVA_HOME directories next to the baseline (Java
> 11). With lots of copypaste I can make it run on Java 21 in addition to 17,
I don't understand the use of the word corruption, isn't it just a bug in
intersect() that only affects wildcards etc? e.g. it's not gonna merge
into new segments or impact written data in any way.
And i don't think we should rush out some bugfix release without any
test for this?
On Sat, Dec 9, 2023
d in practice? I
> was assuming that if there is not a lot of new indexed content and not a lot
> of older documents being deleted, large older segment might never have to be
> merged.
>
>
> On Tue 28 Nov 2023 at 20:53, Robert Muir wrote:
>>
>> I don't think th
I don't think there's any problem with GDPR, and I don't think users
should be running unnecessary "optimize". GDPR just says data should
be erased without "undue" delay. waiting for a merge to nuke the
deleted docs isn't "undue", there is a good reason for it.
On Tue, Nov 28, 2023 at 2:40 PM
at 1:13 PM Robert Muir wrote:
>
> For visually confusing characters we have the option to expose specific
> processing for that, e.g.
> https://unicode-org.github.io/icu-docs/apidoc/dev/icu4j/com/ibm/icu/text/SpoofChecker.html#getSkeleton-java.lang.CharSequence-
>
> Maybe there are u
For visually confusing characters we have the option to expose specific
processing for that, e.g.
https://unicode-org.github.io/icu-docs/apidoc/dev/icu4j/com/ibm/icu/text/SpoofChecker.html#getSkeleton-java.lang.CharSequence-
Maybe there are use-cases for a search engine, e.g. find me documents
with
de sitting on the shelf for years). Run "git blame
lucene/CHANGES.txt" if you think I am crazy. Here's a change I made
nearly two years ago, it just sits on the shelf.
84e4b85b094c lucene/CHANGES.txt (Robert Muir
2021-12-07 21:39:13 -0500 14) * LUCENE-10010: AutomatonQuery,
CompiledA
On Mon, Nov 6, 2023 at 4:22 AM Chris Hegarty
wrote:
>
> Hi,
>
> Great discussion, I agree with all that you have said. And that we will have
> to deal with the intricacies of the MR-JAR regardless of the outcome here,
> which is doable.
>
> I would very much like to avoid supporting Java 17
ed to spam.
On Sat, Nov 4, 2023 at 8:36 AM Mike Drob wrote:
>
> We all agree on using Java though, and using a specific version, and even the
> style output from gradle tidy. Is that nanny state or community consensus?
>
> On Sat, Nov 4, 2023 at 7:29 AM Robert Muir wrote:
>>
example of a nanny state IMO, trying to dictate what git commands to
use, or what editor to use. Maybe this works for you in your corporate
hellholes, but I think some folks have a bit of a power issue, are
accustomed to dictating this stuff to their employees and so on, but
this is open-source.
>
> Ooh, thank you Dawid! And it's now merged, so we now have a decent timeout
> protection, so if a bad actor tries to crypto mine or run some distributed
> LLM or whatever, at least the wasted resources are bounded by how long a
> "typical" legitimate run takes, plus generous buffer. So
what will happen on windows?
sorry, could not resist.
On Thu, Oct 19, 2023 at 9:48 AM Michael McCandless
wrote:
>
> Hi Team,
>
> Today, Lucene's Directory abstraction does not allow opening an IndexInput on
> a file until the file is fully written and closed via IndexOutput. We
> enforce
I think running the builds with a timeout is a good thing to do
anyway, for any CI build. I'm sure github actions has some fancy yaml
for that, but you can just do "timeout -k 1m 1h ./gradlew..." instead
of "./gradlew" too.
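Where the CI wrapper is a script rather than a shell one-liner, the same bound can be sketched with the Python standard library. This is an illustration only: `run_with_timeout` is a hypothetical helper, not part of the Lucene build, and `subprocess.run` kills only the direct child when the limit expires, so it is a rough analogue of `timeout -k` rather than an exact replacement.

```python
import subprocess
import sys


def run_with_timeout(cmd, limit_seconds):
    """Run cmd; return its exit code, or None if it exceeded limit_seconds.

    Hypothetical helper for illustration. subprocess.run() kills the direct
    child when the timeout expires, but not grandchildren it may have spawned.
    """
    try:
        completed = subprocess.run(cmd, timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


if __name__ == "__main__":
    # Stand-in commands; a real job would pass its own build command and a
    # limit such as 60 * 60 seconds.
    fast = [sys.executable, "-c", "pass"]
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    for cmd in (fast, slow):
        code = run_with_timeout(cmd, 0.5 if cmd is slow else 30)
        print("timed out" if code is None else f"exit code {code}")
```

Returning `None` on timeout keeps "ran too long" distinct from any real exit code, which a CI job would probably want to report separately.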
On Mon, Oct 16, 2023 at 9:58 AM Michael McCandless
wrote:
>
> When a
this leniency when
binding to port 0. nuke it.
On Thu, Aug 31, 2023 at 8:46 AM Robert Muir wrote:
>
> probably a bug in some jvm sockets code that called accept() in its
> default blocking mode, when there wasn't any connection to accept? in
> that case accept() call will just block and wait
probably a bug in some jvm sockets code that called accept() in its
default blocking mode, when there wasn't any connection to accept? in
that case accept() call will just block and wait for someone to make a
new connection.
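The blocking behaviour described above is easy to reproduce outside the JVM. The sketch below uses Python's `socket` module purely for illustration (the helper names are hypothetical); Java's `ServerSocket.accept()` likewise blocks by default until a client connects. Setting a timeout turns the indefinite wait into a bounded one.

```python
import socket


def make_listener():
    """Listening TCP socket on an ephemeral loopback port (port 0)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    return server


def accept_with_deadline(server, seconds):
    """Return an accepted connection, or None if nobody connects in time."""
    server.settimeout(seconds)  # without a timeout, accept() waits indefinitely
    try:
        conn, _addr = server.accept()
        return conn
    except socket.timeout:
        return None
```

With no client connecting, `accept_with_deadline(make_listener(), 0.2)` gives up after the deadline instead of hanging, which is the failure mode suspected in the message.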
On Thu, Aug 31, 2023 at 8:16 AM Dawid Weiss wrote:
>
>
>
ads to
>>>> https://pastebin.com/kkggV9Vx
>>>>
>>>> Now, the test vectors in that pastebin do not match either the output of
>>>> pre-change Lucene's murmur3, nor the output of the Python mmh3 package.
>>>> That said, the pre-change Lucen
Consulting | Training | Open Source
>
> Website: Sease.io <http://sease.io/>
> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter
> <https://twitter.com/seaseltd> | Youtube
> <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github
he Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Tue, May 16, 2023 at 9:50 PM Robert Muir wrote:
>
>> by the way, i agree with the idea to MOVE THE LIMIT UNCHANGED to the
>> hnsw-specific code.
>>
>> This way, someone can write a
t performs.
On Tue, May 16, 2023 at 8:53 PM Robert Muir wrote:
> Gus, I think i explained myself multiple times on issues and in this
> thread. the performance is unacceptable, everyone knows it, but nobody is
> talking about it.
> I don't need to explain myself time and time again
ses. If people hit a hard limit, more of them give up
> and never develop the code that will motivate them to look for
> optimizations.
>
> -Gus
>
> On Tue, May 16, 2023 at 6:04 AM Robert Muir wrote:
>
>> i still feel -1 (veto) on increasing this limit. sending more em
i still feel -1 (veto) on increasing this limit. sending more emails does
not change the technical facts or make the veto go away.
On Tue, May 16, 2023 at 4:50 AM Alessandro Benedetti
wrote:
> Hi all,
> we have finalized all the options proposed by the community and we are
> ready to vote for
I remember the benefits from Terms.intersect being pretty huge. Rather
than simple ping-pong, the whole monster gets handed off directly to
the codec's term dictionary implementation. For the default terms
dictionary using blocktree, this saves time seeking to terms you don't
care about (because
The better solution is to use Terms.intersect. Then the postings
format can do the right thing. But this query doesn't use
Terms.intersect today, instead doing ping-ponging itself.
That's the problem.
We must *not* tune our algorithms for amazon's search but instead what
is the best for users
On Tue, May 2, 2023 at 3:24 PM Michael Froh wrote:
>
> > This seems ok if it isn't invasive. I still feel like something is
> > "off" if you are seeing GC time from 1KB-per-segment allocation. Do
> > you have way too many segments?
>
> From what I saw, it's 1KB per "leaf query" to create the
On Tue, May 2, 2023 at 2:34 PM Robert Muir wrote:
>
> On Tue, May 2, 2023 at 12:49 PM Michael Froh wrote:
> >
> > Hi all,
> >
> > I was looking into a customer issue where they noticed some increased GC
> > time after upgrading from Lucene 7.x to 9.x. After t
On Tue, May 2, 2023 at 12:49 PM Michael Froh wrote:
>
> Hi all,
>
> I was looking into a customer issue where they noticed some increased GC time
> after upgrading from Lucene 7.x to 9.x. After taking some heap dumps from
> both systems, the big difference was tracked down to the float[256]
> of this thread was a friendly request to please point me to instructions for
> running a broad range of Lucene indexing benchmarks, so I can gather data for
> further discussion; from my perspective, we haven't even gathered any data,
> so obviously we haven't seen an
va that test strings of length
> greater than 8, and my change passes them. Could you explain what you want
> tested?
>
> Cheers,
> Thomas
>
> On Tue, Apr 25, 2023 at 4:21 PM Robert Muir wrote:
>>
>> sure, but "if length > 8 return 1" might pass these same
producing that data.
>
> Cheers,
> Thomas
>
>
>
> On Tue, Apr 25, 2023 at 4:02 PM Robert Muir wrote:
>>
>> well there is some cost, as it must add additional checks to see if
>> its longer than 8. in your patch, additional loops. it increases the
>>
to a crawl.
On Tue, Apr 25, 2023 at 9:56 AM Thomas Dullien
wrote:
>
> Ah, I see what you mean.
>
> You are correct -- the change will not speed up a 5-byte word, but it *will*
> speed up all 8+-byte words, at no cost to the shorter words.
>
> On Tue, Apr 25, 2023 at 3:20
4
> isn't quite enough?
>
> Cheers,
> Thomas
>
> On Tue, Apr 25, 2023 at 3:07 PM Robert Muir wrote:
>>
>> i think from my perspective it has nothing to do with cpus being
>> 32-bit or 64-bit and more to do with the average length of terms in
>> most languages
slower
> indexing on (the dwindling) 32 bit CPUs?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Apr 25, 2023 at 7:39 AM Robert Muir wrote:
>>
>> I think the results of the benchmark will depend on the properties of
>> the indexed ter
I think the results of the benchmark will depend on the properties of
the indexed terms. For english wikipedia (luceneutil) the average word
length is around 5 bytes so this optimization may not do much.
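Whether an optimization aimed at terms of 8 bytes or more pays off can be estimated from the term-length distribution of the corpus at hand. The following is a rough sketch under stated assumptions: `term_length_stats` is a hypothetical helper, the 8-byte threshold is taken from this thread, and the whitespace tokenization in the usage example is a stand-in for a real analyzer, not what luceneutil does.

```python
def term_length_stats(terms, threshold=8):
    """Return (average UTF-8 byte length, fraction of terms >= threshold bytes)."""
    lengths = [len(t.encode("utf-8")) for t in terms]
    if not lengths:
        return 0.0, 0.0
    average = sum(lengths) / len(lengths)
    long_fraction = sum(1 for n in lengths if n >= threshold) / len(lengths)
    return average, long_fraction


if __name__ == "__main__":
    # Whitespace splitting is only an assumption made for this example.
    sample = "the quick brown fox jumps over the lazy dog".split()
    avg, frac = term_length_stats(sample)
    print(f"average bytes: {avg:.2f}, share >= {8} bytes: {frac:.0%}")
```

Lengths are measured in encoded bytes rather than characters, since the message talks about byte length.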
On Tue, Apr 25, 2023 at 1:58 AM Patrick Zhai wrote:
>
> I did a quick run with your patch,
>
> Yes that's true, I just have to add: You can still open a NRT reader
> directly from IndexWriter. But you don't need a sequence number there as
> its hidden completely. So flushing is fine to allow users to get a new
> NRT reader with the state up to that point, but it does not need to
> return
This is not true: if i call IndexWriter.commit, then i can open an
indexreader and see the documents.
IndexWriter.flush doesn't do anything at all, really, just moves stuff
from RAM to disk but not in a way that indexreader can see it or
anything, right?
It doesn't make much sense that this
ne here is saying we
> won't address it, it's just a separate discussion.
>
>
> On Sun, 9 Apr 2023, 12:59 Robert Muir, wrote:
>>
>> Also, please let's only discuss SEARCH. lucene is a SEARCH ENGINE
>> LIBRARY. not a vector database or whatever trash is being proposed
>
with basically anyone on this thread because they are all
stating crazy things that don't make sense.
On Sun, Apr 9, 2023 at 6:25 AM Robert Muir wrote:
>
> Yes, it's very clear that folks on this thread are ignoring reason
> entirely and completely swooned by chatgpt-hype.
> And what
mance,
>> > they may contribute improvements.
>> > This is how you make progress.
>> >
>> > If it's a reputation thing, trust me that not allowing users to play with
>> > high dimensional space will equally damage it.
>> >
>> >
> Then you complain about people not meeting you half way. Wow
>
> On Sat, Apr 8, 2023, 12:40 PM Robert Muir wrote:
>>
>> On Sat, Apr 8, 2023 at 8:33 AM Michael Wechner
>> wrote:
>> >
>> > What exactly do you consider reasonable?
>>
>> Let'
On Sat, Apr 8, 2023 at 8:33 AM Michael Wechner
wrote:
>
> What exactly do you consider reasonable?
Let's begin a real discussion by being HONEST about the current
status. Please put political correctness or your own company's wishes
aside, we know it's not in a good state.
Current status is the
ve this without prior
>> knowledge of the vectors. Faiss has a nice implementation that fits
>> naturally with Lucene called IVF (
>> https://faiss.ai/cpp_api/struct/structfaiss_1_1IndexIVF.html)
>> but if we want to avoid running kmeans on every merge we’d require t
>> > one per cluster. In our case the vectors in each segment could belong to
>> > different cluster so I don’t see how we could merge them efficiently.
>> >
>> > On Fri, 7 Apr 2023 at 22:28, jim ferenczi wrote:
>> >>
>> >> The inference time (an
>> Regarding the ram buffer, we could drastically reduce the size by writing
>> the vectors on disk instead of keeping them in the heap. With 1k dimensions
>> the ram buffer is filled with these vectors quite rapidly.
>>
>> On Fri, 7 Apr 2023 at 21:59, Robert Muir wrote:
>>
On Fri, Apr 7, 2023 at 5:13 PM Benjamin Trent wrote:
>
> From all I have seen when hooking up JFR when indexing a medium number of
> vectors (1M+), almost all the time is spent simply comparing the vectors
> (e.g. dot_product).
>
> This indicates to me that another algorithm won't really help
On Fri, Apr 7, 2023 at 7:47 AM Michael Sokolov wrote:
>
> 8M 1024d float vectors indexed in 1h48m (16G heap, IW buffer size=1994)
> 4M 2048d float vectors indexed in 1h44m (w/ 4G heap, IW buffer size=1994)
>
> Robert, since you're the only on-the-record veto here, does this
> change your thinking
> users and internal data structure optimizations, if any.
>
>
> On Wed, 5 Apr 2023, 18:54 Robert Muir, wrote:
>>
>> I'd ask anyone voting +1 to raise this limit to at least try to index
>> a few million vectors with 756 or 1024, which is allowed today.
>>
>
I'd ask anyone voting +1 to raise this limit to at least try to index
a few million vectors with 756 or 1024, which is allowed today.
IMO based on how painful it is, it seems the limit is already too
high, I realize that will sound controversial but please at least try
it out!
voting +1 without
+1 to release, thank you for volunteering to be RM!
I went thru 9.5 section of CHANGES.txt and tagged all the GH issues in
there with milestone too, if they didn't already have it. It looks
even bigger now.
On Fri, Jan 13, 2023 at 4:54 AM Luca Cavanna wrote:
>
> Hi all,
> I'd like to propose
>> segments etc) and can pick up from there. You would need a mechanism
>> to replay the writes the primary never had a chance to commit.
>>
>> On Fri, Dec 16, 2022 at 5:41 AM Robert Muir wrote:
>> >
>> > You are still talking "Multiple writers".
ode (main indexer) is down, how would we recover with
> a back up indexer?
>
> Thanks
> Patrick
>
>
> On Thu, Dec 15, 2022 at 7:16 PM Robert Muir wrote:
>
> > This multiple-writer isn't going to work and customizing names won't
> > allow it anyway. Each file
This multiple-writer isn't going to work and customizing names won't
allow it anyway. Each file also contains a unique identifier tied to
its commit so that we know everything is intact.
I would look at the segment replication in lucene/replicator and not
try to play games with files and mixing
Hi Gennadiy,
The lucene project has migrated from JIRA to Github Issues for issue tracking.
Please create an issue here: https://github.com/apache/lucene/issues
On Wed, Dec 7, 2022 at 11:23 AM Gennadiy Vaysman
wrote:
>
> Hello, Lucene developers,
>
> My email below to iss...@lucene.apache.org
Multiple spatial tests are failing in jenkins... bisected them to this commit.
Can you please look into it? https://github.com/apache/lucene/issues/11956
On Sat, Nov 19, 2022 at 8:22 PM wrote:
>
> This is an automated email from the ASF dual-hosted git repository.
>
> kwright pushed a commit to
I think he is running this from jenkins job. I suspect agents have
"stacked up" over time; take a look with "ps". Every time i run the
smoketester, it "leaks" at least an agent or two.
On Fri, Nov 18, 2022 at 9:48 AM Adrien Grand wrote:
>
> Reading Uwe's error message more carefully, I had
+1
SUCCESS! [1:16:29.706409]
On Thu, Nov 17, 2022 at 9:18 AM Adrien Grand wrote:
>
> Please vote for release candidate 1 for Lucene 9.4.2
>
> The artifacts can be downloaded from:
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.4.2-RC1-rev-858d9b437047a577fa9457089afff43eefa461db
>
>
if your machine is really 12 cores and 64GB ram but is that slow, then
uninstall that windows shit immediately, that's horrible.
On Thu, Nov 17, 2022 at 5:46 AM Karl Wright wrote:
>
> Thanks - the target I was using was the complete "build" target on the whole
> project. This will be a
> I plan on starting the release process tomorrow if there are no objections.
>
> On Fri, Nov 11, 2022 at 4:22 PM Robert Muir wrote:
>>
>> These are the 9.4.2 completed issues:
>>
>> https://github.com/apache/lucene/pull/11905 <-- bug and associated monster
>
x on main and checked that it works with error prone, in
> process compilation and alt javac. But double checking would be probably
> good. :)
>
> Dawid
>
> On Wed, Nov 16, 2022 at 12:18 AM Robert Muir wrote:
>>
>> It is my fault. I will revert my changes and test with "
te:
> >
> > +1 from me for a bugfix release once we've solidified testing. Thanks to
> > everyone working on improving tests and static analysis -- this now is our
> > second time encountering a bad arithmetic bug and it's important to get
> > ahead of these issues
Friday is also a public holiday here,
> celebrating the end of World War 1. :)
>
> On Wed, Nov 9, 2022 at 4:41 PM Robert Muir wrote:
>>
>> Can we please have a few days to improve the test situation? I think
>> we need to beef up checkindex to exercise seek() on the
Can we please have a few days to improve the test situation? I think
we need to beef up checkindex to exercise seek() on the vectors, also
we need to look at static analysis to try to find other similar bugs.
This would help prevent "whack-a-mole" and improve correctness going forwards.
I want to
I think deferring the advance call like this is fine and harmless,
only because this DoubleValues "caches" the result for the current
doc, so it's idempotent anyway.
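The idempotence argument can be illustrated with a small stand-in. `CachingValues` below is a hypothetical class written for this sketch, not Lucene's `DoubleValues` implementation: it recomputes only when the document changes, so a repeated or deferred advance to the same document does no extra work and yields the same value.

```python
class CachingValues:
    """Hypothetical per-document value source that caches the current doc."""

    def __init__(self, compute):
        self._compute = compute   # function: doc id -> float
        self._doc = None          # doc id of the cached value
        self._value = None
        self.computations = 0     # counts real computations, for illustration

    def advance_exact(self, doc):
        if doc != self._doc:      # only recompute when the doc changes
            self._value = self._compute(doc)
            self._doc = doc
            self.computations += 1
        return True

    def double_value(self):
        return self._value
```

Calling `advance_exact(4)` twice in a row leaves `computations` at 1, which is the property that makes deferring the call harmless.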
Yes, about "advancing all the operands" as I mentioned, expressions
has no clue about this. If you wanted to change it, you'd have
IIRC the expressions module acts like a simple scripting engine where it just
compiles bytecode for your expression and you are able to bind variables
that you pass to the method... I don't know of an easy way to do this.
On Tue, Oct 25, 2022, 1:13 PM Michael Sokolov wrote:
>
ore)
The slowest suites (exceeding 1s) during this run:
8512.27s TestManyKnnVectors (:lucene:core)
BUILD SUCCESSFUL in 2h 22m 55s
19 actionable tasks: 13 executed, 6 up-to-date
On Thu, Oct 20, 2022 at 3:57 PM Robert Muir wrote:
>
> Thank you Julie for the draft test! I will try to repro
Thank you Julie for the draft test! I will try to reproduce/test with it.
On Thu, Oct 20, 2022 at 3:45 PM Julie Tibshirani wrote:
>
> Thank you Ignacio for taking over as release manager! I ran into some issues
> with my signing key and Ignacio saved the day.
>
> Robert, I understand your
+0 SUCCESS! [0:39:31.979476]
I say +0 instead of +1, because i am still worried that we release
with a bugfix without any test.
I am happy to change vote to a +1 if we even have a hacky test in a
draft PR. the release artifacts don't need to contain such a test or
anything like that. i just want
oco/
On Wed, Oct 5, 2022 at 8:58 AM Patrick Zhai wrote:
>
> Makes sense to me, I'll try to look into it!
>
> On Tue, Oct 4, 2022, 16:50 Robert Muir wrote:
>>
>> We already have code coverage integrated into the build. See the
>> documentation on how to generate the report
rwise I can try it a little bit with
> my own repo first and then try to add it to lucene.
>
> Best
> Patrick
>
>
>
> On Tue, Oct 4, 2022, 06:36 Robert Muir wrote:
>>
>> btw, you can look at the current reports created by jenkins here:
>> https://ci-builds.ap
btw, you can look at the current reports created by jenkins here:
https://ci-builds.apache.org/job/Lucene/job/Lucene-Coverage-main/lastBuild/jacoco/
On Tue, Oct 4, 2022 at 6:51 AM Robert Muir wrote:
>
> we can run the tests with coverage option and produce coverage graph
> from t
we can run the tests with coverage option and produce coverage graph
from the github actions, but need to look at the docs to see where to
put it so it will be available.
I want us to be careful about the word "check" as I'm adamantly
against any such automated check (e.g. coverage > N%) in the
Github number).
>
> Uwe
>
> Am 30.09.2022 um 09:51 schrieb Robert Muir:
> > I've seen this failure before here:
> > https://github.com/apache/lucene/issues/11754
> >
> > From what I remember, seems something blows up with the multiplier
> > that causes the usag
I've seen this failure before here:
https://github.com/apache/lucene/issues/11754
From what I remember, seems something blows up with the multiplier
that causes the usage.
On Fri, Sep 30, 2022 at 3:17 AM Uwe Schindler wrote:
>
> Hi,
>
> I have never seen this before. It looks like something in
the 'gradlew -q javaToolChains' command is useful to see which JVMs
gradle knows about.
On Tue, Sep 27, 2022 at 3:34 PM Uwe Schindler wrote:
>
> You just need to recreate Gradle properties, e.g. by deleting the old file.
>
> If you do not change anything Gradle will just work. On first build it
+1
Smoketester works for me again without hassles, thanks Uwe.
I tested both java 11 and java 17.
SUCCESS! [2:49:13.336252]
P.S. It would be a nice option in the future to be able to test other
versions that we have MR-jar'd code for (e.g. 19 in this case).
On Tue, Sep 27, 2022 at 9:15 AM
+1
Ran the smoketester with both java 11 and 17:
SUCCESS! [2:41:19.024193]
On Tue, Sep 20, 2022 at 10:10 PM Michael Sokolov wrote:
>
> Please vote for release candidate 1 for Lucene 9.4.0
>
> The artifacts can be downloaded from:
>
Can also potentially avoid them and reduce the amount of back-n-forth
by pulling from the ultimate URL instead of redirecting around:
https://raw.githubusercontent.com/gradle/gradle/v7.3.3/gradle/wrapper/gradle-wrapper.jar
On Tue, Sep 13, 2022 at 3:20 AM Dawid Weiss wrote:
>
> These 500/503s are
Take a look here for the older ones:
https://cwiki.apache.org/confluence/display/LUCENE/Release+Notes
On one hand you have to deal with confluence, but using the wiki has
the advantage that other ppl can edit it. So you can basically
copy-paste from a previous one as a template and enlist help
thanks for fixing!
On Wed, Aug 31, 2022 at 2:43 PM Michael Sokolov wrote:
>
> Oh -- sorry, I guess I forgot to backport. Thanks for tracking it down
> - I'll push to branch_9x shortly
>
> On Wed, Aug 31, 2022 at 10:25 AM Robert Muir wrote:
> >
> > can we backport to 9
maybe the OOMKiller kicked in.
On Wed, Aug 31, 2022 at 3:06 PM Dawid Weiss wrote:
>
>
> I think Lucene tests killed the job runner. :)
>
> > Task :lucene:analysis:nori:spotlessJavaCheck
> > Task :lucene:analysis:nori:spotlessCheck
> FATAL: command execution failed
> java.io.IOException: Backing
can we backport to 9.x if you get a chance? I'm still seeing this test
trip in 9.x jenkins builds.
On Mon, Aug 29, 2022 at 11:50 AM wrote:
>
> This is an automated email from the ASF dual-hosted git repository.
>
> sokolov pushed a commit to branch main
> in repository
On Thu, Aug 25, 2022 at 9:47 AM Michael Sokolov wrote:
>
> I agree; I've always used CHANGES for a quick historical view. What
> about the release manager use case? I haven't done a release, but I
> think we generally want to know if people are targeting changes for an
> upcoming release,
On Thu, Aug 25, 2022 at 6:11 AM Michael Sokolov wrote:
>
> The milestone looks appealing since it is prominent and relatively easy to
> use. The only drawback I have heard is that it is single valued. It still
> seems we could use it to document the first version in which something is
>
On Wed, Aug 24, 2022 at 11:40 AM Uwe Schindler wrote:
>
> Hi,
>
> this is the MacOS virtualbox. This one often has time shifts caused by
> Virtualbox and the NTP daemon of OSX is bullshit (no chrony).
>
> Actually earlier versions of MacOS had a bug in their OS libc
> segfaulting the app to crash
would indeed have to be significant
> for this to fail (and in the middle of the process?!). Anyway, I'll
> look into this - thanks for the pointer!
>
> Dawid
>
> On Wed, Aug 24, 2022 at 1:39 PM Robert Muir wrote:
> >
> > Hi Dawid, I looked at this and also
>
Hi Dawid, I looked at this and also https://github.com/apache/lucene/issues/7687
If you look at the instances and how sporadic they are, the problem
could be caused by TimeoutSuite using wall-clock time in
com.carrotsearch.randomizedtesting? Especially in virtual machines,
wall-clock time can be
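As a generic illustration of the concern (not a description of how com.carrotsearch.randomizedtesting actually implements its timeouts), a deadline measured on a monotonic clock is unaffected by wall-clock adjustments such as NTP corrections in a virtual machine. `Deadline` is a hypothetical helper; in Java the usual monotonic source is `System.nanoTime()`.

```python
import time


class Deadline:
    """Timeout tracker based on a monotonic clock (hypothetical helper)."""

    def __init__(self, seconds, clock=time.monotonic):
        # time.monotonic() never goes backwards and ignores system clock
        # changes, unlike time.time().
        self._clock = clock
        self._expires_at = clock() + seconds

    def expired(self):
        return self._clock() >= self._expires_at
```

The `clock` parameter exists only so the behaviour can be exercised with a fake clock in tests.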
On Thu, Aug 18, 2022 at 1:47 PM Alexander Lukyanchikov
wrote:
>
> Currently we are trying to avoid switching to MMAP because there is another
> process running on the same host and extensively utilizes the FS cache.
>
This makes no sense, NIOFSDirectory uses the FS cache the exact same
way as
rge majority of issues.
> Don't we have to gain the consent of each individual to map both accounts?
>
No, we don't have to ask permission to mention someone with an @username
> On Sat, Jun 18, 2022 at 18:52 Robert Muir wrote:
> >
> > I looked at some related projects on github:
> > h
sue. (thus obfuscating/splitting
>> >> > > > > the very important rich history across systems).
>> >> > > > >
>> >> > > > > So that's why I feel issues should be completely tracked in the
>> >> > > > > system where
On Fri, Jun 17, 2022 at 3:27 PM Dawid Weiss wrote:
>
> I'd be more afraid of what happens to github issues in two years (or longer).
> Will it look the same? Will it be different? Will it be gone (and how do we
> get a backup of the issue history then)? Contrary to the apache-hosted Jira,
>
On Fri, Jun 17, 2022 at 12:08 PM Michael McCandless
wrote:
>
> I agree the embedded links are tricky. Not sure whether we could do a big
> rewrite of those links or not ... seems a chicken/egg situation. We could 1)
> append a forwarding link comment on the Jira issue to its GitHub version,
On Tue, Jun 14, 2022 at 10:37 AM Michael Sokolov wrote:
>
> Oh, yes that's a clever idea. It seems it would take quite a while
> (tens of minutes?) for a larger index though? Much faster than the
> force-merge solution for sure. I guess to get faster we would have to
> instrument each format. I
On Mon, Jun 13, 2022 at 3:26 PM Nhat Nguyen
wrote:
>
> Hi Michael,
>
> We developed a similar functionality in Elasticsearch. The DiskUsage API
> estimates the storage of each field by iterating its structures (i.e.,
> inverted index, doc-values, stored fields, etc.) and tracking the number of
+1
On Mon, May 30, 2022 at 11:40 AM Tomoko Uchida
wrote:
>
> Hi everyone!
>
> As we had previous discussion thread [1], I propose migration to GitHub issue
> from Jira.
> It'd be technically possible (see [2] for details) and I think it'd be good
> for the project - not only for welcoming new
On Thu, May 26, 2022 at 11:49 AM Greg Miller wrote:
>
> I agree that technically it's just as good. I also think it's less
> clear for a user. The concept of "points" is something we've
> established in Lucene, so I think it makes sense for users to think
> about indexing points as a doc value as
On Wed, May 25, 2022 at 2:08 PM Greg Miller wrote:
>
>
> I guess with an “unsorted” numeric DV type we could get there with aligned
> indices, as you describe, but that seems less appealing than supporting
> multi-dim points directly.
>
Name one technical reason why?
Unsorted would be exactly
On Wed, May 25, 2022 at 12:17 AM Greg Miller wrote:
>
> A "two separate field approach" would
> consist of indexing year and make separately, and you'd lose the
> information that only certain combinations are valid. Am I overlooking
> something with your suggestion? Maybe there's something we
On Wed, May 25, 2022 at 8:04 AM Michael Sokolov wrote:
>
> Also, there should be examples from other fields. Suppose you are
> indexing map data and want to support a UI that shows "hot spots" on
> the map where there is a lot of let's say ... activity of some sort.
> You'd like to facet on 2-d
This seems a really exotic feature to add a dedicated docvalues field for.
We should let BINARY be the catchall for stuff like this.
On Mon, May 23, 2022 at 10:17 PM Marc D'Mello wrote:
>
> Hi,
>
> Some background: I've been working on this PR to add hyper rectangle faceting
> capabilities to
I opened an issue about this. It shouldn't block the release, but it is
pretty crazy and something to improve.
https://issues.apache.org/jira/browse/LUCENE-10579
On Wed, May 18, 2022 at 3:10 PM Robert Muir wrote:
>
> It seems strange the way that
> confirmAllReleasesAreTestedForBackCompat