Re: [Proposal] Remove max number of dimensions for KNN vectors

Alessandro Benedetti Wed, 12 Apr 2023 05:10:24 -0700

My attempt to list only a set of proposals here, to then vote on, has
unfortunately failed.


I appreciate the discussion on better benchmarking of HNSW, but my feeling is
that it is orthogonal to the limit discussion itself. Should we create a
separate mail thread or GitHub/Jira issue for that?

At the moment I see at least three lines of activity as an outcome of
this (maybe too long) discussion:

1) [small task] a good number of people need the max limit to be
increased or removed, as an enabler to bring more users to Lucene and to
ease adoption for Lucene-based systems (Apache Solr, Elasticsearch,
OpenSearch)

2) [medium task] we all want more benchmarks for Lucene vector-based
search, with a good variety of vector dimensions and encodings

3) [big task?] some people would like to improve vector-based search
performance because it is currently not acceptable; it's not clear when and how

A question I have for point 1: does it really need to be a one-way door?
Can't we reduce the max limit in the future if the implementation becomes
coupled with certain dimension sizes?
It's not ideal, I agree, but is back-compatibility more important than
pragmatic benefits?
For example:
Right now there's no implementation coupled with the max limit -> we
remove/increase the limit and get more users.

With Lucene X.Y a clever committer introduces a super nice implementation
improvement that unfortunately limits the max size to K.
Can't we just document it as a breaking change for that release? At that
point we won't support vectors with more than K dimensions, but for a reason.

Do we have similar precedents in Lucene?



On Wed, 12 Apr 2023, 08:36 Michael Wechner, <michael.wech...@wyona.com>
wrote:

> thank you very much for your feedback!
>
> In a previous post (April 7) you wrote you could make available the 47K
> ada-002 vectors, which would be great!
>
> Would it make sense to set up a public GitHub repo, such that others could
> use or also contribute vectors?
>
> Thanks
>
> Michael Wechner
>
>
> On 12.04.23 at 04:51, Kent Fitch wrote:
>
> I only know some characteristics of the openAI ada-002 vectors, although
> they are very popular as embeddings/text-characterisations because they allow
> more accurate/"human meaningful" semantic search results with fewer
> dimensions than their predecessors. I've evaluated a few different
> embedding models, including some BERT variants, CLIP ViT-L-14 (with 768
> dims, which was quite good), openAI's ada-001 (1024 dims) and babbage-001
> (2048 dims), and ada-002 is qualitatively the best, although that will
> certainly change!
>
> In any case, ada-002 vectors have interesting characteristics that I think
> mean you could confidently create synthetic vectors which would be hard to
> distinguish from "real" vectors.  I found this from looking at 47K ada-002
> vectors generated across a full year (1994) of newspaper articles from the
> Canberra Times and 200K wikipedia articles:
> - there is no discernible/significant correlation between values in any
> pair of dimensions
> - all but 5 of the 1536 dimensions have an almost identical distribution
> of values shown in the central blob on these graphs (that just show a few
> of these 1531 dimensions with clumped values and the 5 "outlier"
> dimensions, but all 1531 non-outlier dims are in there, which makes for
> some easy quantisation from float to byte if you don't want to go the full
> k-means/clustering/Lloyd's-algorithm approach):
>
> https://docs.google.com/spreadsheets/d/1DyyBCbirETZSUAEGcMK__mfbUNzsU_L48V9E0SyJYGg/edit?usp=sharing
>
> https://docs.google.com/spreadsheets/d/1czEAlzYdyKa6xraRLesXjNZvEzlj27TcDGiEFS1-MPs/edit?usp=sharing
>
> https://docs.google.com/spreadsheets/d/1RxTjV7Sj14etCNLk1GB-m44CXJVKdXaFlg2Y6yvj3z4/edit?usp=sharing
> - the variance of the value of each dimension is characteristic:
>
> https://docs.google.com/spreadsheets/d/1w5LnRUXt1cRzI9Qwm07LZ6UfszjMOgPaJot9cOGLHok/edit#gid=472178228
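The "easy quantisation from float to byte" mentioned in the list above could look like the sketch below: a plain linear (scalar) mapping with clipping. The function name and the value range [-0.1, 0.1] are assumptions made for illustration only; the thread does not state the actual range of the ada-002 values, and this is not a description of how Lucene itself handles byte vectors.

```python
def quantize_to_byte(vector, lo, hi):
    """Linearly map floats in [lo, hi] to signed bytes in [-128, 127].

    Values outside [lo, hi] (e.g. the few "outlier" dimensions) are clipped.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = 255.0 / (hi - lo)
    quantized = []
    for x in vector:
        clipped = min(max(x, lo), hi)
        quantized.append(int(round((clipped - lo) * scale)) - 128)
    return quantized


# Hypothetical range; in practice it would be estimated from the data.
sample = [-0.1, 0.0215, 0.1, 5.0]
codes = quantize_to_byte(sample, lo=-0.1, hi=0.1)
print(codes)
```

Because almost all dimensions share one distribution, a single (lo, hi) pair might be enough for them; the outlier dimensions would either be clipped, as here, or given their own range.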
>
> This probably represents something significant about how the ada-002
> embeddings are created, but I think it also means creating "realistic"
> values is possible.  I did not use this information when testing recall &
> performance on Lucene's HNSW implementation on 192m documents. Instead, I
> slightly dithered the values of a "real" set of 47K docs, stored other
> fields in each doc that referenced the "base" document that the dithers were
> made from, and used different dithering magnitudes so that I could test
> recall with different neighbour sizes ("M"), construction beam widths and
> search beam widths.
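A minimal sketch of the dithering approach described above, under stated assumptions: the noise is uniform per dimension (the message does not say which distribution was used), and the field names `base_id`, `magnitude` and `vector` are hypothetical.

```python
import random


def dither(base, magnitude, rng):
    """Copy `base`, adding uniform noise in [-magnitude, +magnitude] per dimension."""
    return [x + rng.uniform(-magnitude, magnitude) for x in base]


def make_dithered_docs(base_vectors, magnitudes, seed=42):
    """Create one dithered doc per (base vector, magnitude) pair.

    Each doc keeps a reference to the base vector it was derived from, so a
    recall test can check whether a query for the base finds its dithers.
    """
    rng = random.Random(seed)
    docs = []
    for base_id, base in enumerate(base_vectors):
        for magnitude in magnitudes:
            docs.append({
                "base_id": base_id,
                "magnitude": magnitude,
                "vector": dither(base, magnitude, rng),
            })
    return docs


bases = [[0.1, 0.2, 0.3], [-0.2, 0.0, 0.4]]
docs = make_dithered_docs(bases, magnitudes=[0.001, 0.01])
print(len(docs))
```

Smaller magnitudes yield near-duplicates of the base vector, larger ones yield more distant neighbours, which is what allows recall to be compared across different graph and search parameters.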
>
> best regards
>
> Kent Fitch
>
>
>
>
> On Wed, Apr 12, 2023 at 5:08 AM Michael Wechner <michael.wech...@wyona.com>
> wrote:
>
>> I understand what you mean when you say it seems artificial, but I don't
>> understand why this matters for testing the performance and scalability of
>> indexing?
>>
>> Let's assume the limit of Lucene would be 4 instead of 1024 and there
>> are only open source models generating vectors with 4 dimensions, for
>> example
>>
>>
>> 0.02150459587574005,0.11223817616701126,-0.007903356105089188,0.03795722872018814
>>
>>
>> 0.026009393855929375,0.006306684575974941,0.020492585375905037,-0.029064252972602844
>>
>>
>> -0.08239810913801193,-0.01947402022778988,0.03827739879488945,-0.020566290244460106
>>
>>
>> -0.007012288551777601,-0.026665858924388885,0.044495150446891785,-0.038030195981264114
>>
>> and now I concatenate them to vectors with 8 dimensions
>>
>>
>>
>> 0.02150459587574005,0.11223817616701126,-0.007903356105089188,0.03795722872018814,0.026009393855929375,0.006306684575974941,0.020492585375905037,-0.029064252972602844
>>
>>
>> -0.08239810913801193,-0.01947402022778988,0.03827739879488945,-0.020566290244460106,-0.007012288551777601,-0.026665858924388885,0.044495150446891785,-0.038030195981264114
>>
>> and normalize them to length 1.
>>
>> Why should this be any different to a model which is acting like a black
>> box generating vectors with 8 dimensions?
>>
>>
>>
>>
>> On 11.04.23 at 19:05, Michael Sokolov wrote:
>> >> What exactly do you consider real vector data? Vector data which is
>> based on texts written by humans?
>> > We have plenty of text; the problem is coming up with a realistic
>> > vector model that requires as many dimensions as people seem to be
>> > demanding. As I said above, after surveying huggingface I couldn't
>> > find any text-based model using more than 768 dimensions. So far we
>> > have some ideas of generating higher-dimensional data by dithering or
>> > concatenating existing data, but it seems artificial.
>> >
>> > On Tue, Apr 11, 2023 at 9:31 AM Michael Wechner
>> > <michael.wech...@wyona.com> wrote:
>> >> What exactly do you consider real vector data? Vector data which is
>> based on texts written by humans?
>> >>
>> >> I am asking, because I recently attended the following presentation by
>> Anastassia Shaitarova (UZH Institute for Computational Linguistics,
>> https://www.cl.uzh.ch/de/people/team/compling/shaitarova.html)
>> >>
>> >> ----
>> >>
>> >> Can we Identify Machine-Generated Text? An Overview of Current
>> Approaches
>> >> by Anastassia Shaitarova (UZH Institute for Computational Linguistics)
>> >>
>> >> The detection of machine-generated text has become increasingly
>> important due to the prevalence of automated content generation and its
>> potential for misuse. In this talk, we will discuss the motivation for
>> automatic detection of generated text. We will present the currently
>> available methods, including feature-based classification as a “first
>> line-of-defense.” We will provide an overview of the detection tools that
>> have been made available so far and discuss their limitations. Finally, we
>> will reflect on some open problems associated with the automatic
>> discrimination of generated texts.
>> >>
>> >> ----
>> >>
>> >> and her conclusion was that it has become basically impossible to
>> differentiate between text generated by humans and text generated by for
>> example ChatGPT.
>> >>
>> >> Whereas others have a slightly different opinion, see for example
>> >>
>> >> https://www.wired.com/story/how-to-spot-generative-ai-text-chatgpt/
>> >>
>> >> But I would argue that real world and synthetic have become close
>> enough that testing performance and scalability of indexing should be
>> possible with synthetic data.
>> >>
>> >> I completely agree that we have to base our discussions and decisions
>> on scientific methods and that we have to make sure that Lucene performs
>> and scales well and that we understand the limits and what is going on
>> under the hood.
>> >>
>> >> Thanks
>> >>
>> >> Michael W
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On 11.04.23 at 14:29, Michael McCandless wrote:
>> >>
>> >> +1 to test on real vector data -- if you test on synthetic data you
>> draw synthetic conclusions.
>> >>
>> >> Can someone post the theoretical performance (CPU and RAM required) of
>> HNSW construction?  Do we know/believe our HNSW implementation has achieved
>> that theoretical big-O performance?  Maybe we have some silly performance
>> bug that's causing it not to?
>> >>
>> >> As I understand it, HNSW makes the tradeoff of costly construction for
>> faster searching, which is typically the right tradeoff for search use
>> cases.  We do this in other parts of the Lucene index too.
>> >>
>> >> Lucene will do a logarithmic number of merges over time, i.e. each doc
>> will be merged O(log(N)) times in its lifetime in the index.  We need to
>> multiply that by the cost of re-building the whole HNSW graph on each
>> merge.  BTW, other things in Lucene, like BKD/dimensional points, also
>> rebuild the whole data structure on each merge, I think?  But, as Rob
>> pointed out, stored fields merging does indeed do some sneaky tricks to avoid
>> excessive block decompress/recompress on each merge.
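The "logarithmic number of merges times a full graph rebuild" argument above can be made concrete with a rough back-of-the-envelope model. This is a sketch under simplifying assumptions, not a description of Lucene's actual merge policy: segments are all flushed at the same size, every merge combines exactly `merge_factor` segments of one level, and every merge rebuilds the HNSW graph from scratch. All names and numbers are hypothetical.

```python
import math


def total_graph_insertions(num_docs, flush_size, merge_factor):
    """Estimate how many vector insertions into HNSW graphs indexing performs.

    Returns (insertions, merge_levels) for an idealized tiered merge scheme.
    """
    segments = math.ceil(num_docs / flush_size)
    levels = 0
    while segments > 1:
        segments = math.ceil(segments / merge_factor)
        levels += 1
    # One insertion per doc at flush time, plus one per doc per merge level.
    return num_docs * (1 + levels), levels


insertions, levels = total_graph_insertions(
    num_docs=1_000_000, flush_size=10_000, merge_factor=10)
print(insertions, levels)  # 3000000 2
```

In this model a larger flush size (a bigger RAM buffer) reduces the number of merge levels, which matches the trade-off discussed elsewhere in the thread.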
>> >>
>> >>> As I understand it, vetoes must have technical merit. I'm not sure
>> that this veto rises to "technical merit" on 2 counts:
>> >> Actually I think Robert's veto stands on its technical merit already.
>> Robert's take on technical matters very much resonates with me, even if he
>> is sometimes prickly in how he expresses them ;)
>> >>
>> >> His point is that we, as a dev community, are not paying enough
>> attention to the indexing performance of our KNN algo (HNSW) and
>> implementation, and that it is reckless to increase / remove limits in that
>> state.  It is indeed a one-way door decision and one must confront such
>> decisions with caution, especially for such a widely used base
>> infrastructure as Lucene.  We don't even advertise today in our javadocs
>> that you need XXX heap if you index vectors with dimension Y, fanout X,
>> levels Z, etc.
>> >>
>> >> RAM used during merging is unaffected by dimensionality, but is
>> affected by fanout, because the HNSW graph (not the raw vectors) is memory
>> resident, I think?  Maybe we could move it off-heap and let the OS manage
>> the memory (and still document the RAM requirements)?  Maybe merge RAM
>> costs should be accounted for in IW's RAM buffer accounting?  It is not
>> today, and there are some other things that use non-trivial RAM, e.g. the
>> doc mapping (to compress docid space when deletions are reclaimed).
>> >>
>> >> When we added KNN vector testing to Lucene's nightly benchmarks, the
>> indexing time massively increased -- see annotations DH and DP here:
>> https://home.apache.org/~mikemccand/lucenebench/indexing.html.  Nightly
>> benchmarks now start at 6 PM and don't finish until ~14.5 hours later.  Of
>> course, that is using a single thread for indexing (on a box that has 128
>> cores!) so we produce a deterministic index every night ...
>> >>
>> >> Stepping out (meta) a bit ... this discussion is precisely one of the
>> awesome benefits of the (informed) veto.  It means risky changes to the
>> software, as determined by any single informed developer on the project,
>> can force a healthy discussion about the problem at hand.  Robert is
>> legitimately concerned about a real issue and so we should use our creative
>> energies to characterize our HNSW implementation's performance, document it
>> clearly for users, and uncover ways to improve it.
>> >>
>> >> Mike McCandless
>> >>
>> >> http://blog.mikemccandless.com
>> >>
>> >>
>> >> On Mon, Apr 10, 2023 at 6:41 PM Alessandro Benedetti <
>> a.benede...@sease.io> wrote:
>> >>> I think Gus's points are on target.
>> >>>
>> >>> I recommend we move this forward in this way:
>> >>> We stop any discussion and everyone interested proposes an option
>> with a motivation, then we aggregate the options and we create a Vote maybe?
>> >>>
>> >>> I am also on the same page on the fact that a veto should come with a
>> clear and reasonable technical merit, which also in my opinion has not come
>> yet.
>> >>>
>> >>> I also apologise if any of my words sounded harsh or like personal
>> attacks; I never meant them that way.
>> >>>
>> >>> My proposed option:
>> >>>
>> >>> 1) remove the limit and potentially make it configurable,
>> >>> Motivation:
>> >>> The system administrator can enforce a limit that users need to
>> respect, in line with whatever the admin decides is acceptable
>> for them.
>> >>> Default can stay the current one.
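The "configurable limit with the current default" option above could be sketched like this. It is an illustration of the idea only, written in Python rather than against Lucene's Java API; the function and constant names are hypothetical, and 1024 is the current limit as mentioned in this thread.

```python
DEFAULT_MAX_DIMENSIONS = 1024  # current limit, per the thread


def check_dimension(dimension, max_dimensions=DEFAULT_MAX_DIMENSIONS):
    """Validate a vector dimension against a configurable limit.

    Passing max_dimensions=None disables the upper bound entirely.
    """
    if dimension <= 0:
        raise ValueError("vector dimension must be positive")
    if max_dimensions is not None and dimension > max_dimensions:
        raise ValueError(
            f"vector dimension {dimension} exceeds the configured limit "
            f"{max_dimensions}")
    return dimension


check_dimension(768)                        # fine with the default
check_dimension(1536, max_dimensions=2048)  # fine once the admin raises it
try:
    check_dimension(1536)                   # rejected with the default
except ValueError as err:
    print(err)
```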
>> >>>
>> >>> That's my favourite at the moment, but I agree that potentially in
>> the future this may need to change, as we may optimise the data structures
>> for certain dimensions. I am a big fan of YAGNI (you aren't going to need
>> it), so I am OK with facing a different discussion if that happens in the
>> future.
>> >>>
>> >>>
>> >>>
>> >>> On Sun, 9 Apr 2023, 18:46 Gus Heck, <gus.h...@gmail.com> wrote:
>> >>>> What I see so far:
>> >>>>
>> >>>> Much positive support for raising the limit
>> >>>> Slightly less support for removing it or making it configurable
>> >>>> A single veto which argues that a (as yet undefined) performance
>> standard must be met before raising the limit
>> >>>> Hot tempers (various) making this discussion difficult
>> >>>>
>> >>>> As I understand it, vetoes must have technical merit. I'm not sure
>> that this veto rises to "technical merit" on 2 counts:
>> >>>>
>> >>>> No standard for the performance is given so it cannot be technically
>> met. Without hard criteria it's a moving target.
>> >>>> It appears to encode a valuation of the user's time, and that
>> valuation is really up to the user. Some users may consider 2 hours useless
>> and not worth it, and others might happily wait 2 hours. This is not a
>> technical decision, it's a business decision regarding the relative value
>> of the time invested vs the value of the result. If I can cure cancer by
>> indexing for a year, that might be worth it... (hyperbole of course).
>> >>>>
>> >>>> Things I would consider to have technical merit that I don't hear:
>> >>>>
>> >>>> Impact on the speed of **other** indexing operations. (devaluation
>> of other functionality)
>> >>>> Actual scenarios that work when the limit is low and fail when the
>> limit is high (new failure on the same data with the limit raised).
>> >>>>
>> >>>> One thing that might or might not have technical merit
>> >>>>
>> >>>> If someone feels there is a lack of documentation of the
>> costs/performance implications of using large vectors, possibly including
>> reproducible benchmarks establishing the scaling behavior (there seems to
>> be disagreement on O(n) vs O(n^2)).
>> >>>>
>> >>>> The users *should* know what they are getting into, but if the cost
>> is worth it to them, they should be able to pay it without forking the
>> project. If this veto causes a fork that's not good.
>> >>>>
>> >>>> On Sun, Apr 9, 2023 at 7:55 AM Michael Sokolov <msoko...@gmail.com>
>> wrote:
>> >>>>> We do have a dataset built from Wikipedia in luceneutil. It comes
>> in 100 and 300 dimensional varieties and can easily enough generate large
>> numbers of vector documents from the articles data. To go higher we could
>> concatenate vectors from that and I believe the performance numbers would
>> be plausible.
>> >>>>>
>> >>>>> On Sun, Apr 9, 2023, 1:32 AM Dawid Weiss <dawid.we...@gmail.com>
>> wrote:
>> >>>>>> Can we set up a branch in which the limit is bumped to 2048, then
>> have
>> >>>>>> a realistic, free data set (wikipedia sample or something) that
>> has,
>> >>>>>> say, 5 million docs and vectors created using public data (glove
>> >>>>>> pre-trained embeddings or the like)? We then could run indexing on
>> the
>> >>>>>> same hardware with 512, 1024 and 2048 and see what the numbers,
>> limits
>> >>>>>> and behavior actually are.
>> >>>>>>
>> >>>>>> I can help in writing this but not until after Easter.
>> >>>>>>
>> >>>>>>
>> >>>>>> Dawid
>> >>>>>>
>> >>>>>> On Sat, Apr 8, 2023 at 11:29 PM Adrien Grand <jpou...@gmail.com>
>> wrote:
>> >>>>>>> As Dawid pointed out earlier on this thread, this is the rule for
>> >>>>>>> Apache projects: a single -1 vote on a code change is a veto and
>> >>>>>>> cannot be overridden. Furthermore, Robert is one of the people on
>> this
>> >>>>>>> project who worked the most on debugging subtle bugs, making
>> Lucene
>> >>>>>>> more robust and improving our test framework, so I'm listening
>> when he
>> >>>>>>> voices quality concerns.
>> >>>>>>>
>> >>>>>>> The argument against removing/raising the limit that resonates
>> with me
>> >>>>>>> the most is that it is a one-way door. As MikeS highlighted
>> earlier on
>> >>>>>>> this thread, implementations may want to take advantage of the
>> fact
>> >>>>>>> that there is a limit at some point too. This is why I don't want
>> to
>> >>>>>>> remove the limit and would prefer a slight increase, such as 2048
>> as
>> >>>>>>> suggested in the original issue, which would enable most of the
>> things
>> >>>>>>> that users who have been asking about raising the limit would
>> like to
>> >>>>>>> do.
>> >>>>>>>
>> >>>>>>> I agree that the merge-time memory usage and slow indexing rate
>> are
>> >>>>>>> not great. But it's still possible to index multi-million vector
>> >>>>>>> datasets with a 4GB heap without hitting OOMEs regardless of the
>> >>>>>>> number of dimensions, and the feedback I'm seeing is that many
>> users
>> >>>>>>> are still interested in indexing multi-million vector datasets
>> despite
>> >>>>>>> the slow indexing rate. I wish we could do better, and vector
>> indexing
>> >>>>>>> is certainly more expert than text indexing, but it still is
>> usable in
>> >>>>>>> my opinion. I understand how giving Lucene more information about
>> >>>>>>> vectors prior to indexing (e.g. clustering information as Jim
>> pointed
>> >>>>>>> out) could help make merging faster and more memory-efficient,
>> but I
>> >>>>>>> would really like to avoid making it a requirement for indexing
>> >>>>>>> vectors as it also makes this feature much harder to use.
>> >>>>>>>
>> >>>>>>> On Sat, Apr 8, 2023 at 9:28 PM Alessandro Benedetti
>> >>>>>>> <a.benede...@sease.io> wrote:
>> >>>>>>>> I listen to opinions attentively, but I am unconvinced
>> here, and I am not sure that a single person's opinion should be allowed to be
>> detrimental to such an important project.
>> >>>>>>>>
>> >>>>>>>> The limit as far as I know is literally just raising an
>> exception.
>> >>>>>>>> Removing it won't alter in any way the current performance for
>> users in low dimensional space.
>> >>>>>>>> Removing it will just enable more users to use Lucene.
>> >>>>>>>>
>> >>>>>>>> If new users in certain situations will be unhappy with the
>> performance, they may contribute improvements.
>> >>>>>>>> This is how you make progress.
>> >>>>>>>>
>> >>>>>>>> If it's a reputation thing, trust me that not allowing users to
>> play with high dimensional space will equally damage it.
>> >>>>>>>>
>> >>>>>>>> To me it's really a no brainer.
>> >>>>>>>> Removing the limit and enabling people to use high dimensional
>> vectors will take minutes.
>> >>>>>>>> Improving the hnsw implementation can take months.
>> >>>>>>>> Pick one to begin with...
>> >>>>>>>>
>> >>>>>>>> And there's no-one paying me here, no company interest
>> whatsoever, actually I pay people to contribute, I am just convinced it's a
>> good idea.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Sat, 8 Apr 2023, 18:57 Robert Muir, <rcm...@gmail.com> wrote:
>> >>>>>>>>> I disagree with your categorization. I put in plenty of work and
>> >>>>>>>>> experienced plenty of pain myself, writing tests and fighting
>> these
>> >>>>>>>>> issues, after i saw that, two releases in a row, vector
>> indexing fell
>> >>>>>>>>> over and hit integer overflows etc on small datasets:
>> >>>>>>>>>
>> >>>>>>>>> https://github.com/apache/lucene/pull/11905
>> >>>>>>>>>
>> >>>>>>>>> Attacking me isn't helping the situation.
>> >>>>>>>>>
>> >>>>>>>>> PS: when i said the "one guy who wrote the code" I didn't mean
>> it in
>> >>>>>>>>> any kind of demeaning fashion really. I meant to describe the
>> current
>> >>>>>>>>> state of usability with respect to indexing a few million docs
>> with
>> >>>>>>>>> high dimensions. You can scroll up the thread and see that at
>> least
>> >>>>>>>>> one other committer on the project experienced similar pain as
>> me.
>> >>>>>>>>> Then, think about users who aren't committers trying to use the
>> >>>>>>>>> functionality!
>> >>>>>>>>>
>> >>>>>>>>> On Sat, Apr 8, 2023 at 12:51 PM Michael Sokolov <
>> msoko...@gmail.com> wrote:
>> >>>>>>>>>> What you said about increasing dimensions requiring a bigger
>> ram buffer on merge is wrong. That's the point I was trying to make. Your
>> concerns about merge costs are not wrong, but your conclusion that we need
>> to limit dimensions is not justified.
>> >>>>>>>>>>
>> >>>>>>>>>> You complain that hnsw sucks and doesn't scale, but when I show
>> it scales linearly with dimension you just ignore that and complain about
>> something entirely different.
>> >>>>>>>>>>
>> >>>>>>>>>> You demand that people run all kinds of tests to prove you
>> wrong but when they do, you don't listen and you won't put in the work
>> yourself or complain that it's too hard.
>> >>>>>>>>>>
>> >>>>>>>>>> Then you complain about people not meeting you half way. Wow
>> >>>>>>>>>>
>> >>>>>>>>>> On Sat, Apr 8, 2023, 12:40 PM Robert Muir <rcm...@gmail.com>
>> wrote:
>> >>>>>>>>>>> On Sat, Apr 8, 2023 at 8:33 AM Michael Wechner
>> >>>>>>>>>>> <michael.wech...@wyona.com> wrote:
>> >>>>>>>>>>>> What exactly do you consider reasonable?
>> >>>>>>>>>>> Let's begin a real discussion by being HONEST about the
>> current
>> >>>>>>>>>>> status. Please put political correctness or your own company's
>> wishes
>> >>>>>>>>>>> aside, we know it's not in a good state.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Current status is the one guy who wrote the code can set a
>> >>>>>>>>>>> multi-gigabyte ram buffer and index a small dataset with 1024
>> >>>>>>>>>>> dimensions in HOURS (i didn't ask what hardware).
>> >>>>>>>>>>>
>> >>>>>>>>>>> My concern is everyone else except the one guy; I want it
>> to be
>> >>>>>>>>>>> usable. Increasing dimensions just means even bigger
>> multi-gigabyte
>> >>>>>>>>>>> ram buffer and bigger heap to avoid OOM on merge.
>> >>>>>>>>>>> It is also a permanent backwards compatibility decision, we
>> have to
>> >>>>>>>>>>> support it once we do this and we can't just say "oops" and
>> flip it
>> >>>>>>>>>>> back.
>> >>>>>>>>>>>
>> >>>>>>>>>>> It is unclear to me, if the multi-gigabyte ram buffer is
>> really to
>> >>>>>>>>>>> avoid merges because they are so slow and it would be DAYS
>> otherwise,
>> >>>>>>>>>>> or if it's to avoid merges so it doesn't hit OOM.
>> >>>>>>>>>>> Also from personal experience, it takes trial and error (means
>> >>>>>>>>>>> experiencing OOM on merge!!!) before you get those heap
>> values correct
>> >>>>>>>>>>> for your dataset. This usually means starting over which is
>> >>>>>>>>>>> frustrating and wastes more time.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Jim mentioned some ideas about the memory usage in
>> IndexWriter, seems
>> >>>>>>>>>>> to me like it's a good idea. Maybe the multi-gigabyte ram
>> buffer can be
>> >>>>>>>>>>> avoided in this way and performance improved by writing bigger
>> >>>>>>>>>>> segments with lucene's defaults. But this doesn't mean we can
>> simply
>> >>>>>>>>>>> ignore the horrors of what happens on merge. merging needs to
>> scale so
>> >>>>>>>>>>> that indexing really scales.
>> >>>>>>>>>>>
>> >>>>>>>>>>> At least it shouldn't spike RAM on trivial data amounts and
>> cause OOM,
>> >>>>>>>>>>> and definitely it shouldn't burn hours and hours of CPU in
>> O(n^2)
>> >>>>>>>>>>> fashion when indexing.
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> ---------------------------------------------------------------------
>> >>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> >>>>>>>>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>> >>>>>>>>>>>
>> >>>>>>>>>
>> ---------------------------------------------------------------------
>> >>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> >>>>>>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Adrien
>> >>>>>>>
>> >>>>>>>
>> ---------------------------------------------------------------------
>> >>>>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> >>>>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>> >>>>>>>
>> >>>>>>
>> ---------------------------------------------------------------------
>> >>>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> >>>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>> >>>>>>
>> >>>>
>> >>>> --
>> >>>> http://www.needhamsoftware.com (work)
>> >>>> http://www.the111shift.com (play)
>> >>
>> >
>>
>>
>>
>>
>
