Re: [Proposal] Remove max number of dimensions for KNN vectors

Michael Wechner Fri, 21 Apr 2023 00:31:00 -0700

Yes, they are. Still, they should help us test performance and scalability :-)


On 21.04.23 at 09:24, Ishan Chattopadhyaya wrote:

Seems like they were all 768 dimensions.

On Fri, 21 Apr, 2023, 11:48 am Michael Wechner, <michael.wech...@wyona.com> wrote:


    Hi all,

    Cohere just published approx. 100 million embeddings based on
    Wikipedia content:

    https://txt.cohere.com/embedding-archives-wikipedia/

    and the datasets themselves:

    https://huggingface.co/datasets/Cohere/wikipedia-22-12-en-embeddings
    https://huggingface.co/datasets/Cohere/wikipedia-22-12-de-embeddings
    ....

    HTH

    Michael



    On 13.04.23 at 07:58, Michael Wechner wrote:

    Hi Kent

    Great, thank you very much!

    Will download it later today :-)

    All the best

    Michael

    On 13.04.23 at 01:35, Kent Fitch wrote:

    Hi Michael (and anyone else who wants just over 240K "real
    world" ada-002 vectors of dimension 1536),
    you are welcome to retrieve a tar.gz file which contains:
    - 47K embeddings of Canberra Times news article text from 1994
    - 38K embeddings of the first paragraphs of Wikipedia articles
    about organisations
    - 156.6K embeddings of the first paragraphs of Wikipedia
    articles about people

    
    https://drive.google.com/file/d/13JP_5u7E8oZO6vRg0ekaTgBDQOaj-W00/view?usp=sharing

    The file is about 1.7GB and will expand to about 4.4GB. This
    file will be accessible for at least a week, and I hope you don't
    hit any Google Drive download limits trying to retrieve it.

    The embeddings were generated using my OpenAI account and you
    are welcome to use them for any purpose you like.

    best wishes,

    Kent Fitch

    On Wed, Apr 12, 2023 at 4:37 PM Michael Wechner
    <michael.wech...@wyona.com> wrote:

        thank you very much for your feedback!

        In a previous post (April 7) you wrote that you could make
        the 47K ada-002 vectors available, which would be great!

        Would it make sense to set up a public GitHub repo, so that
        others could use or also contribute vectors?

        Thanks

        Michael Wechner


        On 12.04.23 at 04:51, Kent Fitch wrote:

        I only know some characteristics of the OpenAI ada-002
        vectors, although they are very popular as
        embeddings/text-characterisations because they allow more
        accurate/"human meaningful" semantic search results with
        fewer dimensions than their predecessors. I've evaluated a
        few different embedding models, including some BERT
        variants, CLIP ViT-L-14 (with 768 dims, which was quite
        good), OpenAI's ada-001 (1024 dims) and babbage-001 (2048
        dims), and ada-002 is qualitatively the best, although
        that will certainly change!

        In any case, ada-002 vectors have interesting
        characteristics that I think mean you could confidently
        create synthetic vectors which would be hard to distinguish
        from "real" vectors.  I found this from looking at 47K
        ada-002 vectors generated across a full year (1994) of
        newspaper articles from the Canberra Times and 200K
        Wikipedia articles:
        - there is no discernible/significant correlation between
        values in any pair of dimensions
        - all but 5 of the 1536 dimensions have an almost identical
        distribution of values, shown in the central blob on these
        graphs. The graphs show only a few of these 1531 dimensions
        with clumped values plus the 5 "outlier" dimensions, but all
        1531 non-outlier dims are in there, which makes for some easy
        quantisation from float to byte if you don't want to go the
        full kmeans/clustering/Lloyd's-algorithm approach:
        https://docs.google.com/spreadsheets/d/1DyyBCbirETZSUAEGcMK__mfbUNzsU_L48V9E0SyJYGg/edit?usp=sharing
        https://docs.google.com/spreadsheets/d/1czEAlzYdyKa6xraRLesXjNZvEzlj27TcDGiEFS1-MPs/edit?usp=sharing
        https://docs.google.com/spreadsheets/d/1RxTjV7Sj14etCNLk1GB-m44CXJVKdXaFlg2Y6yvj3z4/edit?usp=sharing
        - the variance of the value of each dimension is
        characteristic:
        https://docs.google.com/spreadsheets/d/1w5LnRUXt1cRzI9Qwm07LZ6UfszjMOgPaJot9cOGLHok/edit#gid=472178228
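
The "easy quantisation from float to byte" mentioned in the list above could look roughly like the following Python sketch. It assumes plain linear scalar quantisation with clipping. The function name and the clipping range `lo`/`hi` are hypothetical and are not taken from the spreadsheets; in practice the range would be read off the observed distribution, and the 5 outlier dimensions might need a range of their own.

```python
def quantize_to_int8(vector, lo, hi):
    """Linearly map floats in [lo, hi] to signed bytes in [-128, 127].

    Values outside [lo, hi] are clipped to the nearest bound first.
    lo and hi are assumed inputs, e.g. estimated from sample vectors.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = 255.0 / (hi - lo)  # 256 byte values span the range [lo, hi]
    quantized = []
    for x in vector:
        clipped = min(max(x, lo), hi)
        quantized.append(int(round((clipped - lo) * scale)) - 128)
    return quantized


# Made-up sample values; the last one imitates an "outlier" that gets clipped.
sample = [-0.1, -0.02, 0.03, 0.1, 5.0]
codes = quantize_to_int8(sample, lo=-0.1, hi=0.1)
print(codes)
```

Because nearly all dimensions share one distribution, a single `lo`/`hi` pair could plausibly serve all of them, which is what makes this cheaper than a clustering-based approach.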

        This probably represents something significant about how
        the ada-002 embeddings are created, but I think it also
        means creating "realistic" values is possible.  I did not
        use this information when testing recall & performance of
        Lucene's HNSW implementation on 192M documents. Instead, I
        slightly dithered the values of a "real" set of 47K docs,
        stored other fields in each doc that referenced the
        "base" document the dithers were made from, and used
        different dithering magnitudes so that I could test recall
        with different neighbour sizes ("M"), construction
        beam widths and search beam widths.
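
A rough Python sketch of the dithering setup described above, for illustration only. The noise distribution is not stated in the message, so uniform noise per dimension is an assumption here, and the names `dither`, `make_dithered_docs` and the `base_id` field are hypothetical.

```python
import random


def dither(vector, magnitude, rng):
    """Return a copy of vector with uniform noise in [-magnitude, magnitude]
    added to every dimension (the noise model is an assumption)."""
    return [x + rng.uniform(-magnitude, magnitude) for x in vector]


def make_dithered_docs(base_vectors, copies_per_base, magnitude, seed=0):
    """Expand a small "real" set into many documents.

    Each document records which base vector it was derived from, so that
    recall can later be checked against the known base document.
    """
    rng = random.Random(seed)  # fixed seed keeps the data set reproducible
    docs = []
    for base_id, vec in enumerate(base_vectors):
        for _ in range(copies_per_base):
            docs.append({"base_id": base_id,
                         "vector": dither(vec, magnitude, rng)})
    return docs


# Tiny made-up base set; real vectors would have 1536 dimensions.
base = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.5]]
docs = make_dithered_docs(base, copies_per_base=3, magnitude=0.01)
print(len(docs))
```

Running the generator with several `magnitude` values gives data sets whose near neighbours are more or less tightly grouped around each base vector.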

        best regards

        Kent Fitch




        On Wed, Apr 12, 2023 at 5:08 AM Michael Wechner
        <michael.wech...@wyona.com> wrote:

            I understand what you mean when you say it seems
            artificial, but I don't understand why this matters for
            testing the performance and scalability of indexing.

            Let's assume the limit of Lucene were 4 instead of
            1024 and there were only open source models generating
            vectors with 4 dimensions, for example

            
            0.02150459587574005,0.11223817616701126,-0.007903356105089188,0.03795722872018814

            0.026009393855929375,0.006306684575974941,0.020492585375905037,-0.029064252972602844

            -0.08239810913801193,-0.01947402022778988,0.03827739879488945,-0.020566290244460106

            -0.007012288551777601,-0.026665858924388885,0.044495150446891785,-0.038030195981264114

            and now I concatenate them to vectors with 8 dimensions


            
            0.02150459587574005,0.11223817616701126,-0.007903356105089188,0.03795722872018814,0.026009393855929375,0.006306684575974941,0.020492585375905037,-0.029064252972602844

            -0.08239810913801193,-0.01947402022778988,0.03827739879488945,-0.020566290244460106,-0.007012288551777601,-0.026665858924388885,0.044495150446891785,-0.038030195981264114

            and normalize them to length 1.

            Why should this be any different to a model which is
            acting like a black
            box generating vectors with 8 dimensions?
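
A minimal Python sketch of the concatenate-then-normalize step described above, using the first two example vectors. The function name `concat_and_normalize` is hypothetical, and "length 1" is assumed to mean unit L2 (Euclidean) norm.

```python
import math


def concat_and_normalize(a, b):
    """Concatenate two vectors and scale the result to unit L2 length."""
    joined = list(a) + list(b)
    norm = math.sqrt(sum(x * x for x in joined))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in joined]


# The first two 4-dimensional example vectors from the message.
v1 = [0.02150459587574005, 0.11223817616701126,
      -0.007903356105089188, 0.03795722872018814]
v2 = [0.026009393855929375, 0.006306684575974941,
      0.020492585375905037, -0.029064252972602844]

v8 = concat_and_normalize(v1, v2)
print(len(v8), math.sqrt(sum(x * x for x in v8)))
```

The same function applies unchanged to higher dimensions, for example joining two 768-dimensional vectors into one 1536-dimensional test vector.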




            On 11.04.23 at 19:05, Michael Sokolov wrote:
            >> What exactly do you consider real vector data?
            Vector data which is based on texts written by humans?
            > We have plenty of text; the problem is coming up with
            a realistic
            > vector model that requires as many dimensions as
            people seem to be
            > demanding. As I said above, after surveying
            huggingface I couldn't
            > find any text-based model using more than 768
            dimensions. So far we
            > have some ideas of generating higher-dimensional data
            by dithering or
            > concatenating existing data, but it seems artificial.
            >
            > On Tue, Apr 11, 2023 at 9:31 AM Michael Wechner
            > <michael.wech...@wyona.com> wrote:
            >> What exactly do you consider real vector data?
            Vector data which is based on texts written by humans?
            >>
            >> I am asking, because I recently attended the
            following presentation by Anastassia Shaitarova (UZH
            Institute for Computational Linguistics,
            https://www.cl.uzh.ch/de/people/team/compling/shaitarova.html)
            >>
            >> ----
            >>
            >> Can we Identify Machine-Generated Text? An Overview
            of Current Approaches
            >> by Anastassia Shaitarova (UZH Institute for
            Computational Linguistics)
            >>
            >> The detection of machine-generated text has become
            increasingly important due to the prevalence of
            automated content generation and its potential for
            misuse. In this talk, we will discuss the motivation
            for automatic detection of generated text. We will
            present the currently available methods, including
            feature-based classification as a “first
            line-of-defense.” We will provide an overview of the
            detection tools that have been made available so far
            and discuss their limitations. Finally, we will reflect
            on some open problems associated with the automatic
            discrimination of generated texts.
            >>
            >> ----
            >>
            >> and her conclusion was that it has become basically
            impossible to differentiate between text generated by
            humans and text generated by for example ChatGPT.
            >>
            >> Whereas others have a slightly different opinion,
            see for example
            >>
            >>
            https://www.wired.com/story/how-to-spot-generative-ai-text-chatgpt/
            >>
            >> But I would argue that real world and synthetic have
            become close enough that testing performance and
            scalability of indexing should be possible with
            synthetic data.
            >>
            >> I completely agree that we have to base our
            discussions and decisions on scientific methods and
            that we have to make sure that Lucene performs and
            scales well and that we understand the limits and what
            is going on under the hood.
            >>
            >> Thanks
            >>
            >> Michael W
            >>
            >>
            >>
            >>
            >>
            >> On 11.04.23 at 14:29, Michael McCandless wrote:
            >>
            >> +1 to test on real vector data -- if you test on
            synthetic data you draw synthetic conclusions.
            >>
            >> Can someone post the theoretical performance (CPU
            and RAM required) of HNSW construction?  Do we
            know/believe our HNSW implementation has achieved that
            theoretical big-O performance?  Maybe we have some
            silly performance bug that's causing it not to?
            >>
            >> As I understand it, HNSW makes the tradeoff of
            costly construction for faster searching, which is
            typically the right tradeoff for search use cases.  We
            do this in other parts of the Lucene index too.
            >>
            >> Lucene will do a logarithmic number of merges over
            time, i.e. each doc will be merged O(log(N)) times in
            its lifetime in the index.  We need to multiply that by
            the cost of re-building the whole HNSW graph on each
            merge.  BTW, other things in Lucene, like
            BKD/dimensional points, also rebuild the whole data
            structure on each merge, I think?  But, as Rob pointed
            out, stored fields merging does indeed do some sneaky
            tricks to avoid excessive block decompress/recompress
            on each merge.
            >>
            >>> As I understand it, vetoes must have technical
            merit. I'm not sure that this veto rises to "technical
            merit" on 2 counts:
            >> Actually I think Robert's veto stands on its
            technical merit already. Robert's take on technical
            matters very much resonates with me, even if he is
            sometimes prickly in how he expresses them ;)
            >>
            >> His point is that we, as a dev community, are not
            paying enough attention to the indexing performance of
            our KNN algo (HNSW) and implementation, and that it is
            reckless to increase / remove limits in that state.  It
            is indeed a one-way door decision and one must confront
            such decisions with caution, especially for such a
            widely used base infrastructure as Lucene.  We don't
            even advertise today in our javadocs that you need XXX
            heap if you index vectors with dimension Y, fanout X,
            levels Z, etc.
            >>
            >> RAM used during merging is unaffected by
            dimensionality, but is affected by fanout, because the
            HNSW graph (not the raw vectors) is memory resident, I
            think?  Maybe we could move it off-heap and let the OS
            manage the memory (and still document the RAM
            requirements)?  Maybe merge RAM costs should be
            accounted for in IW's RAM buffer accounting?  It is not
            today, and there are some other things that use
            non-trivial RAM, e.g. the doc mapping (to compress
            docid space when deletions are reclaimed).
            >>
            >> When we added KNN vector testing to Lucene's nightly
            benchmarks, the indexing time massively increased --
            see annotations DH and DP here:
            https://home.apache.org/~mikemccand/lucenebench/indexing.html.
            Nightly benchmarks now start at 6 PM and don't finish
            until ~14.5 hours later.  Of course, that is using a
            single thread for indexing (on a box that has 128
            cores!) so we produce a deterministic index every night ...
            >>
            >> Stepping out (meta) a bit ... this discussion is
            precisely one of the awesome benefits of the (informed)
            veto.  It means risky changes to the software, as
            determined by any single informed developer on the
            project, can force a healthy discussion about the
            problem at hand.  Robert is legitimately concerned
            about a real issue and so we should use our creative
            energies to characterize our HNSW implementation's
            performance, document it clearly for users, and uncover
            ways to improve it.
            >>
            >> Mike McCandless
            >>
            >> http://blog.mikemccandless.com
            >>
            >>
            >> On Mon, Apr 10, 2023 at 6:41 PM Alessandro Benedetti
            <a.benede...@sease.io> wrote:
            >>> I think Gus's points are on target.
            >>>
            >>> I recommend we move this forward in this way:
            >>> We stop any discussion and everyone interested
            proposes an option with a motivation, then we aggregate
            the options and we create a Vote maybe?
            >>>
            >>> I also agree that a veto
            should come with clear and reasonable technical
            merit, which in my opinion has not been shown yet.
            >>>
            >>> I also apologise if any of my words sounded harsh
            or like personal attacks; I never meant them that way.
            >>>
            >>> My proposed option:
            >>>
            >>> 1) remove the limit and potentially make it
            configurable,
            >>> Motivation:
            >>> The system administrator can enforce a limit that
            users must respect, in line with whatever
            the admin has decided is acceptable for them.
            >>> Default can stay the current one.
            >>>
            >>> That's my favourite at the moment, but I agree that
            potentially in the future this may need to change, as
            we may optimise the data structures for certain
            dimensions. I am a big fan of YAGNI (you aren't going
            to need it), so I am OK with facing a different
            discussion if that happens in the future.
            >>>
            >>>
            >>>
            >>> On Sun, 9 Apr 2023, 18:46 Gus Heck,
            <gus.h...@gmail.com> wrote:
            >>>> What I see so far:
            >>>>
            >>>> Much positive support for raising the limit
            >>>> Slightly less support for removing it or making it
            configurable
            >>>> A single veto which argues that a (as yet
            undefined) performance standard must be met before
            raising the limit
            >>>> Hot tempers (various) making this discussion difficult
            >>>>
            >>>> As I understand it, vetoes must have technical
            merit. I'm not sure that this veto rises to "technical
            merit" on 2 counts:
            >>>>
            >>>> No standard for the performance is given so it
            cannot be technically met. Without hard criteria it's a
            moving target.
            >>>> It appears to encode a valuation of the user's
            time, and that valuation is really up to the user. Some
            users may consider 2 hours useless and not worth it, and
            others might happily wait 2 hours. This is not a
            technical decision, it's a business decision regarding
            the relative value of the time invested vs the value of
            the result. If I can cure cancer by indexing for a
            year, that might be worth it... (hyperbole of course).
            >>>>
            >>>> Things I would consider to have technical merit
            that I don't hear:
            >>>>
            >>>> Impact on the speed of **other** indexing
            operations. (devaluation of other functionality)
            >>>> Actual scenarios that work when the limit is low
            and fail when the limit is high (new failure on the
            same data with the limit raised).
            >>>>
            >>>> One thing that might or might not have technical merit
            >>>>
            >>>> If someone feels there is a lack of documentation
            of the costs/performance implications of using large
            vectors, possibly including reproducible benchmarks
            establishing the scaling behavior (there seems to be
            disagreement on O(n) vs O(n^2)).
            >>>>
            >>>> The users *should* know what they are getting
            into, but if the cost is worth it to them, they should
            be able to pay it without forking the project. If this
            veto causes a fork that's not good.
            >>>>
            >>>> On Sun, Apr 9, 2023 at 7:55 AM Michael Sokolov
            <msoko...@gmail.com> wrote:
            >>>>> We do have a dataset built from Wikipedia in
            luceneutil. It comes in 100 and 300 dimensional
            varieties and can easily enough generate large numbers
            of vector documents from the articles data. To go
            higher we could concatenate vectors from that and I
            believe the performance numbers would be plausible.
            >>>>>
            >>>>> On Sun, Apr 9, 2023, 1:32 AM Dawid Weiss
            <dawid.we...@gmail.com> wrote:
            >>>>>> Can we set up a branch in which the limit is
            bumped to 2048, then have
            >>>>>> a realistic, free data set (wikipedia sample or
            something) that has,
            >>>>>> say, 5 million docs and vectors created using
            public data (glove
            >>>>>> pre-trained embeddings or the like)? We then
            could run indexing on the
            >>>>>> same hardware with 512, 1024 and 2048 and see
            what the numbers, limits
            >>>>>> and behavior actually are.
            >>>>>>
            >>>>>> I can help in writing this but not until after
            Easter.
            >>>>>>
            >>>>>>
            >>>>>> Dawid
            >>>>>>
            >>>>>> On Sat, Apr 8, 2023 at 11:29 PM Adrien Grand
            <jpou...@gmail.com> wrote:
            >>>>>>> As Dawid pointed out earlier on this thread,
            this is the rule for
            >>>>>>> Apache projects: a single -1 vote on a code
            change is a veto and
            >>>>>>> cannot be overridden. Furthermore, Robert is
            one of the people on this
            >>>>>>> project who worked the most on debugging subtle
            bugs, making Lucene
            >>>>>>> more robust and improving our test framework,
            so I'm listening when he
            >>>>>>> voices quality concerns.
            >>>>>>>
            >>>>>>> The argument against removing/raising the limit
            that resonates with me
            >>>>>>> the most is that it is a one-way door. As MikeS
            highlighted earlier on
            >>>>>>> this thread, implementations may want to take
            advantage of the fact
            >>>>>>> that there is a limit at some point too. This
            is why I don't want to
            >>>>>>> remove the limit and would prefer a slight
            increase, such as 2048 as
            >>>>>>> suggested in the original issue, which would
            enable most of the things
            >>>>>>> that users who have been asking about raising
            the limit would like to
            >>>>>>> do.
            >>>>>>>
            >>>>>>> I agree that the merge-time memory usage and
            slow indexing rate are
            >>>>>>> not great. But it's still possible to index
            multi-million vector
            >>>>>>> datasets with a 4GB heap without hitting OOMEs
            regardless of the
            >>>>>>> number of dimensions, and the feedback I'm
            seeing is that many users
            >>>>>>> are still interested in indexing multi-million
            vector datasets despite
            >>>>>>> the slow indexing rate. I wish we could do
            better, and vector indexing
            >>>>>>> is certainly more expert than text indexing,
            but it still is usable in
            >>>>>>> my opinion. I understand how giving Lucene more
            information about
            >>>>>>> vectors prior to indexing (e.g. clustering
            information as Jim pointed
            >>>>>>> out) could help make merging faster and more
            memory-efficient, but I
            >>>>>>> would really like to avoid making it a
            requirement for indexing
            >>>>>>> vectors as it also makes this feature much
            harder to use.
            >>>>>>>
            >>>>>>> On Sat, Apr 8, 2023 at 9:28 PM Alessandro Benedetti
            >>>>>>> <a.benede...@sease.io> wrote:
            >>>>>>>> I listen attentively to opinions, but I
            am unconvinced here, and I am not sure that a single
            person's opinion should be allowed to be detrimental to
            such an important project.
            >>>>>>>>
            >>>>>>>> The limit as far as I know is literally just
            raising an exception.
            >>>>>>>> Removing it won't alter in any way the current
            performance for users in low dimensional space.
            >>>>>>>> Removing it will just enable more users to use
            Lucene.
            >>>>>>>>
            >>>>>>>> If new users in certain situations will be
            unhappy with the performance, they may contribute
            improvements.
            >>>>>>>> This is how you make progress.
            >>>>>>>>
            >>>>>>>> If it's a reputation thing, trust me that not
            allowing users to play with high dimensional space will
            equally damage it.
            >>>>>>>>
            >>>>>>>> To me it's really a no-brainer.
            >>>>>>>> Removing the limit and enabling people to use
            high dimensional vectors will take minutes.
            >>>>>>>> Improving the hnsw implementation can take months.
            >>>>>>>> Pick one to begin with...
            >>>>>>>>
            >>>>>>>> And there's no-one paying me here, no company
            interest whatsoever, actually I pay people to
            contribute, I am just convinced it's a good idea.
            >>>>>>>>
            >>>>>>>>
            >>>>>>>> On Sat, 8 Apr 2023, 18:57 Robert Muir,
            <rcm...@gmail.com> wrote:
            >>>>>>>>> I disagree with your categorization. I put in
            plenty of work and
            >>>>>>>>> experienced plenty of pain myself, writing
            tests and fighting these
            >>>>>>>>> issues, after i saw that, two releases in a
            row, vector indexing fell
            >>>>>>>>> over and hit integer overflows etc on small
            datasets:
            >>>>>>>>>
            >>>>>>>>> https://github.com/apache/lucene/pull/11905
            >>>>>>>>>
            >>>>>>>>> Attacking me isn't helping the situation.
            >>>>>>>>>
            >>>>>>>>> PS: when i said the "one guy who wrote the
            code" I didn't mean it in
            >>>>>>>>> any kind of demeaning fashion really. I meant
            to describe the current
            >>>>>>>>> state of usability with respect to indexing a
            few million docs with
            >>>>>>>>> high dimensions. You can scroll up the thread
            and see that at least
            >>>>>>>>> one other committer on the project
            experienced similar pain as me.
            >>>>>>>>> Then, think about users who aren't committers
            trying to use the
            >>>>>>>>> functionality!
            >>>>>>>>>
            >>>>>>>>> On Sat, Apr 8, 2023 at 12:51 PM Michael
            Sokolov <msoko...@gmail.com> wrote:
            >>>>>>>>>> What you said about increasing dimensions
            requiring a bigger ram buffer on merge is wrong. That's
            the point I was trying to make. Your concerns about
            merge costs are not wrong, but your conclusion that we
            need to limit dimensions is not justified.
            >>>>>>>>>>
            >>>>>>>>>> You complain that HNSW sucks and doesn't
            scale, but when I show it scales linearly with
            dimension you just ignore that and complain about
            something entirely different.
            >>>>>>>>>>
            >>>>>>>>>> You demand that people run all kinds of
            tests to prove you wrong but when they do, you don't
            listen and you won't put in the work yourself or
            complain that it's too hard.
            >>>>>>>>>>
            >>>>>>>>>> Then you complain about people not meeting
            you half way. Wow
            >>>>>>>>>>
            >>>>>>>>>> On Sat, Apr 8, 2023, 12:40 PM Robert Muir
            <rcm...@gmail.com> wrote:
            >>>>>>>>>>> On Sat, Apr 8, 2023 at 8:33 AM Michael Wechner
            >>>>>>>>>>> <michael.wech...@wyona.com> wrote:
            >>>>>>>>>>>> What exactly do you consider reasonable?
            >>>>>>>>>>> Let's begin a real discussion by being
            HONEST about the current
            >>>>>>>>>>> status. Please put political correctness and
            your own company's wishes
            >>>>>>>>>>> aside, we know it's not in a good state.
            >>>>>>>>>>>
            >>>>>>>>>>> Current status is the one guy who wrote the
            code can set a
            >>>>>>>>>>> multi-gigabyte ram buffer and index a small
            dataset with 1024
            >>>>>>>>>>> dimensions in HOURS (i didn't ask what
            hardware).
            >>>>>>>>>>>
            >>>>>>>>>>> My concerns are everyone else except the
            one guy, I want it to be
            >>>>>>>>>>> usable. Increasing dimensions just means
            even bigger multi-gigabyte
            >>>>>>>>>>> ram buffer and bigger heap to avoid OOM on
            merge.
            >>>>>>>>>>> It is also a permanent backwards
            compatibility decision, we have to
            >>>>>>>>>>> support it once we do this and we can't
            just say "oops" and flip it
            >>>>>>>>>>> back.
            >>>>>>>>>>>
            >>>>>>>>>>> It is unclear to me, if the multi-gigabyte
            ram buffer is really to
            >>>>>>>>>>> avoid merges because they are so slow and
            it would be DAYS otherwise,
            >>>>>>>>>>> or if it's to avoid merges so it doesn't hit
            OOM.
            >>>>>>>>>>> Also from personal experience, it takes
            trial and error (means
            >>>>>>>>>>> experiencing OOM on merge!!!) before you
            get those heap values correct
            >>>>>>>>>>> for your dataset. This usually means
            starting over which is
            >>>>>>>>>>> frustrating and wastes more time.
            >>>>>>>>>>>
            >>>>>>>>>>> Jim mentioned some ideas about the memory
            usage in IndexWriter, seems
            >>>>>>>>>>> to me like it's a good idea. Maybe the
            multigigabyte ram buffer can be
            >>>>>>>>>>> avoided in this way and performance
            improved by writing bigger
            >>>>>>>>>>> segments with lucene's defaults. But this
            doesn't mean we can simply
            >>>>>>>>>>> ignore the horrors of what happens on
            merge. merging needs to scale so
            >>>>>>>>>>> that indexing really scales.
            >>>>>>>>>>>
            >>>>>>>>>>> At least it shouldn't spike RAM on trivial
            data amounts and cause OOM,
            >>>>>>>>>>> and definitely it shouldn't burn hours and
            hours of CPU in O(n^2)
            >>>>>>>>>>> fashion when indexing.
            >>>>>>>>>>>
            >>>>>>>>>>>
            
---------------------------------------------------------------------
            >>>>>>>>>>> To unsubscribe, e-mail:
            dev-unsubscr...@lucene.apache.org
            >>>>>>>>>>> For additional commands, e-mail:
            dev-h...@lucene.apache.org
            >>>>>>>>>>>
            >>>>>>>>>
            
            >>>>>>>
            >>>>>>> --
            >>>>>>> Adrien
            >>>>>>>
            >>>>>>>
            
            >>>>>>
            
            >>>>
            >>>> --
            >>>> http://www.needhamsoftware.com (work)
            >>>> http://www.the111shift.com (play)
            >>
            >
            


            
