On 26 March 2014, Bram Moolenaar <b...@moolenaar.net> wrote:
>
> lcd wrote:
[...]
> > I'm not saying we should change how uniq() works now, but add an
> > optional parameter that would tell it to switch modes.
> >
> >     There is a case for "stable" uniques, where the list is left
> > in the initial order, but the second and subsequent copies of each
> > element are removed.  This can be accomplished in O(N log(N)) with
> > red-black trees.  That's roughly as fast as sorting, and red-black
> > trees are the right structure to use when comparisons are more
> > expensive than moving things around (which is largely the case in
> > VimL).
> >
> >     There's also another aspect: sort() and red-black trees require
> > total ordering, while removing consecutive dupes only requires
> > equality comparisons.  Total ordering is not always possible, so
> > perhaps we should also provide a "dumb" (that is, the naive O(N^2))
> > unique, for completeness.
> >
> >     To summarize: I'm suggesting we should add an optional
> > parametter to uniq():
> >
> > - "sorted" (default when absent) - remove consecutive dupes, O(N);
> > - "stable" - remove dupes while keeping the initial order, O(N log(N));
> > - "dumb" - same as stable but require only == comparisons, O(N^2).
> 
> I wonder how often these will be used.  The goal is not to provide every
> possible function, but to provide just enough for Vim plugin writers to
> make their stuff work.

    Well, in the particular case of "stable" uniques it isn't really a
matter of convenience, but rather a matter of being able to do it at all
for useful data sizes.  For lists of elements that can be meaningfully
stringified, things are not that bad, you can do something like this:

        function! StableUniq(list)
            let seen = {}
            let uniques = []
            for elem in a:list
                if !has_key(seen, elem)
                    let seen[elem] = 1
                    call add(uniques, elem)
                endif
            endfor
            return uniques
        endfunction

    However, this doesn't work if the elements are dicts.  Dicts can
be stringified, but equal dicts can produce different stringifications
because of the order of keys.  Good luck doing a stable update in VimL
for a list of, say, 1k dicts.

    So, realistically, where do you get a list of 1k dicts?  Why, I just
got a few from getloclist().

    FWIW, I have a preliminary implementation, and I ran a few
benchmarks.  Here are few numbers:

(1) C stable unique without a custom comparison function is 12% slower
    than the eqivalent sort;

(2) C stable unique with a custom comparison function is marginally
    faster (< 2%) than the equivalent sort;

(3) C stable unique without a custom comparison function, for 1 mil.
    random numbers, with ~40% duplicates: 4.7 s;

(4) VimL function StableUniq() above, for 1 mil. random numbers: 5.3 s;

(4) C stable unique for with a custom comparison function, for 1 mil.
    random numbers: 44.3 s;

(5) naive O(N^2) VimL stable unique for 1k dicts: 5.1 s.

    Case (5) was actually my motivation for getting involved in all
this.

    The "sorted" unique you just included in Vim is actually the least
interesting case: it's O(N), it's trivial to write in VimL, and just
about any implementation is fast enough.  As the good Dr. Chip points
out, it has no business being in Vim.  The interesting case is the
"stable" one.

> > > > On the other hand, the same trees (or perhaps splays,
> > > > implemented in the same header file) might also be useful for
> > > > speeding up operations involving hash keys.
> > >
> > > Balanced tree (red-black tree, splay tree...) are useful for
> > > ordered dictionaries, but not for hashes. I don't see the link
> > > between balanced trees and hashes.
> >
> >     When you need to update the value of an element in a hash you
> > have a key and you need to locate the element.  If the key is a
> > string, this is best done with splays. *shrug*
>
> We do want to include useful functions in Vim, but only those which
> more than a few people will actually use.
[...]

    Here I was actually referring to a speedup in the way (all) dicts
work.  It would affect (or rather benefit) everybody.  But, whetever;
there are things in life more interesting than optimising Vim dicts.

    /lcd

-- 
-- 
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

--- 
You received this message because you are subscribed to the Google Groups 
"vim_dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to vim_dev+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Raspunde prin e-mail lui