Re: [Factor-talk] maximum of a seq

John Benediktsson Tue, 11 Dec 2018 13:27:00 -0800

Hi Peter,

The performance difficulties in your solution arise from using ``head*``
which makes a new sequence (allocations, copying into new memory, etc.),
rather than a lightweight slice which ``head-slice*`` does.


You are right that slices are not what you expect, you might find this
interesting:

    IN: scratchpad "abcdef" dup 3 head-slice CHAR: D suffix!
    --- Data stack:
    "abcDef"
    T{ slice f 0 3 "abcDef" }

In-place operations like ``set-nth`` and variations like ``suffix!``
operate on the underlying sequence.

The Clojure-like data structures are in ``persistent`` vocabulary.  See,
for example ``persistent.vectors``.

Some of your words like ``upper`` are in the standard library
(``ch>upper``).  Also, I found your version didn't remove the extra newline
in the input text, which resulted in an answer that was 1 too large.

This looked like a fun puzzle, and I like fun puzzles, so I thought I'd
implement it and maybe provide another solution which could help with
learning:

    USING: ascii fry kernel math sequences ;

You can tell if two characters match for a possible reaction, if they are
not the same, but are when made to lowercase.  I also really like your
subtract and absolute difference should equal 32 approach.

    : match? ( ch1 ch2 -- ? )
        2dup = [ 2drop f ] [ [ ch>lower ] bi@ = ] if ;

My approach uses a stack (either a ``vector`` when processing an array, or
an ``sbuf`` when processing a string).  If the stack is empty (``?last``
returns ``f``), we don't react, otherwise check for a match on the last
character in the stack.

    : react? ( accum ch -- ? )
        [ ?last ] dip over [ match? ] [ 2drop f ] if ;

An internal word that processes the whole string.  If we react, then we pop
the last character from the stack, otherwise push the new one onto the stack

    : (react) ( accum seq -- accum )
        [ 2dup react? [ drop dup pop* ] [ suffix! ] if ] each ;

The main word that creates a new resizable sequence to use as a stack, then
applies the reactions:

    : react ( seq -- seq' )
        [ [ length ] [ new-resizable ] [ (react) ] tri ] keep like ;

The part 2 question tries removing each letter, looking for the smallest
length of reactions:

    : remove-unit ( seq ch -- seq' )
        [ [ = ] [ ch>upper = ] 2bi or ] curry reject ;

    : shortest-react ( seq -- n )
        "abcdefghijklmnopqrstuvwxyz" [
            remove-unit react length
        ] with map infimum ;

I find that often I experiment with different approaches, sometimes using
locals and writing in a more applicative style, other times starting with
small components like ``match?`` and ``react?`` and then building up an
algorithm around it.  I do like iteration in the listener, but also find
saving my code and refactoring easier with a text file and doing F2 or
``refresh-all``, or using like you are FUEL.

Also, as for finding words.  We do want to continue improving
documentation, so if you have contributions there it would be great.  It's
sometimes useful to just browse a vocabulary and look at all the words in a
list.  Maybe we could think about adding the first sentence of its
documentation, if available, which would also help. You can see on the help
for ``min`` and ``max`` they have related words ``supremum`` and
``infimum``.  Maybe those need better names...

Best,
John.









On Tue, Dec 11, 2018 at 8:22 AM <pet...@riseup.net> wrote:

> On 2018-12-10 16:40, John Benediktsson wrote:
> > Using fry is convenient.  Due to how we bootstrap factor, we can't use
> > fry right now in "core" vocabularies, like sequences, so you'll see a
> > fair amount of curry and compose which are more or less what fry is
> > doing, just not as clean syntax.
> >
> > I love the REPL advent idea, go for it!  And as you learn or hit
> > roadblocks or have questions, please share.
> >
> > You can see how supremum-by is implemented by clicking on it in an
> > expression, or doing ``see``.  It uses ``after?`` instead of ``>``,
> > which allows it to compare other kinds of objects with ``<=>`` when
> > numbers are not passed in.
> >
> > Also, you might try using ``map-reduce`` which is a little faster than
> > using ``unclip-slice`` and ``reduce``, (and using ``_ bi@`` instead of
> > ``[ _ call ] bi@`` to simplify):
> >
> > : max-by ( seq quot -- result )
> >     [ ] swap '[ [ _ bi@ > ] 2keep ? ] map-reduce ; inline
> >
> > Best,
> > John.
> >
> > On Mon, Dec 10, 2018 at 6:46 AM Alexander Ilin <ajs...@yandex.ru>
> > wrote:
> >
> >> Hey there!
> >>
> >> Sure, you can ask us here. I'm on sick leave, so got plenty of
> >> time.
> >> Coincidentally, working on my Factor hobby project.
> >>
> >> What you're looking for is called supremum-by.
> >>
> >> Terminology is a bit uncommon. Look up supremum and infimum in the
> >> help system.
> >>
> >> 10.12.2018, 17:06, "pet...@riseup.net" <pet...@riseup.net>:
> >>> Hello fellow concatenative fans.
> >>>
> >>> I'm working my way through this year's advent of code[0]. I'm
> >> doing it
> >>> in factor with a small twist - everything happening inside the
> >> REPL.
> >>> Makes it more fun and challenging (for me), although I don't have
> >> any
> >>> code to show for it in the end.
> >>>
> >>> I'm trying to find the right tools for the job since there's a lot
> >> of
> >>> words already available. Sometimes I reinvent something before I
> >> find
> >>> it, e.g. I wrote a last-n before finding last*.
> >>>
> >>> Would it be OK if I sometimes ask you for tips on what would be
> >>> idiomatic/simpler/shorter/faster than what I have(n't) found?
> >>>
> >>> The first problem I really couldn't find is picking a maximum from
> >> a
> >>> sequence based on a quotation pulling out the key. An example will
> >> say
> >>> it the best I guess:
> >>>
> >>>> { { 1 2 } { 1 3 } { -1 4 } } [ second ] max-by .
> >>>
> >>> { -1 4 }
> >>>
> >>> Here is my definition:
> >>>
> >>> : max-by ( seq quot -- result ) [ unclip-slice ] dip '[ [ [ _ call
> >> ] bi@
> >>>> ] 2keep ? ] reduce ; inline
> >>>
> >>> Using fry for such a general word would be a bit of an overkill
> >> but for
> >>> demonstration purposes it's OK I guess.
> >>>
> >>> Is there no standard word that already does something similar? Or
> >> is
> >>> there a way to write this with the existing sequence combinators
> >> that
> >>> it's so short there was no need to create this generic word?
> >>>
> >>> --
> >>> ------------
> >>> Peter Nagy
> >>> ------------
> >>>
> >>> _______________________________________________
> >>> Factor-talk mailing list
> >>> Factor-talk@lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/factor-talk
> >>
> >> ---=====---
> >> Александр
> >>
> >> _______________________________________________
> >> Factor-talk mailing list
> >> Factor-talk@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/factor-talk
> > _______________________________________________
> > Factor-talk mailing list
> > Factor-talk@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/factor-talk
>
> Jon, Alex, hi.
>
> Any reason why supremum-by and infimum-by are not on the "Searching
> sequences" doc page? If you agree I can add these together with supremum
> and infimum.
>
> I'll take a look at map-reduce, thanks. I switched from using the factor
> IDE REPL to emacs with Fuel because today I kept creating an infinite
> loop and couldn't break it for the life of me in the IDE. In emacs it
> was C-c C-c. Since I couldn't break it I didn't even know it's an
> infinite loop. That led me to open a file and save the definitions for
> debugging purposes, so I kind of broke my rules at this point. Oh well,
> at least I solved another "day" today, although I'm rather behind at
> this point, not enough time.
>
> As far as roadblocks go, here is my solution to the 5th day. (react) is
> ugly with t/f, that's where I had the infinite loop and just quickly
> hacked the solution in. It takes quite a while to run, showing profile
> output further down:
>
> USING: arrays io.encodings.utf8 io.files kernel math math.ranges
> prettyprint
> sequences ;
>
> IN: aoc
> : reacts? ( a b -- ? ) - abs 32 = ;
> : (react) ( seq -- seq ? ) dup [ last ] [ dup length 2 - swap nth ] bi
> reacts? [ 2 head* t ] [ f ] if ;
> : has-2? ( seq -- ? ) length 2 >= ;
> : react ( seq -- seq ) dup has-2? [ (react) [ react ] when ] when ;
> : (flow) ( seq seq -- seq seq ) unclip-slice swap [ suffix! ] dip ;
> : flow ( seq seq -- seq ) dup empty? [ drop ] [ (flow) [ react ] dip
> flow ] if ;
> : load5 ( -- str ) "5.txt" utf8 file-contents ;
> : solve5 ( seq seq -- len ) flow length ;
> : upper ( c -- c ) dup 96 > [ 32 - ] when ;
> : nounit ( seq u -- seq ) [ [ upper ] dip = not ] curry filter ;
> : do5-2 ( str u -- res ) [ nounit V{ } clone swap solve5 ] keep 2array ;
> : solve5-2 ( str -- res ) CHAR: A CHAR: Z [a,b] swap [ swap do5-2 ]
> curry map [ first ] infimum-by ;
>
> ----
>
> IN: aoc V{ } clone load5
>
> --- Data stack:
> V{ }
> "bBCRGgrwgKknNGWnNWUuwHoeEtTOvEeVhYZzrZJjFfzNnpPHtTgGhNtTxBeEb..."
> IN: aoc [ solve5 ] profile top-down profile.
> depth   time ms  GC %  JIT %  FFI %   FT %
>   15    4073.0   3.41   0.00   7.64   0.00 T{ thread f "Initial"
> ~quotation~ ~quotation~ 17 ~box~ f t f H{...
>   16    3821.0   0.60   0.00   1.99   0.00   react
>   17    3818.0   0.60   0.00   1.99   0.00     (react)
>   18    3811.0   0.60   0.00   1.99   0.00       subseq-unsafe-as
>   19    1275.0   0.00   0.00   0.00   0.00         M\ array nth-unsafe
>   20     694.0   0.00   0.00   0.00   0.00           M\ fixnum
> integer>fixnum
>   19    1065.0   0.00   0.00   0.00   0.00         M\ array
> set-nth-unsafe
>   20     132.0   0.00   0.00   0.00   0.00           M\ fixnum
> integer>fixnum
>   19     579.0   0.00   0.00   0.00   0.00         +
>   19     391.0   0.00   0.00   0.00   0.00         M\ growable
> set-nth-unsafe
>   20     155.0   0.00   0.00   0.00   0.00           M\ vector
> underlying>>
>   19     359.0   0.00   0.00   0.00   0.00         M\ growable
> nth-unsafe
>   20     169.0   0.00   0.00   0.00   0.00           M\ vector
> underlying>>
>   19      78.0  29.49   0.00  97.44   0.00         M\ vector
> new-sequence
>   20      77.0  29.87   0.00  98.70   0.00           <array>
>   20       1.0   0.00   0.00   0.00   0.00           M\ fixnum
> integer>fixnum
>   19       1.0   0.00   0.00   0.00   0.00         -
>   19       1.0   0.00   0.00   0.00   0.00         integer?
>   18       3.0   0.00   0.00   0.00   0.00       M\ vector like
>   18       1.0   0.00   0.00   0.00   0.00       head*
>   19       1.0   0.00   0.00   0.00   0.00         -
>   18       1.0   0.00   0.00   0.00   0.00       M\ sequence nth
>   19       1.0   0.00   0.00   0.00   0.00         M\ integer
> bounds-check?
>   18       1.0   0.00   0.00   0.00   0.00       M\ growable nth-unsafe
>   19       1.0   0.00   0.00   0.00   0.00         M\ vector
> underlying>>
>   18       1.0   0.00   0.00   0.00   0.00       head
>   17       1.0   0.00   0.00   0.00   0.00     has-2?
>   17       1.0   0.00   0.00   0.00   0.00     >=
>   16     251.0  46.22   0.00  93.63   0.00   (flow)
>   17     240.0  48.33   0.00  97.92   0.00     push
>   18     237.0  48.95   0.00  99.16   0.00       resize-array
>   17       4.0   0.00   0.00   0.00   0.00     M\ string nth-unsafe
>   18       2.0   0.00   0.00   0.00   0.00       M\ fixnum
> integer>fixnum
>   17       3.0   0.00   0.00   0.00   0.00     M\ virtual-sequence
> nth-unsafe
>   18       2.0   0.00   0.00   0.00   0.00       M\ slice virtual@
>   19       1.0   0.00   0.00   0.00   0.00         +
>   17       2.0   0.00   0.00   0.00   0.00     M\ slice length
>   17       1.0   0.00   0.00   0.00   0.00     >
>   17       1.0   0.00   0.00   0.00   0.00     integer?
>   16       1.0   0.00   0.00   0.00   0.00   M\ slice length
>
> subseq-unsafe-as seems to be the culprit, which shows up in the output
> only when I use nth in (react), so I guess nth is being inlined and I
> don't see where is this call coming from. This is basically getting the
> second to last value from a vector, I couldn't find a simpler and faster
> solution. Maybe a clean array would have less overhead but I'd have to
> rewrite the algo and keep a "real length" or pointer into it which might
> slow the whole thing down again.
>
> So, is nth really that slow? :)
>
> Another thing that tricked me:
>
> IN: aoc V{ 1 2 3 } 2 head-slice* 3 suffix! >array .
> { 1 }
> IN: aoc V{ 1 2 3 } rest-slice 3 suffix! >array .
> { 2 3 }
>
> Slices work differently than I expected :) Are there clojure-like data
> structures that are immutable and structurally share most of their
> content?
>
>
> --
> ------------
>   Peter Nagy
> ------------
>
>
> _______________________________________________
> Factor-talk mailing list
> Factor-talk@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/factor-talk
>

_______________________________________________
Factor-talk mailing list
Factor-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/factor-talk

Re: [Factor-talk] maximum of a seq

Reply via email to