Re: [Pharo-users] Porting Transducers to Pharo

Stephane Ducasse Fri, 02 Jun 2017 12:07:45 -0700

Hi Steffen,

This is great news. We need cool frameworks.
- There is a package in the Cincom Store that supports migration from VW to
Pharo. FileOuter something; the name escapes me right now. We updated it
last year to help port an application to Pharo.
- I can help producing a nice document :)


On Wed, May 31, 2017 at 2:23 PM, Steffen Märcker <merk...@web.de> wrote:

> Hi,
>
> I am the developer of the library 'Transducers' for VisualWorks. It was
> formerly known as 'Reducers', but this name was a poor choice. I'd like to
> port it to Pharo, if there is any interest on your side. I hope to learn
> more about Pharo in this process, since I am mainly a VW guy. And most
> likely, I will come up with a bunch of questions. :-)
>
> Meanwhile, I'll cross-post the introduction from VWnc below. I'd be very
> happy to hear your opinions and questions, and I hope we can start a
> fruitful discussion, even if there is no Pharo port yet.
>
> Best, Steffen
>
>
>
> Transducers are building blocks that encapsulate how to process elements
> of a data sequence independently of the underlying input and output source.
>
>
>
> # Overview
>
> ## Encapsulate
> Implementations of enumeration methods, such as #collect:, share the logic
> for processing a single element.
> However, that logic is reimplemented each and every time. Transducers make
> it explicit and facilitate re-use and coherent behavior.
> For example:
> - #collect: requires mapping: (aBlock1 map)
> - #select: requires filtering: (aBlock2 filter)
>
>
> ## Compose
> In practice, algorithms often require multiple processing steps, e.g.,
> mapping only a filtered set of elements.
> Transducers are inherently composable and thereby make the combination of
> steps explicit.
> Since transducers do not build intermediate collections, their composition
> is memory-efficient.
> For example:
> - (aBlock1 filter) * (aBlock2 map)   "(1.) filter and (2.) map elements"
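>
> As a rough illustration in plain Smalltalk (standard collection messages
> only, not the Transducers API), the following sketch compares chaining
> #select: and #collect: with a single pass over the input. All variable
> names are made up for this example.
>
> ```smalltalk
> | numbers twoPass onePass |
> numbers := 1 to: 10.
> "Two passes: #select: builds an intermediate collection for #collect:."
> twoPass := (numbers select: [:x | x even]) collect: [:x | x * x].
> "One pass: each element is filtered and mapped in the same step."
> onePass := numbers
>     inject: OrderedCollection new
>     into: [:acc :x |
>         x even ifTrue: [acc add: x * x].
>         acc].
> ```
>
> Both variants end up with 4, 16, 36, 64 and 100, but only the first one
> allocates a collection for the filtered elements along the way.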
>
>
> ## Re-Use
> Transducers are decoupled from the input and output sources, and hence,
> they can be reused in different contexts.
> For example:
> - enumeration of collections
> - processing of streams
> - communicating via channels
>
>
>
> # Usage by Example
>
> We build a coin flipping experiment and count the occurrence of heads and
> tails.
>
> First, we associate random numbers with the sides of a coin.
>
>     scale := [:x | (x * 2 + 1) floor] map.
>     sides := #(heads tails) replace.
>
> Scale is a transducer that maps numbers x between 0 and 1 to 1 and 2.
> Sides is a transducer that replaces the numbers with heads and tails by
> lookup in an array.
> Next, we choose a number of samples.
>
>     count := 1000 take.
>
> Count is a transducer that takes 1000 elements from a source.
> We keep track of the occurrences of heads and tails using a bag.
>
>     collect := [:bag :c | bag add: c; yourself].
>
> Collect is a binary block (reducing function) that collects events in a bag.
> We assemble the experiment by transforming the block using the transducers.
>
>     experiment := (scale * sides * count) transform: collect.
>
> From left to right we see the steps involved: scale, sides, count and
> collect.
> Transforming assembles these steps into a binary block (reducing function)
> we can use to run the experiment.
>
>     samples := Random new
>                   reduce: experiment
>                   init: Bag new.
>
> Here, we use #reduce:init:, which is mostly similar to #inject:into:.
> To execute a transformation and a reduction together, we can use
> #transduce:reduce:init:.
>
>     samples := Random new
>                   transduce: scale * sides * count
>                   reduce: collect
>                   init: Bag new.
>
> We can also express the experiment as data-flow using #<~.
> This enables us to build objects that can be re-used in other experiments.
>
>     coin := sides <~ scale <~ Random new.
>     flip := Bag <~ count.
>
> Coin is an eduction, i.e., it binds transducers to a source and
> understands #reduce:init: among others.
> Flip is a transformed reduction, i.e., it binds transducers to a reducing
> function and an initial value.
> By sending #<~, we draw further samples from flipping the coin.
>
>     samples := flip <~ coin.
>
> This yields a new Bag with another 1000 samples.
>
>
>
> # Basic Concepts
>
> ## Reducing Functions
>
> A reducing function represents a single step in processing a data sequence.
> It takes an accumulated result and a value, and returns a new accumulated
> result.
> For example:
>
>     collect := [:col :e | col add: e; yourself].
>     sum := #+.
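>
> For readers without the library at hand, a reducing function can be tried
> out with the standard #inject:into: message. This is only a sketch: it
> spells sum as a plain block instead of the symbol #+, and it does not
> involve #reduce:init: or any other part of Transducers.
>
> ```smalltalk
> | collect sum bag total |
> "Binary blocks: accumulated result first, current element second."
> collect := [:col :e | col add: e; yourself].
> sum := [:acc :e | acc + e].
> bag := #(heads tails heads) inject: Bag new into: collect.
> total := #(1 2 3 4) inject: 0 into: sum.
> "bag now counts heads twice and tails once; total is 10."
> ```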
>
> A reducing function can also be ternary, i.e., it takes an accumulated
> result, a key and a value.
> For example:
>
>     collect := [:dict :k :v | dict at: k put: v; yourself].
>
> Reducing functions may be equipped with an optional completing action.
> After finishing processing, it is invoked exactly once, e.g., to free
> resources.
>
>     stream := [:str :e | str nextPut: e; yourself] completing: #close.
>     absSum := #+ completing: #abs.
>
> A reducing function can end processing early by signaling Reduced with a
> result.
> This mechanism also enables the treatment of infinite sources.
>
>     nonNil := [:res :e |
>         e ifNil: [Reduced signalWith: res] ifNotNil: [:x | res]].
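>
> The idea of ending a reduction early by signaling can be sketched with
> standard exception handling. Here Notification is merely a stand-in for
> Reduced (an assumption made for this example), and the loop is a plain
> #do: rather than the library's reducing protocol.
>
> ```smalltalk
> | total result |
> total := 0.
> result := [
>     #(1 2 nil 4) do: [:e |
>         "Signal as soon as a nil element shows up."
>         e ifNil: [Notification signal].
>         total := total + e].
>     total]
>         on: Notification
>         do: [:ex | ex return: total].
> "result is 3: the elements after nil are never visited."
> ```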
>
> The primary approach to process a data sequence is the reducing protocol
> with the messages #reduce:init: and #transduce:reduce:init: if transducers
> are involved.
> The behavior is similar to #inject:into: but in addition it takes care of:
> - handling binary and ternary reducing functions,
> - invoking the completing action after finishing, and
> - stopping the reduction if Reduced is signaled.
> The message #transduce:reduce:init: just combines the transformation and
> the reducing step.
>
> However, as reducing functions are step-wise in nature, an application may
> choose other means to process its data.
>
>
> ## Reducibles
>
> A data source is called reducible if it implements the reducing protocol.
> Default implementations are provided for collections and streams.
> Additionally, blocks without an argument are reducible, too.
> This makes it possible to adapt custom data sources without additional effort.
> For example:
>
>     "XStreams adaptor"
>     xstream := filename reading.
>     reducible := [[xstream get] on: Incomplete do: [Reduced signal]].
>
>     "natural numbers"
>     n := 0.
>     reducible := [n := n+1].
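>
> Why a block without arguments can act as a data source is perhaps easiest
> to see by pulling values from it by hand. The sketch below uses only
> #value and #timesRepeat:; how the library itself drives such a block is
> not shown here.
>
> ```smalltalk
> | n naturals firstFive |
> n := 0.
> "Each evaluation of the block yields the next natural number."
> naturals := [n := n + 1].
> firstFive := OrderedCollection new.
> 5 timesRepeat: [firstFive add: naturals value].
> "firstFive now holds 1, 2, 3, 4 and 5."
> ```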
>
>
> ## Transducers
>
> A transducer is an object that transforms a reducing function into another.
> Transducers encapsulate common steps in processing data sequences, such as
> map, filter, concatenate, and flatten.
> A transducer transforms a reducing function into another via #transform:
> in order to add those steps.
> They can be composed using #* which yields a new transducer that does both
> transformations.
> Most transducers require an argument, typically blocks, symbols or numbers:
>
>     square := Map function: #squared.
>     take := Take number: 1000.
>
> To facilitate compact notation, the argument types implement corresponding
> methods:
>
>     squareAndTake := #squared map * 1000 take.
>
> Transducers requiring no argument are singletons and can be accessed by
> their class name.
>
>     flattenAndDedupe := Flatten * Dedupe.
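>
> To make "transforms a reducing function into another" more concrete, here
> is a minimal sketch with plain blocks. The names mapping, filtering and
> compose are hypothetical helpers written for this example; they only mimic
> the roles of #map, #filter and #* and are not the library's implementation.
>
> ```smalltalk
> | mapping filtering compose collect oddSquares step result |
> "Each helper answers a block that wraps a reducing function rf."
> mapping := [:f | [:rf | [:acc :e | rf value: acc value: (f value: e)]]].
> filtering := [:p | [:rf | [:acc :e |
>     (p value: e) ifTrue: [rf value: acc value: e] ifFalse: [acc]]]].
> "Composition: elements pass through t1 first, then through t2."
> compose := [:t1 :t2 | [:rf | t1 value: (t2 value: rf)]].
> collect := [:col :e | col add: e; yourself].
> oddSquares := compose
>     value: (filtering value: [:x | x odd])
>     value: (mapping value: [:x | x * x]).
> step := oddSquares value: collect.
> result := (1 to: 6) inject: OrderedCollection new into: step.
> "result holds 1, 9 and 25."
> ```
>
> Note that step is itself just another binary block, so it can be handed to
> anything that accepts a reducing function.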
>
>
>
> # Advanced Concepts
>
> ## Data flows
>
> Processing a sequence of data can often be regarded as a data flow.
> The operator #<~ allows defining a flow from a data source through
> processing steps to a drain.
> For example:
>
>     squares := Set <~ 1000 take <~ #squared map <~ (1 to: 1000).
>     fileOut writeStream <~ #isSeparator filter <~ fileIn readStream.
>
> In both examples #<~ is only used to set up the data flow using reducing
> functions and transducers.
> In contrast to streams, transducers are completely independent from input
> and output sources.
> Hence, we have a clear separation of reading data, writing data and
> processing elements.
> - Sources know how to iterate over data with a reducing function, e.g.,
> via #reduce:init:.
> - Drains know how to collect data using a reducing function.
> - Transducers know how to process single elements.
>
>
> ## Reductions
>
> A reduction binds an initial value or a block yielding an initial value to
> a reducing function.
> The idea is to define a ready-to-use process that can be applied in
> different contexts.
> Reducibles handle reductions via #reduce: and #transduce:reduce:.
> For example:
>
>     sum := #+ init: 0.
>     sum1 := #(1 1 1) reduce: sum.
>     sum2 := (1 to: 1000) transduce: #odd filter reduce: sum.
>
>     asSet := [:set :e | set add: e; yourself] initializer: [Set new].
>     set1 := #(1 1 1) reduce: asSet.
>     set2 := (1 to: 1000) transduce: #odd filter reduce: asSet.
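>
> The point of bundling a reducing function with its initial value can be
> imitated in plain Smalltalk by closing over both in a block. This is a
> sketch under that assumption only; the sum block below is a hypothetical
> stand-in and not what #init: answers.
>
> ```smalltalk
> | sumFn sum sum1 sum2 |
> sumFn := [:acc :e | acc + e].
> "The initial value 0 travels with the reducing function."
> sum := [:source | source inject: 0 into: sumFn].
> sum1 := sum value: #(1 1 1).
> sum2 := sum value: ((1 to: 1000) select: [:x | x odd]).
> "sum1 is 3; sum2 is 250000, the sum of the odd numbers up to 999."
> ```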
>
> By combining a transducer with a reduction, a process can be further
> modified.
>
>     sumOdds := sum <~ #odd filter.
>     setOdds := asSet <~ #odd filter.
>
>
> ## Eductions
>
> An eduction combines a reducible data source with a transducer.
> The idea is to define a transformed (virtual) data source that need not
> be stored in memory.
>
>     odds1 := #odd filter <~ #(1 2 3) readStream.
>     odds2 := #odd filter <~ (1 to: 1000).
>
> Depending on the underlying source, eductions can be processed once
> (streams, e.g., odds1) or multiple times (collections, e.g., odds2).
> Since no intermediate data is stored, transducer actions are lazy, i.e.,
> they are invoked each time the eduction is processed.
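>
> The laziness described above can be observed with a small stand-in written
> in plain Smalltalk. The odds block below is a hypothetical imitation of an
> eduction over a collection: it takes a reducing function and an initial
> value, and the counter calls records how often the filter step runs.
>
> ```smalltalk
> | calls odds sum count |
> calls := 0.
> "Nothing is stored: the filter step runs again on every reduction."
> odds := [:rf :init |
>     (1 to: 10) inject: init into: [:acc :e |
>         calls := calls + 1.
>         e odd ifTrue: [rf value: acc value: e] ifFalse: [acc]]].
> sum := odds value: [:a :b | a + b] value: 0.
> count := odds value: [:acc :e | acc + 1] value: 0.
> "sum is 25, count is 5, and calls is 20 after the two reductions."
> ```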
>
>
>
> # Origins
>
> Transducers is based on the same-named Clojure library and its ideas.
> Please see:
> http://clojure.org/transducers
>
>
