I've been watching this for a bit and was thinking that a combination of a
message bus (RabbitMQ?) and Cro should provide most of what you'd need for
a backbone.

The fact that Raku has Supplies and Channels built in makes this feel like a
problem that should be reasonably easy to solve.
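
For instance, a very rough sketch of fanning work out over a Channel to a few
worker tasks and collecting the results (the numbers and names here are just
for illustration, not a real workload):

    my $work    = Channel.new;
    my $results = Channel.new;

    my @workers = do for ^4 {
        start {
            react {
                whenever $work -> $n {
                    $results.send($n * $n);   # stand-in for a real job
                }
            }
        }
    }

    $work.send($_) for 1..10;
    $work.close;
    await @workers;
    $results.close;
    say $results.list.sum;   # 385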

This is probably me coming from a position of not knowing nearly enough
about the problem space though.

On Mon, 29 Nov 2021 at 06:35, Piper H <pott...@gmail.com> wrote:

> William, I haven't used SparkR. I use R primarily for plotting.
>
> Spark's basic API is quite simple: it does distributed versions of map,
> filter, group, reduce, etc., which are all covered by Perl's map, sort, and
> grep functions IMO.
>
> For instance, this common sort of statistics job on Spark:
>
> >>> fruit.take(5)
> [('peach', 1), ('apricot', 2), ('apple', 3), ('haw', 1), ('persimmon', 9)]
> >>> fruit.filter(lambda x: x[0] == 'apple').reduceByKey(lambda x, y: x + y).collect()
> [('apple', 86)]
>
> This is easily implemented with Perl's grep and map functions, but we need
> a distributed computing framework for Perl 6.
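>
> For example, a rough single-machine equivalent in Raku might look like this
> (the data here is made up for illustration):
>
>     my @fruit = peach => 1, apricot => 2, apple => 3, haw => 1, apple => 83;
>     say @fruit.grep(*.key eq 'apple').map(*.value).sum;   # 86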
>
> Yes, there is already the perl-spark project:
> https://github.com/perl-spark/Spark
> but it hasn't been updated in many years. I don't think it's still in
> active development.
>
> So I asked the original question.
>
> Thank you.
> Piper
>
>
> On Mon, Nov 29, 2021 at 1:44 PM William Michels <w...@caa.columbia.edu>
> wrote:
>
>> Hi Piper!
>>
>> Have you used SparkR (R on Spark)?
>>
>> https://spark.apache.org/docs/latest/sparkr.html
>>
>> I'm encouraged by the data-type mapping between R and Spark. It
>> suggests to me that with a reasonable Spark API, mapping data types
>> between Raku and Spark should be straightforward:
>>
>>
>> https://spark.apache.org/docs/latest/sparkr.html#data-type-mapping-between-r-and-spark
>>
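>> Just to make that concrete, here is a purely hypothetical sketch of what a
>> type mapping might look like on the Raku side (no such binding exists yet;
>> this is only an illustration):
>>
>>     # hypothetical Raku-to-Spark-SQL type mapping (illustration only)
>>     my %raku-to-spark =
>>         Int      => 'IntegerType',
>>         Num      => 'DoubleType',
>>         Rat      => 'DecimalType',
>>         Str      => 'StringType',
>>         Bool     => 'BooleanType',
>>         DateTime => 'TimestampType',
>>         Array    => 'ArrayType';
>>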
>> Best Regards,
>>
>> Bill.
>>
>>
>> On Sat, Nov 27, 2021 at 12:16 AM Piper H <pott...@gmail.com> wrote:
>> >
>> > I use Perl 5 every day for data statistics.
>> > The scripts run on a single server for the computing tasks.
>> > I also use R, which I use in a similar way.
>> > When we face very large data sets, we switch to Apache Spark for
>> distributed computing.
>> > Spark's interface languages (Python, Scala, even Ruby) are not as
>> flexible, but the computing capability is amazing, because the whole
>> cluster contributes its computing power.
>> > Yes, I know Perl 5 is somewhat old, but why don't we build a distributed
>> computing framework like Spark for Perl 6? It would help data programmers
>> who already know Perl a lot.
>> > I expect a lot from this project.
>> >
>> > Thanks.
>> > Piper
>>
>

-- 
Simon Proctor
Cognoscite aliquid novum cotidie

http://www.khanate.co.uk/
