Re: [Jchat] Four Solutions to a Trivial Problem

Raul Miller Thu, 18 Feb 2016 15:57:12 -0800

On Thu, Feb 18, 2016 at 5:43 PM, Dan Bron <[email protected]> wrote:
> Threads are nowhere near sufficient in the contemporary grid-computing
> world, and yet we haven’t even implemented that minor piece of
> parallelism in J (which has been around for 26 years at this point)!

It's not clear, to me, that threads are a useful abstraction for computation
in J.

You do sort of need them for managing some I/O issues, but they also
complicate the computational side of things, severely.

> Raul wrote:
>> Meanwhile, it's relatively easy to work with parallel instances of J:
>> it can make sense to use one J process per cpu, and to farm work out
>> to multiple CPUs.
>
> I guess I’m challenging the assertion that it is “relatively easy to
> work with parallel instances of J”. It is true in theory, of course,
> but I take the very existence of frameworks like Hadoop and its myriad
> competitors as proof (or at least very compelling evidence) that
> practice differs from that theory.

What problems have you encountered here?

> All applications are, in theory, capable of being run in a highly
> parallel fashion (just spin up multiple instances and connect them
> together, through the file system if you have to).
>
> But if it were so easy you wouldn’t need giant companies to produce
> sophisticated frameworks which consist of hundreds of thousands to
> millions of lines of code — systems which large companies happily
> pay top dollar for — if it were actually that easy.

But I don't need them to do that.

I kind of appreciate that they are doing that - that means that
there are machines around for me to have fun with. But from my point
of view the "need" here is more a social need than a practical need:

You need to give people things to do.

> So, my assertion that J has not really taken advantage of its native
> notational advantage of explicitly avoiding describing the “how” (Guy
> Steele’s principal point), and that all production applications for
> the last three decades have been, and still are, fundamentally serial.

I agree that we could do more with J's notation.

But only if we are also willing to accept some constraints on our
development efforts (otherwise nothing gets done).

> PS: The question of “who foots the bill” for the hardware is a
> distraction, at best. That’s a commercial decision driven by the
> value to the bill-footer of processing his data rapidly and
> efficiently. If the benefit outweighs the cost, he will happily foot
> the bill. If it doesn’t, that’s not the language’s problem.

I don't think it's ever that simple.

A problem with the "economic utility" abstraction is the
idea that real utility is measurable. In real life, people
have multiple drives/goals/needs and they switch between
them both when an existing one becomes satisfied and also
when another one rises in importance.

And we can only approximate an understanding of how other
people think and work (mostly based on what we think
about ourselves).

And this gets us into issues of the world's economic
woes. (Which, at their root, are as much about how people
want to spend their time, and what they think they need
to do, as anything else.)

You're right that it's not the language's problem. But this
is really about people. And this is about the problems people
have, and the problems they anticipate. And, about how they
have been raised. And... so on...

(You do have a bug-out bag prepared, for when the next
climate disaster hits, right?)

> The language’s problem is, currently, that there is no simple,
> standard, and straightforward way to create a grid or cloud bill in
> the first place (because J offers no simple, standard, and
> straightforward way to take advantage of a grid or cloud to process
> huge quantities of data, though its notation was explicitly designed
> to permit that). And so far as I know, there are no current
> initiatives designed to address that problem (including zero
> initiatives led by me, the complainant).

I do not know what you mean by "bill" here.

But I've had decent success using S3 and EC2 (Amazon's
services). There are a variety of practical issues to deal
with. But a basic model of pulling a chunk of data from S3,
processing it in an EC2 instance, and putting the result back
up on S3 seems to work fairly well.
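
As a minimal sketch of that pull/process/push cycle, assuming the AWS
command line tools are installed and configured: the bucket layout
(`in/` and `out/` prefixes), the work directory and the worker command
are all hypothetical. Only `aws s3 cp` comes from the real CLI.

```python
import os
import subprocess


def s3_cp(src, dst):
    """Command line for one `aws s3 cp` copy (either end may be an s3:// URL)."""
    return ["aws", "s3", "cp", src, dst]


def chunk_commands(bucket, name, worker_cmd, workdir="/tmp"):
    """The three commands that handle one chunk: pull, process, push.

    The in/ and out/ prefixes are an assumed bucket layout.
    """
    local_in = os.path.join(workdir, "in-" + name)
    local_out = os.path.join(workdir, "out-" + name)
    return [
        s3_cp(f"s3://{bucket}/in/{name}", local_in),    # pull the chunk from S3
        list(worker_cmd) + [local_in, local_out],       # process it on this instance
        s3_cp(local_out, f"s3://{bucket}/out/{name}"),  # put the result back on S3
    ]


def run_chunk(bucket, name, worker_cmd, run=subprocess.run):
    """Run the three steps in order, stopping at the first failure."""
    for cmd in chunk_commands(bucket, name, worker_cmd):
        run(cmd, check=True)
```

A call might look like `run_chunk("my-bucket", "0001.csv", ["jconsole",
"process.ijs"])`, where `process.ijs` stands for a hypothetical J script
that reads its input and output paths from the command line. Keeping the
command list separate from the code that runs it makes it easy to print
the commands and run them by hand first.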

And, sure, you do need to organize that. But you can get
your feet wet doing the organizing part manually (semi-automatic
processing), and then automating things from there.
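
On a single multi-core machine, the automated version of that
organizing can be as small as keeping one worker process per CPU busy.
The sketch below is one way to do it, not a description of my actual
setup; `farm_out` and the worker command are hypothetical names.

```python
import os
import subprocess


def farm_out(worker_cmd, chunks, max_procs=None):
    """Run worker_cmd + [chunk] for every chunk, at most max_procs at a time.

    Returns a dict mapping each chunk name to its worker's exit code.
    """
    max_procs = max_procs or os.cpu_count() or 1
    pending = list(chunks)
    running = {}   # chunk name -> Popen handle, in start order
    results = {}
    while pending or running:
        # Start workers until every slot is full or nothing is left to start.
        while pending and len(running) < max_procs:
            chunk = pending.pop(0)
            running[chunk] = subprocess.Popen(list(worker_cmd) + [chunk])
        # Wait for the oldest running worker, then free its slot.
        chunk, proc = next(iter(running.items()))
        results[chunk] = proc.wait()
        del running[chunk]
    return results
```

The workers share nothing but the file system, so there is no thread
state to reason about. A non-zero exit code in the result marks a chunk
that needs to be rerun or inspected by hand.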

And, ok, you're not going to do all of that from within J.

I prefer working with an Ubuntu image and the AWS command
line tools. And, I also have gotten used to using jq
to pull data out of those command line tools (though that's
not necessary: you can treat the JSON results as just being
line-oriented key/value pairs with a few bits of adjacency).
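
To illustrate the line-oriented approach, here is a small sketch that
reads pretty-printed JSON one line at a time instead of parsing it. The
sample text is made up for the example (it only imitates the shape of
an object listing), and `line_pairs` is a hypothetical helper name.

```python
import re

# Matches lines such as   "Key": "chunks/0001.csv",   or   "Size": 1024
PAIR = re.compile(r'^\s*"([^"]+)":\s*(.+?),?\s*$')


def line_pairs(text):
    """Yield (key, value) for each line that holds a scalar JSON member."""
    for line in text.splitlines():
        m = PAIR.match(line)
        if not m:
            continue
        key, value = m.group(1), m.group(2)
        if value in ("{", "["):   # start of a nested object or array
            continue
        yield key, value.strip('"')


sample = '''{
    "Contents": [
        {
            "Key": "chunks/0001.csv",
            "Size": 1024
        },
        {
            "Key": "chunks/0002.csv",
            "Size": 2048
        }
    ]
}'''

# Adjacency: each "Size" line belongs to the "Key" line before it.
sizes = {}
current = None
for key, value in line_pairs(sample):
    if key == "Key":
        current = value
    elif key == "Size" and current is not None:
        sizes[current] = int(value)
```

This only works because the output puts one member per line. It is a
shortcut for quick scripts, and jq or a real JSON parser is the safer
choice once values can span lines.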

The other big practical issues seem to be authentication,
S3 "eventual consistency" and resource limits.

> Let me wrap up with a quote from Roger Hui in comp.lang.apl:
>
> https://groups.google.com/d/msg/comp.lang.apl/6C5N0lbHtv8/kv2dsGbaGXIJ
>
>> A careful reading of the dictionary indicates that evaluation
>> order within an operator is unspecified.  Everything points to
>> an unspecified order; nothing points to a particular order.

You are thinking, I imagine, of something very different
from how I have used J in a massively parallel system.

In my experience, the cost of communication in a massively
parallel system is significantly higher than the cost of
computation. Orders of magnitude higher.

That means that it does not currently seem to make sense
to think of massive parallelism as a way to implement
one of J's primitives.
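
A back-of-the-envelope model shows why. This is a sketch under stated
assumptions, not a measurement: the cost figures are invented for
illustration, and the model assumes a coordinator that hands out work
serially while the computation itself splits evenly across workers.

```python
def speedup(compute, comm, workers):
    """Serial time divided by farmed-out time, for one unit of work.

    Assumed model: the communication cost is paid in full by the
    coordinator, and the computation is split evenly across workers.
    """
    serial = compute
    farmed = comm + compute / workers
    return serial / farmed


# Fine-grained work (say, one primitive): communication is 100x the computation.
fine = speedup(compute=1.0, comm=100.0, workers=1000)

# Coarse-grained work (say, a whole chunk of data): computation dominates.
coarse = speedup(compute=100.0, comm=1.0, workers=10)
```

With these made-up numbers `fine` is far below 1, so farming out the
primitive is slower than running it serially no matter how many workers
are added, while `coarse` comes close to the worker count. That is the
case for distributing large chunks of work and leaving the primitives
alone.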

However, familiarity with J's primitives can be very
useful when thinking about how to implement massive
parallelism.

> But I’ve been very careful — very careful — throughout my J career
> to explicitly *not* rely on it, to construct my code in such a way
> that it could run in parallel without my knowing. Because I was always
> afraid that some day, it would.  I guess what I’m saying is: I hope,
> one day soon, my fears are justified.

If someone were to implement such a variant J interpreter I
expect that there would be other issues besides order of
evaluation which would still need attention.

-- 
Raul
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
