On Thu, Feb 18, 2016 at 5:43 PM, Dan Bron wrote:
> Threads are nowhere near sufficient in the contemporary grid-computing
> world, and yet we haven’t even implemented that minor piece of
> parallelism in J (which has been around for 26 years at this point)!
It's not clear, to me, that threads are a useful abstraction for
computation in J. You do sort of need them for managing some I/O
issues, but they also complicate the computational side of things,
severely.

> Raul wrote:
>> Meanwhile, it's relatively easy to work with parallel instances of J:
>> it can make sense to use one J process per cpu, and to farm work out
>> to multiple CPUs.
>
> I guess I’m challenging the assertion that it is “relatively easy to
> work with parallel instances of J”. It is true in theory, of course,
> but I take the very existence of frameworks like Hadoop and its myriad
> competitors as proof (or at least very compelling evidence) that
> practice differs from that theory.

What problems have you encountered here?

> All applications are, in theory, capable of being run in a highly
> parallel fashion (just spin up multiple instances and connect them
> together, through the file system if you have to).
>
> But if it were so easy you wouldn’t need giant companies to produce
> sophisticated frameworks which consist of hundreds of thousands to
> millions of lines of code — systems which large companies happily
> pay top dollar for — if it were actually that easy.

But I don't need them to do that. I kind of appreciate that they are
doing that - it means there are machines around for me to have fun
with. But from my point of view the "need" here is more a social need
than a practical need: You need to give people things to do.

> So, my assertion that J has not really taken advantage of its native
> notational advantage of explicitly avoiding describing the “how” (Guy
> Steele’s principal point), and that all production applications for
> the last three decades have been, and still are, fundamentally serial.

I agree that we could do more with J's notation. But only if we are
also willing to accept some constraints on our development efforts
(otherwise nothing gets done).
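On the earlier point about using one J process per CPU and farming
work out: here is a minimal sketch of that shape, written in Python
only because it is easy to run anywhere. The worker is a stand-in (a
second Python interpreter that sums the numbers it receives on stdin);
in practice it would be a J process running some script. The names
WORKER and farm_out are made up for this illustration.

```python
import os
import subprocess
import sys

# Stand-in worker: reads whitespace-separated integers on stdin and
# prints their sum. Assumption: a real setup would launch J here.
WORKER = [sys.executable, "-c",
          "import sys; print(sum(int(x) for x in sys.stdin.read().split()))"]


def farm_out(items, n_workers=None):
    """Split items across independent OS processes and combine the results."""
    n = max(1, min(n_workers or os.cpu_count() or 1, len(items)))
    chunks = [items[i::n] for i in range(n)]  # round-robin split

    # Start every worker before feeding any of them, so they overlap.
    procs = [subprocess.Popen(WORKER, stdin=subprocess.PIPE,
                              stdout=subprocess.PIPE, text=True)
             for _ in chunks]
    for p, chunk in zip(procs, chunks):
        p.stdin.write(" ".join(map(str, chunk)))
        p.stdin.close()  # EOF tells the worker to start computing

    partials = []
    for p in procs:
        partials.append(int(p.stdout.read()))
        p.stdout.close()
        p.wait()
    return sum(partials)


if __name__ == "__main__":
    print(farm_out(list(range(1, 101)), n_workers=4))
```

The workers share nothing: each gets its chunk, computes, and hands
back one small result, so the only coordination is the split at the
start and the combine at the end.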
> PS: The question of “who foots the bill” for the hardware is a
> distraction, at best. That’s a commercial decision driven by the
> value to the bill-footer of processing his data rapidly and
> efficiently. If the benefit outweighs the cost, he will happily foot
> the bill. If it doesn’t, that’s not the language’s problem.

I don't think it's ever that simple.

A problem with the "economic utility" abstraction is the idea that
real utility is measurable. In real life, people have multiple
drives/goals/needs and they switch between them both when an existing
one becomes satisfied and also when another one rises in importance.
And we can only approximate an understanding of how other people think
and work (mostly based on what we think about ourselves).

And this gets us into issues of the world's economic woes. (Which, at
their root, are as much about how people want to spend their time, and
what they think they need to do, as anything else.)

You're right that it's not the language's problem. But this is really
about people. And this is about the problems people have, and the
problems they anticipate. And, about how they have been raised. And...
so on...

(You do have a bug-out bag prepared, for when the next climate
disaster hits, right?)

> The language’s problem is, currently, that there is no simple,
> standard, and straightforward way to create a grid or cloud bill in
> the first place (because J offers no simple, standard, and
> straightforward way to take advantage of a grid or cloud to process
> huge quantities of data, though its notation was explicitly designed
> to permit that). And so far as I know, there are no current
> initiatives designed to address that problem (including zero
> initiatives led by me, the complainant).

I do not know what you mean by "bill" here.

But I've had decent success using S3 and EC2 (Amazon's services).
There are a variety of practical issues to deal with.
But a basic model of pulling a chunk of data from S3, processing it in
an EC2 instance, and putting the result back up on S3 seems to work
fairly well.

And, sure, you do need to organize that. But you can get your feet wet
doing the organizing part manually (semi-automatic processing), and
then automate things from there.

And, ok, you're not going to do all of that from within J. I prefer
working with an Ubuntu image and the AWS command line tools. And, I
also have gotten used to using jq to pull data out of those command
line tools (though that's not necessary: you can treat the JSON
results as just being line-oriented key/value pairs with a few bits of
adjacency).

The other big practical issues seem to be authentication, S3 "eventual
consistency", and resource limits.

> Let me wrap up with a quote from Roger Hui in comp.lang.apl:
>
> https://groups.google.com/d/msg/comp.lang.apl/6C5N0lbHtv8/kv2dsGbaGXIJ
>
>> A careful reading of the dictionary indicates that evaluation
>> order within an operator is unspecified. Everything points to
>> an unspecified order; nothing points to a particular order.

You are thinking, I imagine, of something very different from how I
have used J in a massively parallel system.

In my experience, the cost of communication in a massively parallel
system is significantly higher than the cost of computation. Orders of
magnitude higher.

That means that it does not currently seem to make sense to think of
massive parallelism as a way to implement one of J's primitives.
However, familiarity with J's primitives can be very useful when
thinking about how to implement massive parallelism.

> But I’ve been very careful — very careful — throughout my J career
> to explicitly *not* rely on it, to construct my code in such a way
> that it could run in parallel without my knowing. Because I was always
> afraid that some day, it would. I guess what I’m saying is: I hope,
> one day soon, my fears are justified.
If someone were to implement such a variant J interpreter I expect
that there would be other issues besides order of evaluation which
would still need attention.

-- 
Raul
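For what it's worth, the chunk model described above (pull a chunk
from s3, process it, put the result back on s3) can be sketched
roughly as follows. This is only an illustration under stated
assumptions: it shells out to `aws s3 cp`, the bucket and key names
are placeholders, the function name process_chunk is made up, and the
`work` callable stands in for whatever does the real computation
(handing the file to a J process, say). Authentication, retries and
eventual-consistency handling are all left out.

```python
import subprocess
import tempfile
from pathlib import Path


def process_chunk(bucket, in_key, out_key, work, run=subprocess.run):
    """Pull one chunk from S3, transform it locally, push the result back.

    `work` maps input bytes to output bytes. `run` is injectable so the
    flow can be exercised without AWS credentials.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "in.dat"
        dst = Path(tmp) / "out.dat"

        # Download the chunk: aws s3 cp s3://bucket/key localfile
        run(["aws", "s3", "cp", f"s3://{bucket}/{in_key}", str(src)],
            check=True)

        # Local computation on the instance (placeholder for the J step).
        dst.write_bytes(work(src.read_bytes()))

        # Upload the result under a separate key, leaving the input as is.
        run(["aws", "s3", "cp", str(dst), f"s3://{bucket}/{out_key}"],
            check=True)
```

Each chunk is independent of the others, so the same function can be
driven by hand for a few keys first and wrapped in a loop or a queue
later.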
