"John A. De Goes" <j...@n-brain.net> writes: > I forgot about links. In that case, consider: getUniqueFilesInDirRecursive. > > Attacking irrelevant details in an argument is often called a "strawman > attack". Such attacks are > pointless because they do not address the real substance of the issue. My > example is easily > modified to avoid the issues you raise. > > Consider the fact that many file-based operations _can and are parallelized > manually by > developers_. The challenge for next generation language and effect system > designers is to figure > out _how_ such operations can be automatically parallelized, given > sufficient constraints, > high-level constructs, and a powerful effect system. > > Saying, "I don't know exactly how it will look," is quite a bit different > from saying "It can't > be done." I claim the former. > > Regards, I am sorry, but it's not about details, but about the essence. My point was that there are a lot of subtle issues when we're dealing with (e.g.) a file system in a real-world manner. I have no doubt that it is possible to develop a sound logical system that would cover them and then encode it as a part of the type system of a language. That will probably lead to compile-time detection of a wider class of errors. But the problem is that one (IMHO) cannot deal with these subtleties in a generic manner. That is, there are things that may be done in parallel, and there are things that may not -- and it depends on the actual task you want to perform. So basically instead of manually parallelizing something, you'll (IMHO) end up writing complex type annotations so that a compiler could parallelize it on its own behalf. As somebody who have a certain experience with formal methods, I doubt that the latter will ever be simplier than the former.
> John A. De Goes > N-Brain, Inc. > The Evolution of Collaboration > > http://www.n-brain.net | 877-376-2724 x 101 > > On Aug 16, 2009, at 12:38 AM, Artem V. Andreev wrote: > >> "John A. De Goes" <j...@n-brain.net> writes: >> >>> On Aug 15, 2009, at 6:36 AM, Jason Dusek wrote: >>>> 2009/08/14 John A. De Goes <j...@n-brain.net>: >>>>> Hmmm, my point (perhaps I wasn't clear), is that different >>>>> effects have different commutability properties. In the case >>>>> of a file system, you can commute two sequential reads from >>>>> two different files. >>>> >>>> I think this is a bad example -- it's not something that's >>>> safe in general and discredits your idea. How would the >>>> compiler even know that two files are not actually the same >>>> file? >>> >>> I don't think the file system is the best example. However, I do think >>> it's a reasonable one. >>> >>> Let's say the type of the function getFilesInDir is annotated in such a >>> way as to tell the >>> effect >>> system that every file in the returned array is unique. Further, let's >>> say the type of the >>> function makeNewTempFile is annotated in such a way as to tell the effect >>> system that the >>> function will succeed in creating a new temp file with a name unique from >>> any other existing >>> file. >> Sorry, but this example is ridiculuous. While file *names* in this case >> might be reasonably >> assumed >> to be unique, the *files* themselves may not. Any modern filesystem does >> support file aliasing, >> and usually several forms thereof. And what does makeNewTempFile function >> do? Does it create a >> new >> file like POSIX mktemp() and return its name, or does it rather behave as >> POSIX mkstemp()? >> The first case is a well known security hole, and the second case does not, >> as it seems to me, >> fit >> well into the rest of your reasoning. >> >> However, let's consider further file system tree traversal. In some cases >> you might not care, >> whether >> some of the directories you descend into are actually the same directory, >> so your proposed >> optimization >> would be `safe'. However, in other cases sequential traversal would work, >> while a parallelized >> version >> would not, unless special additional measures are taken. E.g. consider a >> case of a build >> system. It >> traverses a source tree, finds sources files and if corresponding object >> files are non-existent >> or >> outdated, does something to regenerate them. Now if you have a directory >> that's actually a link >> to >> another directory, and you do sequential traversal, everything is fine: you >> descend into the >> directory >> the first time, build everything there and when you descend into it the >> second time, there's >> just nothing >> to do. If you do parallel traversal, you may well end up in the situation >> where two threads >> check >> simultaneously for an object file, discover it's outdated and run two build >> processes >> simultaneously, >> with the most likely effect of corrupted object file. >> >> >>> Then if you write a recursive function that loops through all files in a >>> directory, and for >>> each >>> file, it parses and compiles the file into a new temp file, then a >>> sufficiently sophisticated >>> compiler should be able to safely transform the recursion into parallel >>> parsing and >>> compilation >>> -- in a way that's provably correct, assuming the original program was >>> correct. >>> >>> The promise of a language with a purely functional part and a powerful >>> effect system for >>> everything else is very great. And very important in the massively >>> concurrent world we are >>> entering. >>> >>>> Well, yes -- which sounds like, there are no guarantees >>>> in general. Something that works half the time leaves you with >>>> two responsibilities -- the old responsibility of the work you >>>> did when you didn't have it and the new responsibility of >>>> knowing when it applies and when it doesn't. >>> >>> In the other thread, I brought up the example of buffering reads. Library >>> authors make the >>> decision to buffer for one reason: because if some other program is >>> messing with the data, >>> you're >>> screwed no matter what. >>> >>> And yeah, "they might be screwing with the data in just the way you need >>> it to be screwed >>> with," >>> (Sebastian), in which case my advice is use C and hope for the best. :-) >>> >>> Regards, >>> >>> John A. De Goes >>> N-Brain, Inc. >>> The Evolution of Collaboration >>> >>> http://www.n-brain.net | 877-376-2724 x 101 >>> >>> >>> _______________________________________________ >>> Haskell-Cafe mailing list >>> Haskell-Cafe@haskell.org >>> http://www.haskell.org/mailman/listinfo/haskell-cafe >>> >> >> -- >> >> S. Y. A(R). A. >> _______________________________________________ >> Haskell-Cafe mailing list >> Haskell-Cafe@haskell.org >> http://www.haskell.org/mailman/listinfo/haskell-cafe > > _______________________________________________ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe > -- S. Y. A(R). A. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe