"John A. De Goes" <j...@n-brain.net> writes:

> I forgot about links. In that case, consider:  getUniqueFilesInDirRecursive.
>
> Attacking irrelevant details in an argument is often called a  "strawman 
> attack". Such attacks are
> pointless because they do not  address the real substance of the issue. My 
> example is easily
> modified  to avoid the issues you raise.
>
> Consider the fact that many file-based operations _can and are  parallelized 
> manually by
> developers_. The challenge for next  generation language and effect system 
> designers is to figure
> out _how_  such operations can be automatically parallelized, given 
> sufficient  constraints,
> high-level constructs, and a powerful effect system.
>
> Saying, "I don't know exactly how it will look," is quite a bit  different 
> from saying "It can't
> be done." I claim the former.
>
> Regards,
I am sorry, but it's not about details, but about the essence. My point was 
that there are a lot
of subtle issues when we're dealing with (e.g.) a file system in a real-world 
manner.  
I have no doubt that it is possible to develop a sound logical system that 
would cover them and
then encode it as a part of the type system of a language. That will probably 
lead to 
compile-time detection of a wider class of errors. But the problem is that one 
(IMHO) cannot 
deal with these subtleties in a generic manner. That is, there are things that 
may be done in
parallel, and there are things that may not -- and it depends on the actual 
task you want to
perform. So basically instead of manually parallelizing something, you'll 
(IMHO) end up writing
complex type annotations so that a compiler could parallelize it on its own 
behalf. 
As somebody who have a certain experience with formal methods, I doubt that the 
latter will ever
be simplier than the former.


> John A. De Goes
> N-Brain, Inc.
> The Evolution of Collaboration
>
> http://www.n-brain.net    |    877-376-2724 x 101
>
> On Aug 16, 2009, at 12:38 AM, Artem V. Andreev wrote:
>
>> "John A. De Goes" <j...@n-brain.net> writes:
>>
>>> On Aug 15, 2009, at 6:36 AM, Jason Dusek wrote:
>>>> 2009/08/14 John A. De Goes <j...@n-brain.net>:
>>>>> Hmmm, my point (perhaps I wasn't clear), is that different
>>>>> effects have different commutability properties. In the case
>>>>> of a file system, you can commute two sequential reads from
>>>>> two different files.
>>>>
>>>> I think this is a bad example -- it's not something that's
>>>> safe in general and discredits your idea. How would the
>>>> compiler even know that two files are not actually the same
>>>> file?
>>>
>>> I don't think the file system is the best example. However, I do  think  
>>> it's a reasonable one.
>>>
>>> Let's say the type of the function getFilesInDir is annotated in  such  a 
>>> way as to tell the
>>> effect
>>> system that every file in the returned  array is unique. Further,  let's 
>>> say the type of the
>>> function  makeNewTempFile is annotated in such a way as to tell the  effect 
>>>  system that the
>>> function will succeed in creating a new temp file with  a name  unique from 
>>> any other existing
>>> file.
>> Sorry, but this example is ridiculuous. While file *names* in this  case 
>> might be reasonably
>> assumed
>> to be unique, the *files* themselves may not. Any modern filesystem  does 
>> support file aliasing,
>> and usually several forms thereof. And what does makeNewTempFile  function 
>> do? Does it create a
>> new
>> file like POSIX mktemp() and return its name, or does it rather  behave as 
>> POSIX mkstemp()?
>> The first case is a well known security hole, and the second case  does not, 
>> as it seems to me,
>> fit
>> well into the rest of your reasoning.
>>
>> However, let's consider further file system tree traversal. In some  cases 
>> you might not care,
>> whether
>> some of the directories you descend into are actually the same  directory, 
>> so your proposed
>> optimization
>> would be `safe'. However, in other cases sequential traversal would  work, 
>> while a parallelized
>> version
>> would not, unless special additional measures are taken. E.g.  consider a 
>> case of a build
>> system. It
>> traverses a source tree, finds sources files and if corresponding  object 
>> files are non-existent
>> or
>> outdated, does something to regenerate them. Now if you have a  directory 
>> that's actually a link
>> to
>> another directory, and you do sequential traversal, everything is  fine: you 
>> descend into the
>> directory
>> the first time, build everything there and when you descend into it  the 
>> second time, there's
>> just nothing
>> to do. If you do parallel traversal, you may well end up in the  situation 
>> where two threads
>> check
>> simultaneously for an object file, discover it's outdated and run  two build 
>> processes
>> simultaneously,
>> with the most likely effect of corrupted object file.
>>
>>
>>> Then if you write a recursive function that loops through all files  in  a 
>>> directory, and for
>>> each
>>> file, it parses and compiles the file into a  new temp file, then a  
>>> sufficiently sophisticated
>>> compiler should be  able to safely transform the recursion into  parallel 
>>> parsing and
>>> compilation
>>> -- in a way that's provably correct, assuming the original  program  was 
>>> correct.
>>>
>>> The promise of a language with a purely functional part and a  powerful  
>>> effect system for
>>> everything else is very great. And very important in  the massively  
>>> concurrent world we are
>>> entering.
>>>
>>>> Well, yes -- which sounds like, there are no guarantees
>>>> in general. Something that works half the time leaves you with
>>>> two responsibilities -- the old responsibility of the work you
>>>> did when you didn't have it and the new responsibility of
>>>> knowing when it applies and when it doesn't.
>>>
>>> In the other thread, I brought up the example of buffering reads.   Library 
>>> authors make the
>>> decision to buffer for one reason: because if  some other program  is 
>>> messing with the data,
>>> you're
>>> screwed no matter  what.
>>>
>>> And yeah, "they might be screwing with the data in just the way  you  need 
>>> it to be screwed
>>> with,"
>>> (Sebastian), in which case my advice is  use C and hope for the  best. :-)
>>>
>>> Regards,
>>>
>>> John A. De Goes
>>> N-Brain, Inc.
>>> The Evolution of Collaboration
>>>
>>> http://www.n-brain.net    |    877-376-2724 x 101
>>>
>>>
>>> _______________________________________________
>>> Haskell-Cafe mailing list
>>> Haskell-Cafe@haskell.org
>>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>>
>>
>> -- 
>>
>>                                      S. Y. A(R). A.
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe@haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>

-- 

                                        S. Y. A(R). A.
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to