Beginners Digest, Vol 52, Issue 15

beginners-request Fri, 12 Oct 2012 06:50:46 -0700

Send Beginners mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        [email protected]

You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."

Today's Topics:

   1.  calling inpure functions from pure code (Emmanuel Touzery)
   2.  TextMate with Haskell GHC (Patrick Lynch)
   3. Re:  calling inpure functions from pure code (Daniel Trstenjak)
   4. Re:  calling inpure functions from pure code (Emmanuel Touzery)
   5. Re:  calling inpure functions from pure code (David McBride)
   6. Re:  TextMate with Haskell GHC (Brandon Allbery)

----------------------------------------------------------------------

Message: 1
Date: Fri, 12 Oct 2012 12:47:31 +0200
From: Emmanuel Touzery <[email protected]>
Subject: [Haskell-beginners] calling inpure functions from pure code
To: [email protected]
Message-ID:
        <cac42rennmyedgooqnal9qu3em+g4xnfg_brie3b38nhzk7u...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hello,

 I'm trying to write my first real program in Haskell, a web page scraper.
At the top level, which is impure anyway, I fetch the top-level pages
(which is IO), then call the pure functions that parse the structure.
 So far so good. That was the first part of the program and up to this
point I'm pretty sure I did it right (for the structure anyway). Most of
the demo programs and Haskell books cover this: you do the IO at the top
level, then you process and you come back to the top level to print your
results for instance.

 But here comes the problem: the pure functions that parse the structure
sometimes find links and must open another page on the site. Opening that
new page is IO and can't be pure. If I call IO functions from those pure
functions, they are not pure anymore, and since those are leaf calls,
basically my entire program becomes impure.

 I had this idea that I would make some sort of input data structure,
which would be like a lazy String reading from a file: it does IO without
the caller realizing, so the caller can stay pure. So some sort of
fake webserver, website, or HTML page data structure that is lazy. I would
then give this data structure to my pure functions that parse the data;
they call functions on that structure and can stay pure. But it
sounds really contrived and is probably completely the wrong solution.

 I can see that this is a very basic question which was probably answered
hundreds of times, but I could not find the answer so far...

 Thank you!

Emmanuel

------------------------------

Message: 2
Date: Fri, 12 Oct 2012 08:13:05 -0400
From: "Patrick Lynch" <[email protected]>
Subject: [Haskell-beginners] TextMate with Haskell GHC
To: <[email protected]>
Message-ID: <CFC802D159DB43CDB42509F3EAE74775@UserPC>
Content-Type: text/plain; charset="iso-8859-1"

Good morning,
Can anyone tell me how to configure TextMate for use with the :edit command
in ghci on a Mac? [I'm new to the Mac.]
[BTW: TextMate has a "bundle" that works well with Haskell.]
Also, can anyone recommend an online course on Category Theory?
Have a good weekend.

------------------------------

Message: 3
Date: Fri, 12 Oct 2012 15:15:23 +0200
From: Daniel Trstenjak <[email protected]>
Subject: Re: [Haskell-beginners] calling inpure functions from pure
        code
To: Emmanuel Touzery <[email protected]>
Cc: [email protected]
Message-ID: <20121012131523.GA21680@machine>
Content-Type: text/plain; charset=us-ascii

Hi Emmanuel,

when parsing the string representing a page, you could
save all the links you encounter.

After parsing, you would load the linked pages and start
parsing again.

You would repeat this until no more links are returned or a
maximum depth is reached.
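
A minimal sketch of that loop, under stated assumptions: the names `crawl`, `extractLinks`, `Url` and `Page` are hypothetical, the "parser" is a toy that treats every line starting with `link:` as a link, and the fetch function is passed in as a parameter rather than tied to any particular HTTP library. The parsing stays pure; only the loop that alternates fetching and parsing runs in a monad.

```haskell
import Data.List (nub, stripPrefix)
import Data.Maybe (mapMaybe)
import qualified Data.Set as Set

type Url  = String
type Page = String

-- Pure step: collect the links found on a page.
-- Toy format (an assumption): each line "link:<url>" names one link.
extractLinks :: Page -> [Url]
extractLinks = mapMaybe (stripPrefix "link:") . lines

-- Impure step: fetch a batch of pages, parse them purely, then repeat with
-- the links that were found, until none are new or maxDepth is exceeded.
crawl :: Monad m => (Url -> m Page) -> Int -> [Url] -> m [(Url, Page)]
crawl fetch maxDepth = go 0 Set.empty
  where
    go depth seen urls
      | depth > maxDepth || null fresh = return []
      | otherwise = do
          pages <- mapM fetch fresh
          let seen' = foldr Set.insert seen fresh
              found = concatMap extractLinks pages
          rest <- go (depth + 1) seen' found
          return (zip fresh pages ++ rest)
      where
        -- URLs in this batch that have not been fetched yet
        fresh = nub (filter (`Set.notMember` seen) urls)
```

Because `crawl` only requires `Monad m`, the same loop could be run in IO with a real fetch function or exercised in tests with a lookup table standing in for the web site.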

Greetings,
Daniel

------------------------------

Message: 4
Date: Fri, 12 Oct 2012 15:28:39 +0200
From: Emmanuel Touzery <[email protected]>
Subject: Re: [Haskell-beginners] calling inpure functions from pure
        code
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi,

> when parsing the string representing a page, you could
> save all the links you encounter.
>
> After parsing, you would load the linked pages and start
> parsing again.
>
> You would repeat this until no more links are returned or a
> maximum depth is reached.

Thanks for the tip. That sounds much more reasonable than what I 
mentioned. It seems a bit "spaghetti" to me though in a way (but maybe I 
just have to get used to the Haskell way).

To be more specific about what I want to do: I want to parse TV 
programs. On the first page I have the daily listing for a channel: 
start/end time, title, category, and an optional link.
To fully parse one TV program I can follow the link, if it's present, and 
get the extra info which is there (summary, pictures...).

So the first scheme that comes to mind is a function which takes the DOM 
tree of the daily page and returns the list of programs for that day.

Instead, what I must then do, is to return the incomplete programs: the 
data object would have the link filled in, if it's available, but the 
summary, picture... would be empty.
Then I have a "second pass" in the caller function, where for programs 
which have a link, I would fetch the extra page, and call a second 
function, which will fill in the extra data (thankfully if pictures are 
present I only store their URL so it would stop there, no need for a 
third pass for pictures).

It annoys me that the first function returns "incomplete" objects... It 
somehow feels wrong.
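
One possible way to sketch the two passes described above without a half-empty record is to give the first pass its own type, so that "what the listing page says" and "a fully resolved program" are different things. Everything here is an assumption made for illustration: the type and field names (`Listing`, `Details`, `Program`), the toy details-page format (first line is the summary, remaining lines are picture URLs), and the fetch function passed in as a parameter.

```haskell
type Url = String

-- What the daily listing page alone can tell us about a program.
data Listing = Listing
  { title      :: String
  , detailLink :: Maybe Url
  } deriving (Eq, Show)

-- Extra info that only exists on the linked page.
data Details = Details
  { summary     :: String
  , pictureUrls :: [Url]
  } deriving (Eq, Show)

-- Result of the second pass: the listing plus details, if there was a link.
data Program = Program
  { listing :: Listing
  , details :: Maybe Details
  } deriving (Eq, Show)

-- Pure parser for the details page (toy format, see the lead-in).
parseDetails :: String -> Details
parseDetails page = case lines page of
  []         -> Details "" []
  (s : pics) -> Details s pics

-- Second pass for one program: fetch only when a link is present.
complete :: Monad m => (Url -> m String) -> Listing -> m Program
complete fetch l = case detailLink l of
  Nothing  -> return (Program l Nothing)
  Just url -> do
    page <- fetch url
    return (Program l (Just (parseDetails page)))

-- Second pass over the whole day.
completeAll :: Monad m => (Url -> m String) -> [Listing] -> m [Program]
completeAll fetch = mapM (complete fetch)
```

With this shape the first function returns complete `Listing` values rather than incomplete programs, and the `Maybe Details` field records explicitly that some programs have no linked page.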

Now that I have described my problem in more detail, maybe you can think 
of a better way of doing that?

And otherwise I guess this is the policy when writing Haskell code: 
absolutely avoid spreading impure/IO tainted code, even if it maybe 
negatively affects the general structure of the program?

Thanks again for the tip though! That's definitely what I'll do if 
nothing better is suggested. It is actually probably the best way to do 
that if you want to separate IO from "pure" code.

Emmanuel

------------------------------

Message: 5
Date: Fri, 12 Oct 2012 09:39:41 -0400
From: David McBride <[email protected]>
Subject: Re: [Haskell-beginners] calling inpure functions from pure
        code
To: Emmanuel Touzery <[email protected]>
Cc: [email protected]
Message-ID:
        <can+tr43_dt8ex54zkzhpdmi0hopbns0+bcep469uerb8cpu...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

There's a better option in my opinion.  Use the monad transformer
capability of the parser you are using (I'm assuming you are using parsec
for parsing).

If you check the Hackage docs for parsec you'll see that ParsecT is an
instance of MonadIO (whenever the underlying monad is).  That means at any
point during the parsing you can go liftIO $ <any IO action> and use the
result in your parsing.  Here's an example of what that might look like.

import Control.Monad.IO.Class (MonadIO, liftIO)
import Text.Parsec

-- Parse "tvshow:" followed by one character; if that character is 'x',
-- run an IO action in the middle of the parse.
parseTvStuff :: (MonadIO m) => ParsecT String u m (Char, Maybe ())
parseTvStuff = do
  _ <- string "tvshow:"
  c <- anyChar
  morestuff <- if c == 'x'
    -- putStrLn stands in for the real IO action here
    then fmap Just $ liftIO $ putStrLn "run an http request, parse the result, and store the result in morestuff as a maybe"
    else return Nothing
  return (c, morestuff)

So you will run an http request if you get back something that seems like
it could be worth further parsing.  Then you just parse that stuff with a
separate parser and store it in your data structure and continue parsing
the rest of the first page with the original parser if you wish.
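
As a rough illustration of how a parser like this might be run, here is a minimal sketch under stated assumptions: `parseShow`, `runShow` and the `fetchDetails` parameter are hypothetical names, the input format (`tvshow:<name>` optionally followed by `|<url>`) is invented for the example, and `fetchDetails` stands in for a real HTTP request. Only `runParserT`, `liftIO` and the parsec combinators are library functions.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)
import Text.Parsec

-- Parse "tvshow:<name>" with an optional "|<url>" suffix. When a URL is
-- present, run the supplied IO action on it in the middle of the parse.
parseShow :: MonadIO m
          => (String -> IO String)   -- hypothetical fetch function
          -> ParsecT String u m (String, Maybe String)
parseShow fetchDetails = do
  _     <- string "tvshow:"
  name  <- many1 (noneOf "|\n")
  extra <- optionMaybe (char '|' *> many1 (noneOf "\n"))
  info  <- case extra of
    Just url -> Just <$> liftIO (fetchDetails url)
    Nothing  -> return Nothing
  return (name, info)

-- Run the parser over IO; there is no user state, hence the ().
runShow :: (String -> IO String)
        -> String
        -> IO (Either ParseError (String, Maybe String))
runShow fetchDetails = runParserT (parseShow fetchDetails) () "listing"
```

One thing to keep in mind with this design: if the parser later fails or backtracks, any IO action that has already run is not undone, so the request will still have been made.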


------------------------------

Message: 6
Date: Fri, 12 Oct 2012 09:50:03 -0400
From: Brandon Allbery <[email protected]>
Subject: Re: [Haskell-beginners] TextMate with Haskell GHC
To: Patrick Lynch <[email protected]>
Cc: [email protected]
Message-ID:
        <CAKFCL4Xex-emRYJ3wFoZNfeWYwUiUXuy-f1tuSbFS8-=zhm...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Fri, Oct 12, 2012 at 8:13 AM, Patrick Lynch <[email protected]> wrote:

> Good morning,
> Can anyone tell me how to configure TextMate with the :edit command in
> ghci when used with a Mac [I'm new to the Mac]?
>

http://manual.macromates.com/en/using_textmate_from_terminal.html

-- 
brandon s allbery kf8nh                               sine nomine associates
[email protected]                                  [email protected]
unix/linux, openafs, kerberos, infrastructure          http://sinenomine.net

------------------------------

_______________________________________________
Beginners mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/beginners

End of Beginners Digest, Vol 52, Issue 15
*****************************************
