On Sunday 28 January 2007 09:14, Neil Mitchell wrote:
> Hi Alistair,
>
> > > Is there a simple way to get the contents of a webpage using Haskell on
> > > a Windows box?
> >
> > This isn't exactly what you want, but it gets you partway there. Not
> > sure if LineBuffering or NoBuffering is the best option. Line
> > buffering should be fine for just text output, but if you request a
> > binary object (like an image) then you have to read exactly the number
> > of bytes specified, and no more.
>
> This works great for haskell.org, unfortunately it doesn't work as
> well with the rest of the web universe.
>
> With www.google.com I get: Program error: <handle>: IO.hGetChar:
> illegal operation
>
> With www.slashdot.org I get: 501 Not Implemented returned
>
> www.msnbc.msn.com works fine.
>
> Any ideas why?

At the very least it's missing the HTTP version on the request line,
and you almost always need to send a Host header. For a start you
could try changing client to:

    client server port page = do
        h <- connectTo server (PortNumber port)
        hSetBuffering h NoBuffering
        putStrLn "send request"
        -- request line, now with the HTTP version
        hPutStrLn h ("GET " ++ page ++ " HTTP/1.1\r")
        -- HTTP/1.1 requires a Host header
        hPutStrLn h ("Host: " ++ server ++ "\r")
        -- a single blank line ends the headers
        hPutStrLn h "\r"
        putStrLn "wait for response"
        readResponse h
        putStrLn ""

Note that I haven't tried this, or the rest of Alistair's code at all,
so the usual 30 day money back guarantee doesn't apply. It certainly
won't handle redirects.

> Are there any alternatives to read in a file off the
> internet (i.e. wget but as a library)

The http library sort of works most of the time, but there are several
bugs that cause it to fail on many 'in the wild' webservers. HXT has a
wrapper around a command line invocation of cURL. It works better.
There is still a problem with redirects, but that's an easy enough fix.
I doubt that it would be very easy to extract it from the surrounding
HXT framework though.

It would be nice to have a binding to libcurl.

Daniel
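
The request described above (request line with an HTTP version, a Host
header, then a blank line) can also be built as a pure string before
anything touches the socket, which makes it easy to inspect. This is
only a sketch: buildRequest, statusCode and isRedirect are hypothetical
names, not part of Alistair's code or of any library, and the
"Connection: close" header is an added assumption so that a reader that
runs until end-of-file will terminate against an HTTP/1.1 server.

```haskell
import Data.Char (isDigit)

-- Build a minimal HTTP/1.1 GET request as a plain String.
-- Every line carries an explicit CRLF, so it should be sent with
-- hPutStr rather than hPutStrLn.
buildRequest :: String -> String -> String
buildRequest server page = concatMap (++ "\r\n")
  [ "GET " ++ page ++ " HTTP/1.1"  -- request line with the HTTP version
  , "Host: " ++ server             -- required by HTTP/1.1
  , "Connection: close"            -- assumption: server closes when done
  , ""                             -- blank line ends the headers
  ]

-- Extract the numeric code from a status line such as
-- "HTTP/1.1 301 Moved Permanently"; Nothing if the line is malformed.
statusCode :: String -> Maybe Int
statusCode line = case words line of
  (_ : code : _) | length code == 3, all isDigit code -> Just (read code)
  _ -> Nothing

-- 3xx codes are redirects, which the client above does not follow.
isRedirect :: Int -> Bool
isRedirect code = code >= 300 && code < 400
```

In client, the three hPutStrLn calls would then become a single
hPutStr h (buildRequest server page), and the first line of the
response could be passed to statusCode to notice a redirect instead of
silently printing it.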