José Romildo Malaquias wrote:
> Currently the application has an option to indirectly import movie data
> from web pages. For that first the user should access the page in a web
> browser. Then the user should copy the rendered text in the web browser
> into an import window in my application and click an "import" button. In
> response the application parses the given text and collects any relevant
> data it knows about, using regular expressions.
>
> For instance, to get the director information from a movie in the
> AllCenter web site I use the following regular expression:
>
>     ^Direção:\s+(.+)$
>
> I want to modify this scheme in order to eliminate the need to copy the
> rendered text from a web browser. Instead my application should download
> and parse the HTML page directly.
>
> Which libraries are available in Haskell that would make it easy to get
> content information from a HTML document, in the way described above?

To parse HTML documents, I've had success with TagSoup in the past. You can
take a look at the HTTP package to download the HTML from the server. Both
packages are available from Hackage.

HTH, Jochem

-- 
Jochem Berndsen | joc...@functor.nl
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
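
For illustration, here is a rough sketch of how the "Direção:" lookup from the question might be done with TagSoup instead of a regular expression over copied text. The names `trim`, `textNodes` and `fieldAfter` are made up for this example, and the markup of the AllCenter page is not known here, so the sketch only assumes that the label appears at the start of some text node and that the value either follows in the same node or is the next non-empty one.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, stripPrefix)
import Text.HTML.TagSoup (fromTagText, isTagText, parseTags)

-- Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- Non-empty, trimmed text nodes of a page, in document order.
textNodes :: String -> [String]
textNodes html =
  filter (not . null)
    [ trim (fromTagText t) | t <- parseTags html, isTagText t ]

-- Hypothetical helper: find the first text node that starts with the
-- given label and return the value that follows it. If the label fills
-- the whole node (e.g. it sits inside its own <b> element), the next
-- non-empty text node is taken as the value. This is an assumption
-- about the page layout, not something the site guarantees.
fieldAfter :: String -> String -> Maybe String
fieldAfter label html = go (textNodes html)
  where
    go [] = Nothing
    go (t : rest) =
      case stripPrefix label t of
        Nothing -> go rest
        Just v
          | not (null (trim v)) -> Just (trim v)
          | otherwise ->
              case rest of
                (next : _) -> Just next
                []         -> Nothing
```

With this, `fieldAfter "Direção:" html` plays the role of the regular expression above. The `html` string could come from the HTTP package, for example via `simpleHTTP (getRequest url) >>= getResponseBody` from `Network.HTTP`; as far as I know that body is not character-decoded, so a page served as UTF-8 may need decoding before a non-ASCII label such as "Direção:" will match.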