Re: XML::LibXML and HTML (in >=v1.67)

2009-04-01 Thread Tatsuhiko Miyagawa
On Wed, Apr 1, 2009 at 6:21 PM, Toby Wintermute wrote: > Thanks, Web::Scraper looks quite neat. > However I want to avoid applications breaking on random CPAN module > upgrades (as just happened with the XML::LibXML upgrade yesterday), so > I might steer clear of it until it loses the big, bold w

Re: XML::LibXML and HTML (in >=v1.67)

2009-04-01 Thread Toby Wintermute
2009/4/2 mirod : > Tatsuhiko Miyagawa wrote: >> >> On Tue, Mar 31, 2009 at 10:45 PM, Toby Wintermute >> wrote: >>> >>> The problem occurs when the html contains (the commonly used) & symbol >>> within attributes, such as: >>> [snip] > > Indeed when I tested the various ways to get XML from HTML,

Re: XML::LibXML and HTML (in >=v1.67)

2009-04-01 Thread Toby Wintermute
2009/4/1 Tatsuhiko Miyagawa : > On Tue, Mar 31, 2009 at 10:45 PM, Toby Wintermute wrote: >> The problem occurs when the html contains (the commonly used) & symbol >> within attributes, such as: >> >> >> I know that really one should escape the ampersand in those >> circumstances, however real-wor

Re: XML::LibXML and HTML (in >=v1.67)

2009-04-01 Thread Toby Wintermute
2009/4/1 Tatsuhiko Miyagawa : > On Wed, Apr 1, 2009 at 2:53 AM, Dave Cross wrote: >>> I know that really one should escape the ampersand in those >>> circumstances, however real-world web-pages rarely do this.. And this >>> behaviour was tolerated in XML::LibXML 1.66, just not subsequent >>> versi

Re: tell them about the honey

2009-04-01 Thread Dave Hodgkinson
On 1 Apr 2009, at 09:18, Dave Thorn wrote: Hi, Does anyone know who it was selling honey at the November (I believe) london.pm meet? I'd be ver' grateful. That would be Craig Knox. -- Dave Hodgkinson MSN: daveh...@hotmail.com Site: http://www.davehodgkinson.c

tell them about the honey

2009-04-01 Thread Dave Thorn
Hi, Does anyone know who it was selling honey at the November (I believe) london.pm meet? I'd be ver' grateful. Thanks, -- dave thorn (dtg)

How we see CVs

2009-04-01 Thread Paul Makepeace
This is great, http://www.hanovsolutions.com/resume_comic.png P

Re: XML::LibXML and HTML (in >=v1.67)

2009-04-01 Thread mirod
Tatsuhiko Miyagawa wrote: On Tue, Mar 31, 2009 at 10:45 PM, Toby Wintermute wrote: The problem occurs when the html contains (the commonly used) & symbol within attributes, such as: I know that really one should escape the ampersand in those circumstances, however real-world web-pages rarely

Re: Summer of code

2009-04-01 Thread Daniel Ruoso
On Thu, 2009-03-19 at 08:24 +, Léon Brocard wrote: > Once again, Perl was accepted into the Google Summer of Code. If you > know any students please suggest that they spend a few weeks hacking > an open source Perl project in return for 4500 USD. > http://socghop.appspot.com/ > http://socgho

Re: XML::LibXML and HTML (in >=v1.67)

2009-04-01 Thread Pedro Figueiredo
On 1 Apr 2009, at 06:45, Toby Wintermute wrote: Alternatively.. what do YOU use to parse real-world websites that are often not totally valid? If it's a quick hack I'll use HTML::Tidy like so: my $tidy = HTML::Tidy->new({ output_xhtml => 1, numeric_entities => 1, }); $tidy->ignore(

Re: XML::LibXML and HTML (in >=v1.67)

2009-04-01 Thread peter
Quoting Dave Cross : Toby Wintermute wrote: What you're trying to parse isn't XML. Therefore you shouldn't expect to be able to parse it with an XML parser. Alternatively.. what do YOU use to parse real-world websites that are often not totally valid? A similar problem is when writing an XML e

Re: XML::LibXML and HTML (in >=v1.67)

2009-04-01 Thread Paul Makepeace
On Wed, Apr 1, 2009 at 10:53 AM, Dave Cross wrote: > Or, alternatively, you could try the (badly named) XML::Liberal which parses > stuff that isn't really XML. This module has just been re-released as XML::Liberal::Neo with the explicit design goal of fostering the development of XML-like forma

Re: XML::LibXML and HTML (in >=v1.67)

2009-04-01 Thread Tatsuhiko Miyagawa
On Wed, Apr 1, 2009 at 2:53 AM, Dave Cross wrote: >> I know that really one should escape the ampersand in those >> circumstances, however real-world web-pages rarely do this.. And this >> behaviour was tolerated in XML::LibXML 1.66, just not subsequent >> versions.. but eh, maybe it's just the wa

Re: [ANNOUNCE] London.pm technical meeting about "Less code" on 16th April 2009

2009-04-01 Thread Léon Brocard
2009/4/1 Dermot : >> http://londonpmtech.appspot.com/ > Is this happening on the Thursday (16th) or the Friday (17th)? Just to confirm, like the website says "London.pm technical meeting 16th April 2009". Thursdays are traditional. 10:54 <@davorg> Looks like acme is starting to suffer from the

Re: XML::LibXML and HTML (in >=v1.67)

2009-04-01 Thread Tatsuhiko Miyagawa
On Tue, Mar 31, 2009 at 10:45 PM, Toby Wintermute wrote: > The problem occurs when the html contains (the commonly used) & symbol > within attributes, such as: > > > I know that really one should escape the ampersand in those > circumstances, however real-world web-pages rarely do this.. And this

Re: XML::LibXML and HTML (in >=v1.67)

2009-04-01 Thread Dave Cross
Toby Wintermute wrote: I know that really one should escape the ampersand in those circumstances, however real-world web-pages rarely do this.. And this behaviour was tolerated in XML::LibXML 1.66, just not subsequent versions.. but eh, maybe it's just the way I'm calling the parser? Sounds li

Re: [ANNOUNCE] London.pm technical meeting about "Less code" on 16th April 2009

2009-04-01 Thread Dermot
2009/4/1 Léon Brocard : > The next technical meeting will be on the 17th April 2009 from 7pm to > 9pm (you may arrive from 6.30pm, sign in at the reception) and Is this happening on the Thursday (16th) or the Friday (17th)? Dp.

Re: Schema into diagrams

2009-04-01 Thread Steve Mynott
On Wed, Apr 01, 2009 at 08:58:22AM +0100, Barry Walsh typed: > OmniGraffle 5 switched to using GraphViz for its engine so I would hope > that it could import GraphViz files with ease (can't confirm this > because I'm still on OmniGraffle 4) It didn't work when I tried it. -- Steve Mynott

Re: XML::LibXML and HTML (in >=v1.67)

2009-04-01 Thread Peter Corlett
On Wed, Apr 01, 2009 at 04:45:28PM +1100, Toby Wintermute wrote: [...] > I know that really one should escape the ampersand in those circumstances, > however real-world web-pages rarely do this.. And this behaviour was > tolerated in XML::LibXML 1.66, just not subsequent versions.. but eh, > maybe

[ANNOUNCE] London.pm technical meeting about "Less code" on 16th April 2009

2009-04-01 Thread Léon Brocard
London Perl Mongers organises technical meetings every two months. The technical meetings are a chance to find out what has been going on in the Perl community, what techniques people are using and how Perl integrates with other software. The next technical meeting will be on the 17th April 2009 f

Re: Schema into diagrams

2009-04-01 Thread Barry Walsh
Paul Makepeace wrote: On Fri, Mar 27, 2009 at 4:58 PM, Dave Cross wrote: Paul Makepeace wrote: Do people have a favorite MySQL schema -> (ER)diagram tool? Basically a quick way of visualising a database. Ideally one that sucks out the schema from the db itself, altho' I guess a mysqldu