[issue12629] HTMLParser silently stops parsing with malformed attributes

2011-11-14 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
assignee:  -> ezio.melotti

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12629
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12629] HTMLParser silently stops parsing with malformed attributes

2011-11-14 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 3c3009f63700 by Ezio Melotti in branch '2.7':
#1745761, #755670, #13357, #12629, #1200313: improve attribute handling in 
HTMLParser.
http://hg.python.org/cpython/rev/3c3009f63700

New changeset 16ed15ff0d7c by Ezio Melotti in branch '3.2':
#1745761, #755670, #13357, #12629, #1200313: improve attribute handling in 
HTMLParser.
http://hg.python.org/cpython/rev/16ed15ff0d7c

New changeset 426f7a2b1826 by Ezio Melotti in branch 'default':
#1745761, #755670, #13357, #12629, #1200313: merge with 3.2.
http://hg.python.org/cpython/rev/426f7a2b1826

--
nosy: +python-dev




[issue12629] HTMLParser silently stops parsing with malformed attributes

2011-11-14 Thread Ezio Melotti

Ezio Melotti ezio.melo...@gmail.com added the comment:

Fixed, thanks for the report!
Apparently the correct way to parse <y z=""o"" /> is:
starttag y
attribute z with value ""
attribute o"" with no value
So this is what HTMLParser does now.
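
For readers who want to check this on their own interpreter, here is a minimal sketch, assuming Python 3 and the standard html.parser module. EventCollector and collect_events are hypothetical names introduced for illustration; they are not part of the changesets.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: records each parser callback as a tuple."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("starttag", tag, attrs))

    def handle_startendtag(self, tag, attrs):
        # Called for self-closing tags such as <y ... />.
        self.events.append(("startendtag", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("endtag", tag))


def collect_events(markup):
    """Feed the markup, close the parser, and return the recorded events."""
    parser = EventCollector()
    parser.feed(markup)
    parser.close()
    return parser.events


# The malformed input from the original report.
events = collect_events('<x><y z=""o"" /></x>')
for event in events:
    print(event)
```

On a current CPython this should report the y tag with the attributes ('z', '') and ('o""', None), followed by the closing x tag, which matches the description above.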

--
resolution:  -> fixed
stage: needs patch -> committed/rejected
status: open -> closed
versions: +Python 2.7




[issue12629] HTMLParser silently stops parsing with malformed attributes

2011-11-02 Thread Éric Araujo

Éric Araujo mer...@netwok.org added the comment:

> This is what Firefox seems to do.
I think more confidence would be good.  Doesn’t the HTML5 spec define that?  
Have you found their test suite?  Do you have more than one browser known to be 
compliant (trick: not sure there is even one)?

--




[issue12629] HTMLParser silently stops parsing with malformed attributes

2011-11-02 Thread Ezio Melotti

Ezio Melotti ezio.melo...@gmail.com added the comment:

I haven't found anything in the HTML5 spec, but I haven't looked closely.
I'll do some more research when I start working on an actual patch.

--




[issue12629] HTMLParser silently stops parsing with malformed attributes

2011-11-01 Thread Ezio Melotti

Ezio Melotti ezio.melo...@gmail.com added the comment:

I think <x><y z=""o"" /></x> should be parsed as <x><y z="" /></x>, and the o""
should be ignored.
<x><y z="""" /></x> should be parsed as <x><y z="" /></x>, and the last two ""
should be ignored.  This is what Firefox seems to do.

Currently the parser doesn't seem to handle extraneous data in the start tag 
too well, because the locatestarttagend_tolerant regex looks for (more or less) 
well-formed attributes.
Attached a patch for test_htmlparser with the two examples provided by Kevin.

--
keywords: +patch
nosy: +ezio.melotti
stage:  -> needs patch
Added file: http://bugs.python.org/file23579/issue12629.diff




[issue12629] HTMLParser silently stops parsing with malformed attributes

2011-07-29 Thread Éric Araujo

Changes by Éric Araujo mer...@netwok.org:


--
nosy: +eric.araujo, r.david.murray




[issue12629] HTMLParser silently stops parsing with malformed attributes

2011-07-26 Thread Kevin Stock

Kevin Stock teo...@gmail.com added the comment:

A workaround is to call close() after feed(), which I suppose I should have 
done anyway. However, this does not resolve the issue that the two cases 
behave so differently.
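
A minimal sketch of the feed()-then-close() pattern, assuming Python 3 and the standard html.parser module. TextCollector is a hypothetical name, and the input "a &amp" is an assumption chosen only because it ends in the middle of a character reference; it is not one of the inputs from this report.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: records the text chunks the parser reports."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


parser = TextCollector()
# The input ends mid-reference, so the parser may hold it back in case
# more data is still to come.
parser.feed("a &amp")
# close() signals end of input, so any buffered data is processed now.
parser.close()
text = "".join(parser.chunks)
print(repr(text))
```

Calling close() tells the parser that no more data will arrive, so anything still buffered is handled rather than silently left pending.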

The code that causes the difference is lines 351-355 of parser.py, which also 
has a misleading comment stating it detects the / in a /> ending (which is 
actually done at line 334).

--




[issue12629] HTMLParser silently stops parsing with malformed attributes

2011-07-24 Thread Kevin Stock

New submission from Kevin Stock teo...@gmail.com:

Given the input '<x><y z=""o"" /></x>', HTMLParser only detects the opening x 
tag, and then stops parsing. Ideally this should behave like the case 
'<x><y z="""" /></x>', which raises an error and then can continue parsing the 
close x tag.

--
components: Library (Lib)
files: test.py
messages: 141051
nosy: teoryn
priority: normal
severity: normal
status: open
title: HTMLParser silently stops parsing with malformed attributes
type: behavior
versions: Python 3.2, Python 3.3
Added file: http://bugs.python.org/file22745/test.py
