[EMAIL PROTECTED] wrote:
> Hi, I am looking for an HTML parser that can parse a given page into
> a DOM tree and can reconstruct the exact original HTML source.
> Strictly speaking, I should be able to retrieve the original
> source at each internal node of the DOM tree.
> I have tried
On 2008-01-23, kliu <[EMAIL PROTECTED]> wrote:
> On Jan 23, 7:39 pm, "A.T.Hofkamp" <[EMAIL PROTECTED]> wrote:
>> On 2008-01-23, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>>
>> > Hi, I am looking for an HTML parser that can parse a given page into
>> > a DOM tree, and can reconstruct the exact
Hi,
kliu wrote:
> what I really need is the mapping between each DOM node and
> the corresponding original source segment.
I don't think that will be easy to achieve. You could get away with a parser
that provides access to the position of an element in the source, and then map
changes back into
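
To illustrate the position-based idea above, here is a minimal sketch using
html.parser from the Python standard library. HTMLParser.getpos() reports the
(line, column) at which the current tag starts, and that can be converted into
an index into the original string. The names SpanRecorder and element_spans
are made up for this example, and the sketch assumes that the whole page is
fed in one call and that end tags contain no stray ">" characters. It is not
a full DOM, only a node-to-source mapping.

```python
from html.parser import HTMLParser


class SpanRecorder(HTMLParser):
    """Hypothetical helper: record where each element sits in the source."""

    def __init__(self, source):
        super().__init__()
        self.source = source
        # Index at which each line begins; HTMLParser counts lines by "\n".
        self.line_starts = [0]
        for i, ch in enumerate(source):
            if ch == "\n":
                self.line_starts.append(i + 1)
        self.open = []   # stack of (tag, start_index) for unclosed elements
        self.spans = []  # (tag, start_index, end_index) for closed elements

    def _index(self):
        # getpos() gives a 1-based line number and a 0-based column.
        lineno, col = self.getpos()
        return self.line_starts[lineno - 1] + col

    def handle_starttag(self, tag, attrs):
        self.open.append((tag, self._index()))

    def handle_endtag(self, tag):
        # The end tag begins at the current position; it ends at the next ">".
        end = self.source.index(">", self._index()) + 1
        # Close the nearest matching open element; anything left unclosed
        # above it (for example a bare <br>) is dropped without a span.
        for i in range(len(self.open) - 1, -1, -1):
            if self.open[i][0] == tag:
                start = self.open[i][1]
                del self.open[i:]
                self.spans.append((tag, start, end))
                break


def element_spans(source):
    """Return (tag, exact original text) pairs, in the order elements close."""
    recorder = SpanRecorder(source)
    recorder.feed(source)
    recorder.close()
    return [(tag, source[s:e]) for tag, s, e in recorder.spans]
```

Because the spans are plain offsets into the untouched original string, each
slice is byte-for-byte what the author wrote, including whitespace and
attribute quoting. Mapping edits back into the source would still need extra
bookkeeping on top of this.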
On 23 Jan, 14:20, kliu <[EMAIL PROTECTED]> wrote:
>
> Thank you for your reply, but what I really need is the mapping between
> each DOM node and the corresponding original source segment.
At the risk of promoting unfashionable DOM technologies, you can at
least serialise fragments of the DOM in li
On Jan 23, 7:39 pm, "A.T.Hofkamp" <[EMAIL PROTECTED]> wrote:
> On 2008-01-23, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> > Hi, I am looking for an HTML parser that can parse a given page into
> > a DOM tree and can reconstruct the exact original HTML source.
>
> Why not keep a copy of th
On 2008-01-23, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Hi, I am looking for an HTML parser that can parse a given page into
> a DOM tree and can reconstruct the exact original HTML source.
Why not keep a copy of the original data instead?
That would be VERY MUCH SIMPLER than trying to
Hi, I am looking for an HTML parser that can parse a given page into
a DOM tree and can reconstruct the exact original HTML source.
Strictly speaking, I should be able to retrieve the original
source at each internal node of the DOM tree.
I have tried Beautiful Soup, which is really nice