[PHP] Re: modify an element of a HTML tag within a string

2002-07-18 Thread Monty

Okay, this is just very general info to point you in the right direction,
but here are some tools and functions you'll probably need to accomplish
this:

  * Regular Expressions
  * preg_match() and/or eregi()
  * explode() and implode()
  * str_replace()

Regular expressions will probably be the most important part of doing what
you need to do. If you don't already know how they work, they are hard to
grasp at first, but very useful once you understand their purpose.
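
A minimal sketch of the regular-expression route, written for current PHP
(closures and the ?? operator did not exist in 2002). The helper name
rewrite_div_ids() and its callback argument are made up for illustration.
It assumes no attribute value contains a '>' and that the ID value is either
quoted or a single unquoted word:

```php
<?php
// Hypothetical helper: rewrite the ID attribute of every opening DIV tag.
// $modify receives the old ID value and returns the new one.
function rewrite_div_ids($html, $modify)
{
    // Outer pass: match each whole opening DIV tag, case-insensitively.
    return preg_replace_callback(
        '/<div\b[^>]*>/i',
        function ($tag) use ($modify) {
            // Inner pass: match ID="foo", ID='foo' or ID=foo inside the tag.
            return preg_replace_callback(
                '/(\sid\s*=\s*)(?:"([^"]*)"|\'([^\']*)\'|([^\s>]+))/i',
                function ($m) use ($modify) {
                    // Only one of groups 2-4 carries the value.
                    $old = $m[4] ?? $m[3] ?? $m[2];
                    // Always write the new value back double-quoted.
                    return $m[1] . '"' . htmlspecialchars($modify($old)) . '"';
                },
                $tag[0]
            );
        },
        $html
    );
}

$html = '<DIV ID=foo>a</DIV><div class="x" id=\'bar\'>b</div>';
echo rewrite_div_ids($html, 'strtoupper'), "\n";
// prints: <DIV ID="FOO">a</DIV><div class="x" id="BAR">b</div>
```

Tags without an ID attribute pass through unchanged, and everything outside
the opening DIV tags is left exactly as it was.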

Monty


 From: [EMAIL PROTECTED] (Justin French)
 Newsgroups: php.general
 Date: Fri, 19 Jul 2002 13:50:08 +1000
 To: php [EMAIL PROTECTED]
 Subject: modify an element of a HTML tag within a string
 
 Hi all,
 
 I've asked similar questions before, but I think I'm finally asking the
 *right* question to get the right answer.
 
 I'm looking for some suggestions on the best method of parsing a HTML document
 (or part thereof), with the view of CAPTURING and MODIFYING a specific
 element of a specific tag.
 
 something like:
 
 1. look for a given tag eg DIV
 2. capture the tag (everything from '<DIV' up to the '>')
 3. look for a given attribute (eg ID=foo, ID="foo", ID='foo' -- all valid
 ways)
 4. capture it
 5. be given the opportunity to manipulate the attribute's value, delete it,
 etc
 6. place captured tag (complete with modified elements) back into the string
 in its original position
 7. return to step 1, looking for the next occurrence of a DIV tag
 
 
 I really don't know where to start.  I REALLY don't expect someone to write
 this for me, just some guidance would be great -- or maybe some inspiration
 :)
 


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php




[PHP] Re: modify an element of a HTML tag within a string

2002-07-18 Thread Richard Lynch

 I've asked similar questions before, but I think I'm finally asking the
 *right* question to get the right answer.

That's often the tricky part :-)

 I'm looking for some suggestions on the best method of parsing a HTML document
 (or part thereof), with the view of CAPTURING and MODIFYING a specific
 element of a specific tag.
 
 something like:
 
 1. look for a given tag eg DIV
 2. capture the tag (everything from '<DIV' up to the '>')
 3. look for a given attribute (eg ID=foo, ID="foo", ID='foo' -- all valid
 ways)
 4. capture it
 5. be given the opportunity to manipulate the attribute's value, delete it,
 etc
 6. place captured tag (complete with modified elements) back into the string
 in its original position
 7. return to step 1, looking for the next occurrence of a DIV tag

If you are only looking for a SPECIFIC tag, you just simplified life
immensely!

<?php
  # Get some beautiful sample HTML:
  $html = file('http://php.net/') or die("Could not open php.net");
  $html = implode('', $html);
  
  # Find the DIV tag (stristr() is case-insensitive):
  $div = stristr($html, '<div');
  if ($div === false) die("No DIV tag found");
  $divpos = strlen($html) - strlen($div);
  
  # Break the HTML up into before and after DIV tag:
  $before_div = substr($html, 0, $divpos);
  $after_div = substr($html, $divpos);
  
  # Find the *END* of the DIV tag:
  # KNOWN BUG:
  # They *could* bury a > in their attributes if they work at it...
  $end_tag = strstr($after_div, '>');
  $endpos = strlen($after_div) - strlen($end_tag);
  # The +1 keeps the closing > with the tag:
  $div = substr($after_div, 0, $endpos + 1);
  
  # Now get the after part to *really* be after the *WHOLE* DIV tag:
  $after_div = substr($after_div, $endpos + 1);
  
  echo "Before DIV tag: <BR>", htmlentities($before_div), "<HR>\n";
  echo "DIV tag itself: <BR>", htmlentities($div), "<HR>\n";
  echo "After DIV tag:  <BR>", htmlentities($after_div), "<HR>\n";
?>

I can pretty much guarantee that I didn't put a +1 or -1 somewhere where it
belongs in the substr() function calls.  I never get that right in my first
pass of coding.  You'll have to fine-tune that part yourself.

But you can now use pretty much the same technique to search inside of $div
for the ID attribute.
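
A rough sketch of that second pass, using only string functions. The helper
name get_id_value() is hypothetical, and stripos() is from later PHP versions
than this thread. It assumes the attribute is written with no whitespace
around the '=' and that the text ' id=' does not also appear inside some
other attribute's value:

```php
<?php
// Hypothetical helper: return the ID value found in one tag, or null.
function get_id_value($tag)
{
    // Locate ' id=' case-insensitively; stripos() returns false if absent.
    $pos = stripos($tag, ' id=');
    if ($pos === false) {
        return null;
    }
    $rest = substr($tag, $pos + 4);     // everything after ' id='
    $first = substr($rest, 0, 1);
    if ($first === '"' || $first === "'") {
        // Quoted value: read up to the matching closing quote.
        $end = strpos($rest, $first, 1);
        return $end === false ? null : substr($rest, 1, $end - 1);
    }
    // Unquoted value: read up to whitespace or the closing >.
    return substr($rest, 0, strcspn($rest, " \t\r\n>"));
}

var_dump(get_id_value('<DIV ID="foo" CLASS=bar>'));
var_dump(get_id_value('<div id=foo>'));
var_dump(get_id_value('<div class="x">'));
```

Once the value's position is known, the same substr() splitting used on the
whole page can cut the tag into before, value, and after pieces so the value
can be changed and the tag glued back together.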

The solution might be a helluva lot more complex, or may be OOP based.


 Any inspiration/links/words of wisdom?

If you need to do this for any arbitrary tag all at once, there *HAVE* to be
PHP-based HTML parsers out there in the various PHP script archives...
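
For what it's worth, current PHP ships such a parser in its DOM extension,
which did not exist when this thread was written. A minimal sketch, assuming
that extension is enabled; the sample markup and the 'new-' prefix are made
up for illustration:

```php
<?php
// Sketch using PHP's bundled DOM extension (class DOMDocument).
$html = '<div id="foo">one</div><p>text</p><DIV ID=bar>two</DIV>';

$doc = new DOMDocument();
// Real-world HTML is rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Visit every DIV element and rewrite its ID attribute if it has one.
foreach ($doc->getElementsByTagName('div') as $div) {
    if ($div->hasAttribute('id')) {
        $div->setAttribute('id', 'new-' . $div->getAttribute('id'));
    }
}

$out = $doc->saveHTML();
echo $out;
```

The trade-off is that saveHTML() re-serializes the whole document, so the
original letter case, quoting and whitespace of the tags are not preserved
the way they are with the substr() approach.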

If all else fails, the PHP source for http://php.net/strip_tags must have
some kind of HTML parsing routine in it.

-- 
Like Music?  http://l-i-e.com/artists.htm

