On 03.04.2010 16:29, tedd wrote:
Hi gang:
Here's the problem.
I have 184 HTML pages in a directory and each page contains a question.
The question is noted in the HTML DOM like so:
<p class="question">
Who is Roger Rabbit?
</p>
My question is -- how can I extract the string "Who is Roger Rabbit?"
from each page using php? You see, I want to store the questions in a
database without having to re-type, or cut/paste, each one.
Now, I can extract each question by using javascript --
document.getElementById("question").innerHTML;
-- and stepping through each page, but I don't want to use javascript
for this.
I have not found/created a working example of this using PHP. I tried
using PHP's getElementById(), but that requires the target file to be
valid XML and the string to be identified by an ID, not a class.
These pages do not support either requirement.
Additionally, I realize that I can load the files and parse out what is
between the <p> tags, but I was hoping for a GetElementByClass way to
do this.
So, is there one?
Thanks,
tedd
Why don't you just use a regular expression? I don't know of an easy way
to process content that is not valid XML/XHTML, because I'm not aware of
a library that loads such markup (but correct me if I'm wrong there).
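For what it's worth, PHP's DOM extension does have DOMDocument::loadHTML(), which accepts markup that is not valid XML, and DOMXPath can then select elements by class. The following is only a minimal sketch, assuming the DOM extension is enabled and PHP 7 or later; the function name questionsByClass() and the sample markup are made up for illustration.

```php
<?php
// Hypothetical helper: collects the text of every <p> whose class list contains "question".
function questionsByClass(string $html): array
{
    $doc = new DOMDocument();
    // loadHTML() tolerates markup that is not valid XML; keep its parse warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match "question" as a whole word inside the class attribute.
    $query = '//p[contains(concat(" ", normalize-space(@class), " "), " question ")]';

    $questions = [];
    foreach ($xpath->query($query) as $node) {
        $questions[] = trim($node->textContent);
    }
    return $questions;
}

// Sample markup (assumed): unquoted attribute and an unclosed <p>, so not valid XML.
$sample = "<html><body>\n<p class=question>\nWho is Roger Rabbit?\n<p>Unclosed paragraph\n</body></html>";
print_r(questionsByClass($sample));
```

This gets close to a "GetElementByClass" without requiring the pages to be valid XML, at the cost of loading each page into a DOM tree.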
I'm no regex expert, but I think the following would do it:
/\<p\s*class\=\"question\"\s*\>(.*)\<\/p\>/
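A pattern along these lines could be used with preg_match() roughly as sketched below. Since the question sits on its own line between the tags, the sketch adds the "s" modifier so that "." also matches newlines, and uses a non-greedy ".*?" so the match stops at the first closing tag. The function name extractQuestion() and the sample markup are assumptions for illustration, and it expects PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the question text from one page's HTML, or null if none is found.
function extractQuestion(string $html): ?string
{
    // "s" lets "." match newlines; ".*?" stops at the first closing </p>.
    $pattern = '/<p\s+class="question"\s*>(.*?)<\/p>/s';
    if (preg_match($pattern, $html, $matches) === 1) {
        return trim($matches[1]);
    }
    return null;
}

// Sample markup (assumed) standing in for one of the 184 pages.
$sample = "<html><body>\n<p class=\"question\">\nWho is Roger Rabbit?\n</p>\n</body></html>";
echo extractQuestion($sample), "\n";
```

To run it over the whole directory, each file could be read with file_get_contents() inside a loop over glob('*.html') and the result stored in the database. The pattern only matches when the class attribute is written exactly as class="question", so pages that quote it differently would need a looser pattern.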
(This is my first contribution here, so I beg your pardon if something went wrong.)
Regards,
Valentin Dreismann
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php