* and then Jay Blanchard declared....
> Nope, not enough info to spot the problem. I could point you in a

Spare me the sarcasm. Here's the code if anyone can help, thanks.

<?php
$xml=file('http://www.weblogs.com/changes.xml');
$xml=implode("\n", $xml);

// Pattern match string
$urlpattern = '/((http|https|ftp):\/\/|www)[a-z0-9\-\._]+\/?[a-z0-9_\.\-\?\+\/~=&#;,]*[a-z0-9\/]{1}/si';



$p=xml_parser_create();
xml_parse_into_struct($p, $xml, $vals, $index);
xml_parser_free($p);

print("<h1>Starting</h1>");
print("<ol>");
foreach($vals as $key => $val) {
    /* if it's the correct tag, do stuff... */
    if($val['tag']=="WEBLOG" && $val['attributes']['WHEN']<120) {

        /* print the url so I can keep track */
        print("<li>".$val['attributes']['URL']."</li>\n");

        /* Changed this from file() to fopen() to see... */
        $pagehandle=fopen($val['attributes']['URL'],'r');
        $page=fread($pagehandle,30000);
        fclose($pagehandle);

        preg_match_all($urlpattern, $page, $matches);
        unset($page);

        foreach($matches[0] as $pageurl) {
            $parsedurl=parse_url($pageurl);
            if($parsedurl['host']=='amazon') {
                $urlfile=fopen('urls.txt',"a");
                fwrite($urlfile,$pageurl."\n");
                fclose($urlfile);
            }
        }
        unset($matches);
        unset($pageurl);
        unset($parsedurl);
    }
}
print("</ol>");
print("<h1>Done!</h1>");

?>
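
One thing that may be worth checking in the script above: parse_url() returns the whole host name (for example www.amazon.com), so comparing it to the bare string 'amazon' would probably never be true. The pattern also matches links that start with "www" and have no scheme, and for those parse_url() reports no host at all. Below is a minimal sketch of a looser check, assuming the goal is to catch any host that has "amazon" as one of its dot-separated labels. host_has_amazon_label() is a hypothetical helper name, not something from the original script.

```php
<?php
// Hypothetical helper (not part of the original script): returns true
// when one of the dot-separated labels of the URL's host is "amazon".
function host_has_amazon_label($url) {
    // Bare "www.example.com/..." links have no scheme, and parse_url()
    // would treat them as a path, so add a scheme before parsing.
    if (!preg_match('/^[a-z][a-z0-9+.\-]*:\/\//i', $url)) {
        $url = 'http://' . $url;
    }

    // Ask parse_url() for the host component only.
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // malformed URL or no host present
    }

    // "www.amazon.co.uk" becomes array('www', 'amazon', 'co', 'uk').
    $labels = explode('.', strtolower($host));
    return in_array('amazon', $labels, true);
}
```

In the inner loop, the test `$parsedurl['host']=='amazon'` could then be swapped for `host_has_amazon_label($pageurl)`. Matching on a whole label rather than a substring is a deliberate choice here, so that a host such as amazonia.example.org is not counted.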

-- 
Nick W

-- 
PHP General Mailing List (http://www.php.net/)
