ID:               20918
 Updated by:       [EMAIL PROTECTED]
 Reported By:      [EMAIL PROTECTED]
-Status:           Open
+Status:           Feedback
 Bug Type:         Output Control
 Operating System: win32
 PHP Version:      4.3.0RC2
 New Comment:

Please provide a *short* example showing the problem,
along with the expected and the actual output.
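
For illustration only, a reduced test case of the kind requested might
look like the sketch below. It is not taken from the report: the file
name fgetss_test.html and the sample HTML are made up, and it assumes
that stripping each line with strip_tags() is a fair stand-in for the
expected result when no tag spans more than one line.

```php
<?php
// Hypothetical reduced test case; file name and sample HTML are made up.
$path = 'fgetss_test.html';
$fh = fopen($path, 'w') or die("unable to write $path");
fwrite($fh, "<html><body>\n<p>Hello <b>world</b></p>\n</body></html>\n");
fclose($fh);

// Expected: every line of the file with its tags stripped.
$expected = '';
$fh = fopen($path, 'r') or die("unable to read $path");
while (($line = fgets($fh, 1024)) !== false) {
    $expected .= strip_tags($line);
}
fclose($fh);

// Actual: the same file read with fgetss(), where that function exists
// (it was removed in PHP 8.0, so the check keeps the script runnable).
$actual = null;
if (function_exists('fgetss')) {
    $actual = '';
    $fh = fopen($path, 'r') or die("unable to read $path");
    while (($line = @fgetss($fh, 1024)) !== false) {
        $actual .= $line;
    }
    fclose($fh);
}
unlink($path);

echo "expected: ", var_export($expected, true), "\n";
echo "actual:   ", var_export($actual, true), "\n";
?>
```

If the two printed values differ on the affected version, that
difference is the expected/actual output asked for above.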


Previous Comments:
------------------------------------------------------------------------

[2002-12-10 03:01:25] [EMAIL PROTECTED]

Hello,
I wrote a script that reads a few HTML pages with the function
fgetss(), which strips away the HTML code. This works properly in
PHP 4.2.

But with PHP 4.3 something goes wrong and I get no output at all from
fgetss(). Here is the code.
I hope this is a serious problem and that reporting it is helpful for
you. Greetings, Timon

<?php
// Reads the text out of the HTML files
// and structures it for the database.

$fh = opendir("html/") or die("cant read from ./html");
$x = 0;
while($file = readdir($fh)) {
        if($file == '.' || $file == '..') continue;
        $files[$x++] = $file;
        }       
closedir($fh);

echo sizeof($files)." HTML files read...\n";

sort($files);

foreach($files as $file) {
        $fh = fopen("html/$file",'r');
        while($line = fgetss($fh,filesize("html/$file"))) {
                $raw_txt .= zeileputzen($line);
                }
        fclose($fh);
        }
        
echo sizeof($files)." files were parsed...\n";

$fh = fopen('raw.txt','w+');
fputs($fh,$raw_txt);
fclose($fh);

echo "...and written to the file <a href=\"raw.txt\">raw.txt</a>...\n";

function zeileputzen($zeile) {
        // Remove tabs, &nbsp;, link text, etc.
        $zeile = str_replace("\t",'',$zeile);
        $zeile = str_replace("nach oben",'',$zeile);
        $zeile = str_replace("&nbsp;",'',$zeile);
        $zeile = preg_replace("/^ */",'',$zeile);
        
        if(preg_match("/^LvH-Umfeld/",$zeile)) {$zeile = '';}
        if(preg_match("/^Umfeld/",$zeile)) {$zeile = '';}
        if(preg_match("/^Personen/",$zeile)) {$zeile = '';}
        if(preg_match("/^- \w/",$zeile)) {$zeile = '';}
        $zeile = preg_replace("/^\W ?/","",$zeile);
        $zeile = preg_replace("/(B: )/","\n@B: ",$zeile);
        $zeile = preg_replace("/(Br: )/","\n@Br: ",$zeile);
        $zeile = preg_replace("/(K: )/","\n@K: ",$zeile);
        
        $zeile = preg_replace("/(B: )(\n)/",'B: ',$zeile);
        $zeile = preg_replace("/(Br: )(\n)/",'Br: ',$zeile);
        $zeile = preg_replace("/(K: )(\n)/",'K: ',$zeile);
        
        return $zeile;
        }
        


$fh = fopen("raw.txt",'r') or die("unable to read from raw.txt");
$raw_txt = fread($fh,filesize('raw.txt'));
fclose($fh);

echo "raw.txt read\n";

$pieces = explode(chr(10),$raw_txt);
$f = 0;
foreach($pieces as $lines) {
        if(strlen($lines) == 0) { $f++; }
        if(strlen($lines) > 2) { $f = 0; }
        if($f > 10) { 
                $new_buffer .= '###';
                $f = 0;
                }
        echo " :: line -> $lines\n";
        $new_buffer .= $lines."\n";
        }

$f_pieces = explode('###',$new_buffer);
        
unset($new_buffer);

foreach($f_pieces as $l) {
        echo strlen($l);
        if(preg_match("/[A-Za-z0-9]/",$l)) {
                $f_lines = explode(chr(10),$l);
                $new_buffer .= "***";
                foreach($f_lines as $f) {
                        if(!strlen($f)) { continue; }
                        $f = trim($f);
                        $new_buffer .= trim($f)."\n";
                        }
                }
        }       
        
$fh = fopen('formatted.txt','w+');
fwrite($fh,$new_buffer);
fclose($fh);

echo "<a href=\"formatted.txt\">formatted.txt</a> written!\n";
echo "<a href=\"formattedtxt2mysql.php\">insert text into db...</a>\n";
?>

------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=20918&edit=1
