I have written a script that I find very useful day-to-day.  It checks
table structure in HTML files.  The script works, but I would
appreciate any comments, especially on how it could be better written.

Thank you,
Shawn

Code follows:
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
#!/usr/bin/perl

#This script looks for bad table
#structure in HTML pages.  When using
#server-side scripting, do not run this
#on the source page -- load the page in
#a browser, view source, and save the source
#as a file.

#This script does not take into account client-side
#scripting on the page.  For example, if a Javascript
#function writes out table elements in a loop, this
#script will report an error.  This is just a helpful
#tool for the web programmer who already knows what
#they are doing.

#Created by Shawn Milochik, Oct 31 2002.  Released
#free of any restrictions, and without any warranty.

#execution:  checktables.pl filename.html

open (IN, "<", $ARGV[0]) || die "Could not open input file $ARGV[0]: $!\n";

$line = 0;
$last = undef;

#Open each line of the file.
while (<IN>){
    #Keep track of the line number, so we
    #can tell the user which line of the HTML has
    #the problem.
    $line++;
    #Split the line on the > character.
    @data = split />/, $_;
    while (@data){
        #Take the portion of the line containing an HTML tag
        chomp($test = lc(shift(@data)));

        if (($test =~ /<t/) || ($test =~ /<\/t/)){
            #print "Tag found in $test.\n";

            $curr = undef;
            if ($test =~ "<table"){ $curr = "<table";}
            if ($test =~ "<\/table"){ $curr = "<\/table";}
            if ($test =~ "<tr"){ $curr = "<tr";}
            if ($test =~ "<\/tr"){ $curr = "<\/tr";}
            if ($test =~ "<td"){ $curr = "<td";}
            if ($test =~ "<\/td"){ $curr = "<\/td";}


            #if we found a valid table tag
            if ($curr){
                #If this is not the first
                #iteration of this block
                if ($last){
                    if ($last eq "<table"){
                        if ($curr ne "<tr"){
                            print "Line $line: found $curr instead of <tr after <table from line $lastline.\n";
                        }
                    }

                    if ($last eq "<\/table"){
                        if (($curr ne "<table") && ($curr ne "<\/td")){
                            print "Line $line: found $curr instead of <\/td or <table after <\/table from line $lastline.\n";
                        }
                    }

                    if ($last eq "<tr"){
                        if ($curr ne "<td"){
                            print "Line $line: found $curr instead of <td after $last from line $lastline.\n";
                        }
                    }

                    if ($last eq "<\/tr"){
                        if (($curr ne "<\/table") && ($curr ne "<tr")){
                            print "Line $line: found $curr instead of <tr or <\/table after $last from line $lastline.\n";
                        }
                    }

                    if ($last eq "<td"){
                        if (($curr ne "<table") && ($curr ne "<\/td")){
                            print "Line $line: found $curr instead of <table or <\/td after $last from line $lastline.\n";
                        }
                    }

                    if ($last eq "<\/td"){
                        if (($curr ne "<td") && ($curr ne "<\/tr")){
                            print "Line $line: found $curr instead of <td or <\/tr after $last from line $lastline.\n";
                        }
                    }

                    $last = $curr;
                    $lastline = $line;
                }else{
                    #First iteration, initialize
                    #$last and $lastline
                    $last = $curr;
                    $lastline = $line;
                }#close curly brace for if ($last) block
            }#close curly brace for if ($curr) block


        }#close curly brace for if (($test =~ /<t/) || ($test =~ /<\/t/)) block



    }#close curly brace for while(@data) block
}#close curly brace for while(<IN>) block

print "Check complete.\n";

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
End of Code
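
The six "if ($last eq ...)" blocks above all have the same shape, so the
allowed transitions could instead be kept in a lookup table.  What follows
is only a minimal sketch, not a drop-in replacement: the hash %allowed and
the helper check_transition are hypothetical names that do not appear in
the script above, and the entries simply mirror the rules the script
already checks.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical lookup table: for each tag, the tags that may follow it.
# The entries mirror the if blocks in the original script.
my %allowed = (
    '<table'  => [ '<tr' ],
    '</table' => [ '<table', '</td' ],
    '<tr'     => [ '<td' ],
    '</tr'    => [ '<tr', '</table' ],
    '<td'     => [ '<table', '</td' ],
    '</td'    => [ '<td', '</tr' ],
);

# Hypothetical helper: returns an error message if $curr may not
# follow $last, or an empty string if the transition is allowed.
sub check_transition {
    my ($last, $lastline, $curr, $line) = @_;
    my @ok = @{ $allowed{$last} || [] };
    return '' if grep { $_ eq $curr } @ok;
    my $expected = join ' or ', @ok;
    return "Line $line: found $curr instead of $expected after $last from line $lastline.\n";
}

# Example: a <td directly after <table is reported.
print check_transition('<table', 1, '<td', 2);
# Example: a <tr after <table is allowed, so nothing is printed.
print check_transition('<table', 1, '<tr', 2);
```

With a table like this, adding a rule for another tag (for example <th)
would mean adding hash entries rather than another if block.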




