I have written a script that I find very useful day to day. It checks table structure in HTML files. The script works, but I would appreciate any comments, especially on how it could be better written.
Thank you,
Shawn

Code follows:
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
#!/usr/bin/perl

#This script looks for bad table
#structure in HTML pages. When using
#server-side scripting, do not run this
#on the source page -- load the page in
#a browser, view source, and save the source
#as a file.

#This script does not take into account client-side
#scripting on the page. For example, if a JavaScript
#function writes out table elements in a loop, this
#script will report an error. This is just a helpful
#tool for the web programmer who already knows what
#they are doing.

#Created by Shawn Milochik, Oct 31 2002. Released
#free of any restrictions, and without any warranty.

#execution: checktables.pl filename.html

use strict;
use warnings;

@ARGV or die "Usage: checktables.pl filename.html\n";
open(my $in, '<', $ARGV[0]) or die "Could not open input file $ARGV[0]: $!\n";

my $line = 0;
my ($last, $lastline);

#Read each line of the file.
while (<$in>){

    #Keep track of the line number, so we
    #can tell the user which line of the HTML has
    #the problem.
    $line++;

    #Split the line on the > character.
    my @data = split />/, $_;

    while (@data){

        #Take the portion of the line containing an HTML tag
        chomp(my $test = lc(shift(@data)));

        if ($test =~ /<\/?t/){

            #print "Tag found in $test.\n";

            my $curr;

            if ($test =~ /<table/){   $curr = "<table";}
            if ($test =~ /<\/table/){ $curr = "</table";}
            if ($test =~ /<tr/){      $curr = "<tr";}
            if ($test =~ /<\/tr/){    $curr = "</tr";}
            if ($test =~ /<td/){      $curr = "<td";}
            if ($test =~ /<\/td/){    $curr = "</td";}

            #if we found a valid table tag
            if ($curr){

                #If this is not the first
                #iteration of this block
                if ($last){

                    if ($last eq "<table"){
                        if ($curr ne "<tr"){
                            print "Line $line: found $curr instead of <tr after <table from line $lastline.\n";
                        }
                    }

                    if ($last eq "</table"){
                        if (($curr ne "<table") && ($curr ne "</td")){
                            print "Line $line: found $curr instead of </td or <table after </table from line $lastline.\n";
                        }
                    }

                    if ($last eq "<tr"){
                        if ($curr ne "<td"){
                            print "Line $line: found $curr instead of <td after $last from line $lastline.\n";
                        }
                    }

                    if ($last eq "</tr"){
                        if (($curr ne "</table") && ($curr ne "<tr")){
                            print "Line $line: found $curr instead of <tr or </table after $last from line $lastline.\n";
                        }
                    }

                    if ($last eq "<td"){
                        if (($curr ne "<table") && ($curr ne "</td")){
                            print "Line $line: found $curr instead of <table or </td after $last from line $lastline.\n";
                        }
                    }

                    if ($last eq "</td"){
                        if (($curr ne "<td") && ($curr ne "</tr")){
                            print "Line $line: found $curr instead of <td or </tr after $last from line $lastline.\n";
                        }
                    }

                }#close curly brace for if ($last) block

                #Remember this tag and its line number for the
                #next comparison. On the first iteration this
                #initializes both $last and $lastline.
                $last = $curr;
                $lastline = $line;

            }#close curly brace for if ($curr) block

        }#close curly brace for if ($test =~ /<\/?t/) block

    }#close curly brace for while(@data) block

}#close curly brace for while(<$in>) block

close($in);

print "Check complete.\n";
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
End of Code
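One direction I have wondered about, offered only as a rough sketch and not as a tested replacement: the chain of if tests above could be collapsed into a hash that maps each tag to the tags allowed to follow it. The names %allowed_next and check_tables are made up for this illustration and do not appear in the script. The sketch also assumes a \b word boundary in the regex, so that a tag such as <track is not mistaken for <tr. That is a small change in behavior from the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical lookup table: for each tag, the tags allowed to follow it.
# The entries mirror the if-chain in the script above.
my %allowed_next = (
    '<table'  => ['<tr'],
    '</table' => ['<table', '</td'],
    '<tr'     => ['<td'],
    '</tr'    => ['<tr', '</table'],
    '<td'     => ['<table', '</td'],
    '</td'    => ['<td', '</tr'],
);

# check_tables() is an illustrative name, not part of the original script.
# It takes the HTML as a list of lines and returns a list of messages.
sub check_tables {
    my @lines = @_;
    my @errors;
    my ($last, $lastline);
    my $line = 0;

    for my $text (@lines) {
        $line++;

        # Find every opening or closing table, tr, or td tag on the line.
        while ($text =~ m{(</?(?:table|tr|td))\b}gi) {
            my $curr = lc $1;

            # On the first tag there is nothing to compare against yet.
            my @ok = defined $last ? @{ $allowed_next{$last} } : ();
            if (@ok && !grep { $_ eq $curr } @ok) {
                my $expected = join ' or ', @ok;
                push @errors, "Line $line: found $curr instead of $expected after $last from line $lastline.";
            }

            # Remember this tag and its line for the next comparison.
            ($last, $lastline) = ($curr, $line);
        }
    }
    return @errors;
}

# Example input: the second row is missing its <td>.
my @html = (
    '<table>',
    '<tr><td>ok</td></tr>',
    '<tr>oops</tr>',
    '</table>',
);
print "$_\n" for check_tables(@html);
```

With this layout, supporting another tag would mean adding a hash entry and extending the regex, with no further if blocks. Like the script, the sketch ignores <th> and the other table-related tags.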