Re: Trying to block out BGCOLOR

2003-03-25 Thread Todd W

[EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
> I'm making something, and need to block out BGCOLOR attribute. The problem
> is, the BGCOLOR could be with or without quotation marks. This is the code I
> used:
>
> $article =~ s/ bgcolor=("?)(.*?)("?)//gi


Here is how I would do it, using SAX with a helper module called
XML::SAX::Machines:

package Skip::BGCOLOR;

use strict;
use XML::SAX::Base;
use vars qw/@ISA/;
@ISA = qw/XML::SAX::Base/;

sub start_element {
  my($self, $el) = @_;

  # attribute keys are in "{NamespaceURI}LocalName" form, hence the $ anchor
  for my $property ( keys %{ $el->{Attributes} } ) {
    if ($property =~ /BGCOLOR$/i ) {
      delete( ${ $el->{Attributes} }{$property} );
    }
  }

  $self->SUPER::start_element( $el ); # forward the element downstream
}

sub xml_decl { }
1;

package main;
use strict;

use XML::SAX::Machines qw/:all/;

my($pipeline) = Pipeline(
  'Skip::BGCOLOR' =>
  \*STDOUT
);

$pipeline->parse_string( join('', <DATA>) );

print("\n");

__DATA__
<html>
  <head>
    <title>No BGCOLORs</title>
  </head>
  <body bgcolor="red">
    <h1 bgcolor="white">No BGCOLORs</h1>
    <hr width="75%" />
    <div BGCOLOR="blue">No BGCOLORs</div>
  </body>
</html>

Much cleaner and it guarantees to not fudge up your markup.

Todd W.



-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Trying to block out BGCOLOR

2003-03-21 Thread Li Ngok Lam
What if the user writes:

<body
text=#123456
bgcolor=#aabbcc>

or
<body bgcolor='#123456'>
or
<body bgcolor=
"red">

Anyway, the bgcolor can be set or changed again via javascript or CSS.
I mean, blocking bgcolor in the body tag cannot solve your potential problem.
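
A minimal sketch of a substitution that copes with the tag variants listed above: line breaks or spaces around the =, single quotes, double quotes, or no quotes at all. The sub name strip_bgcolor is made up for illustration, and like any regex approach it only sees attributes written literally in the markup, not colours set from CSS or javascript.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove bgcolor attributes
# whether the value is double-quoted, single-quoted, or bare.
sub strip_bgcolor {
    my ($html) = @_;
    $html =~ s{
        \s+ bgcolor \s* = \s*     # attribute name, white space may include newlines
        (?: "[^"]*"               # double-quoted value
          | '[^']*'               # single-quoted value
          | [^\s>]+               # bare value
        )
    }{}gix;
    return $html;
}

print strip_bgcolor(qq{<body\ntext="#123456"\nbgcolor="#aabbcc">}), "\n";
print strip_bgcolor(q{<body bgcolor='#123456'>}), "\n";
print strip_bgcolor(qq{<body bgcolor=\n"red">}), "\n";
```

The quoted alternatives come first so that a quoted value containing spaces is consumed as a whole before the bare-value branch is tried.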

But you may find some way to put this in your body tag:
background="white_block.jpg",
since a wallpaper image is drawn over the bgcolor, or use javascript:
document.bgColor='ff'; // not sure if this runs on NS too

As for the Perl way, I can't provide any code here because I don't know when
you want to block that bgcolor: at print time, or at the html file's landing
time?

Anyway, if you just don't want your users to use bgcolor in the body tag,
simply do $line =~ s/bgcolor/whatever_you_like/;

When the browser meets an attribute that is not in its list of known
properties, it ignores it. I mean, don't worry about the RHS of the =, only
the LHS, unless you are trying to conform to W3C's html standard.

Regards,
Perl Beginner




Re: Trying to block out BGCOLOR

2003-03-21 Thread Jimmy George
Hello World

Li Ngok Lam's approach looks good to me. The $line =~ s/// approach
appears to remove only the word bgcolor correctly, but could get
stuck on the different types of colour descriptor used. Is it RGB, hex
or a word?

Putting a background color descriptor in though allows you to change the
image to a white or transparent gif file quite simply. You can still use
the default background where needed.

JimmyG




Re: Trying to block out BGCOLOR

2003-03-21 Thread drieux
volks,

brief prefix. I believe Li Ngok Lam has found a clear
'issue' in the original request for solving a regex problem.
my working assumption was that the OP needed a filter that
would clean up a bunch of pre-existing static *.html files
because the site had adopted a new scheme, and so these older
pages would merely need to be 'cleaned'
But since some here may also have scratched their heads at the
original request, let's step aside for a moment and look at some of the
issues

On Friday, Mar 21, 2003, at 09:05 US/Pacific, Li Ngok Lam wrote:
[..]
> Anyway, the bgcolor can be formed or change again via javascript or CSS.
> I mean, blocking bgcolor in body tag cannot solve your potential problem.

This of course is the 'critical kill' in the OP's problem.
In terms of trying to 'control it all' from some CGI script
that is 'generating' web pages given various 'input streams'.
{ hey, we all started some place. And figured out our better
ways along the way... }
Let's deal with the CSS/SSI side plays first, as the
javascript side is modestly easier to solve.
There are CSS as well as various SSI directives, which,
were we to seek completeness would require that a much
more complex parser be in play, since it would need to
deal with each of them in turn - and DOING the 'resolve
in place' - eg given
<head>
<meta http-equiv="content-type" content="text/html;charset=ISO-8859-1">
<title>Welcome</title>
<link href="../CSS/sitewide.css" rel="stylesheet" media="screen">
</head>
the parser would need to grot through the *.css file and
resolve if there is any bgcolor components, if clean,
let it stay, otherwise that part of the text would need
to be reconstructed and pushed into the data stream:
<html><head><title> Welcome </title>
<style>
<!--
body  { font-family: Arial, Helvetica, Geneva, Swiss, SunSans-Regular }
p  { font-size: 12px; font-family: Arial, Helvetica, Geneva, Swiss, SunSans-Regular }
td   { font-size: 12px; font-family: Times New Roman, Georgia, Times }
element { }
//-->
</style>
</head>
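
As a rough sketch of the 'grot through the *.css file' step: in CSS the bgcolor equivalents are the background and background-color properties, so a filter could drop those declarations from the stylesheet text before it is inlined. The sub name un_colour_css and the sample stylesheet are assumptions made up for this example, and the pattern deliberately ignores other background-* properties.

```perl
use strict;
use warnings;

# Hypothetical helper: drop background / background-color / background-image
# declarations from a chunk of CSS text.
sub un_colour_css {
    my ($css) = @_;
    # property name, colon, then everything up to the next ; or }
    $css =~ s/\bbackground(?:-color|-image)?\s*:[^;}]*;?\s*//gi;
    return $css;
}

my $css = 'body { font-family: Arial; background-color: #aabbcc; } p { background: url(x.gif) }';
print un_colour_css($css), "\n";
```

A real stylesheet can hide colours in comments, @import rules and shorthand properties, so this is only a starting point rather than the complete parser described above.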

We of course would not need to put the static 'content-type'
in a dynamic stream back to the web browser, since as a
perl CGI script, we of course need to send out the

	print "Content-Type: text/html;charset=ISO-8859-1 $CR$LF"

anyway, right???

> But you may find someway to put this in your body tag :
> background="white_block.jpg",

while we are proposing the idea of replacing, it is
important to remember that the 'background' attribute is
'acceptable' in more than just a body tag... But you
probably would not want to ship a src such as a jpg file
in the process if all you really want to do is redefine
to say white eg:

	bgcolor="#ff"

the RegEx I proposed, with 'background' swapped in for 'bgcolor',
would of course remove the string

	background="white_block.jpg"

from any 'input' provided since it really does not
care about whether those are alpha-numeric, or not,
since it was designed to remove the stuff after the =
as it were...

> as wallpaper goes upper than bgcolor or using javascript :
> document.bgColor='ff'; // not sure if this run on NS too
[..]

this part of the problem is where one needs to expand the
RegEx as well, so that one deals with the possible contamination
in a javascript element, most likely triggered by the 'onload'...
But the 'patterns'

document.bgColor
document.background
etc, could likewise be 'targeted' for conversion, on the
fly, and/or 'in place' with the same type of filtering
with an appropriate RegEx.

The trick in those cases of course is that javascript
allows white space on either side of the = so one is
looking at the problem of

	$line =~ s/document\.bgColor\s*=\s*(["']?)([^"'\s]+)(["']?)\s*(;?)//gi ;

in this case, since single or double quotes would be possible
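
For illustration, a sketch of a variant of that filter run over a few inline script fragments. The sub name un_script_colour and the sample inputs are assumptions made up for this example; the dot is escaped so it only matches a literal '.', and ';' is added to the excluded class so an unquoted value does not swallow the statement terminator.

```perl
use strict;
use warnings;

# Hypothetical helper: remove assignments to document.bgColor.
sub un_script_colour {
    my ($line) = @_;
    # optional quote, the value, optional quote, optional trailing semicolon
    $line =~ s/document\.bgColor\s*=\s*(["']?)([^"'\s;]+)(["']?)\s*(;?)//gi;
    return $line;
}

for my $js ( q{document.bgColor='red';},
             q{document.bgColor = "blue" ; alert(1);},
             q{document.bgColor=red;} ) {
    print "[", un_script_colour($js), "]\n";
}
```

The same shape should work for document.background by swapping the property name, though colours assigned through variables or function calls will slip past it.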

HTH.





Re: Trying to block out BGCOLOR

2003-03-21 Thread WilliamGunther
Just so everyone knows, it was for a print friendly part of a CMS-type 
script. With all your help, it was solved, with a regex. It wasn't just for 
the body tag, It is for EVERY tag, and I blocked the BGCOLOR, BACKGROUND, 
STYLE, CLASS, ID, COLOR, and more attributes to totally make the page both 
dull and print friendly. My problem was with my Regex, which was:
$blah =~ s/ bgcolor=("?)(.*?)("?)//gi

Shortly after posting, I solved it myself with
$blah =~ s/ bgcolor=("?)(.*?)( |>)/$4/gi;

I doubt that would have held up. My new one thanks to drieux is:
$blah =~ s/\s*bgcolor=("?)([^"\s>]+)("?)//gi ;
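
A sketch of how that pattern might be generalised to the whole list of attributes mentioned above. The attribute list, the sub name make_print_friendly and the sample tag are illustrative assumptions, not the code actually used in the CMS; the value alternatives also accept quoted values that contain spaces (common in STYLE), which the [^"\s>]+ class does not.

```perl
use strict;
use warnings;

# Assumed list, taken from the attributes named in this message.
my @blocked = qw(bgcolor background style class id color);
my $names   = join '|', @blocked;

# Hypothetical helper: strip every blocked attribute from every tag.
sub make_print_friendly {
    my ($html) = @_;
    $html =~ s/\s+(?:$names)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)//gi;
    return $html;
}

print make_print_friendly(
    q{<td id="x1" class=hot width="75%" style="color: red" bgcolor='#ff0000'>text</td>}
), "\n";
```

Requiring white space before the name keeps 'id' from matching inside 'width', but the pattern can still hit text outside tags that happens to look like name=value.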

Thank you for your help everyone.

William Gunther


RE: Trying to block out BGCOLOR

2003-03-20 Thread Kipp, James

 
> I'm making something, and need to block out BGCOLOR attribute. The problem
> is, the BGCOLOR could be with or without quotation marks. This is the code I
> used:
>
> $article =~ s/ bgcolor=("?)(.*?)("?)//gi

so you are saying it could be bgcolor or "bgcolor" ?
how about something simple like:
$article =~ s/bgcolor|\"bgcolor\"//gi;





Re: Trying to block out BGCOLOR

2003-03-20 Thread drieux
On Thursday, Mar 20, 2003, at 06:00 US/Pacific, Kipp, James wrote:

>> I'm making something, and need to block out BGCOLOR
>> attribute. The problem
>> is, the BGCOLOR could be with or without quotation marks.
>> This is the code I
>> used:
>> $article =~ s/ bgcolor=("?)(.*?)("?)//gi
>
> so you are saying it could be bgcolor or "bgcolor" ?
> how about something simple like:
> $article =~ s/bgcolor|\"bgcolor\"//gi;

no, the problem is on the other side of the = token

eg:
<body bgcolor="#99">
or
<body bgcolor="red">
or
<body bgcolor=red>

and he would like to make that

	<body>

I would of course go with say:

#
#
sub un_colour {
my ($line) = @_;

$line =~ s/\s*bgcolor=("?)([^"\s>]+)("?)//gi ;
$line;
} # end of un_colour

since the middle element needs to guard against

a. "
b. >
c. white space

ciao
drieux
---


my $l1 = '<body bgcolor="#99" other="fred">
stuff here
<table bgcolor="blue">
';
my $l2 = '<body bgcolor="red" other="fred">';
my $l3 = '<body bgcolor=red other=fred>';

foreach my $tag ( $l1 , $l2 , $l3 )
{

my $answer =  un_colour($tag);

print "#---\n$answer\nfor $tag \n";
}

#
#
sub un_colour {
my ($line) = @_;

$line =~ s/\s*bgcolor=("?)([^"\s>]+)("?)//gi ;

$line;

} # end of un_colour


Re: Trying to block out BGCOLOR

2003-03-20 Thread WilliamGunther
I'm saying it could be bgcolor="COLOR" or bgcolor=COLOR


RE: Trying to block out BGCOLOR

2003-03-20 Thread Kipp, James
> I'm saying it could be bgcolor="COLOR" or bgcolor=COLOR

Yes I realize. I believe drieux's solution, or an adaptation of it, is what
you need


> I would of course go with say:
>
> #
> #
> sub un_colour {
> my ($line) = @_;
>
> $line =~ s/\s*bgcolor=("?)([^"\s>]+)("?)//gi ;
>
> $line;
>  } # end of un_colour
>
>
> since the middle element needs to guard against
>
> a. "
> b. >
> c. white space
>
> ciao
> drieux
>
> ---





Re: Trying to block out BGCOLOR

2003-03-20 Thread drieux
On Thursday, Mar 20, 2003, at 11:26 US/Pacific, Kipp, James wrote:

>> I'm saying it could be bgcolor="COLOR" or bgcolor=COLOR
> Yes I realize. I believe drieux's solution, or an adaptation of it,
> is what you need

note: I do subs because it is easier for me to 'loop on them'
and if they are worth it, they get stuffed in a perl module somewhere...

[..]
#
#
sub un_colour {
my ($line) = @_;

$line =~ s/\s*bgcolor=("?)([^"\s>]+)("?)//gi ;
$line;
 } # end of un_colour

the usage would be

	my $new_html_text = un_colour($html_text);

Or you could just use the line itself.

If it helps to break out the sequence (the trailing x flag is what
allows the white space and comments inside the pattern)

s/\s*        # zero or more white space before
  bgcolor=   # the specific text
  ("?)       # first conditional group - an optional "
  ([^"\s>]+) # middle group - anything that is not a delimiter
  ("?)       # third conditional group - an optional "
//gix

since the middle element needs to guard against

a. "
b. >
c. white space

Note that we are looking for one or more
characters of the 'class' [^"\s>] - or in english

	not "           ::   let the 3rd group grab this
	not >           ::   the end of tag token
	not white space ::   the end of attribute delimiter

since we are looking for the set of characters
that are 'not delimiters' - perchance the bass-end-akward
way of making a set

since COLOR in this context is both:

a. the sequence of alpha characters
b. a #-preceded hex numeric sequence

I figured it would be easier to NOT go with
the more complex regex that would need to note
that 'if preceded by a #, then must be numeric...'
Yeech, way too much work on that side of the trail.

The test case code had to include BOTH the >
and the white space components so that it would
correctly parse not merely the specific cases
we are concerned about - but those cases in
their 'natural environment' eg

	<body bgcolor="red" other="fred">
	<body bgcolor="red">
	<body bgcolor=red other=fred>
	<body bgcolor="#FF" other="fred">
	<body bgcolor=#ff>

remember that bgcolor is an attribute in a tag.

Or allow me to argue the defect in the initial idea

	$line =~ s/ *bgcolor=("?)(.*)("?)//gi ;

the problem is that middle group - the match zero
or more of anything... A very GREEDY GRAB - since
it would take say

	<body bgcolor="red" other="fred">

and make that

	<body

since the sequence - with the round braces
delimiting the group matches:

	/ bgcolor=(")(red" other="fred">)()/

is the most greedy grab possible: everything up to
the end of the line. Which may have
been what you were noticing in the output.
So the simplest solution appeared to be to
work out the list of things that were 'delimiters'
and then allow anything in the middle group
that was not a delimiter...
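
To make the comparison concrete, a small sketch that runs the three candidate patterns from this thread over one sample tag. The sample tag and the variable names are assumptions for illustration; the trailing comments on the print lines show the result of each substitution.

```perl
use strict;
use warnings;

my $tag = '<body bgcolor="red" other="fred">';

# greedy: .* runs to the end of the line
(my $greedy = $tag) =~ s/ *bgcolor=("?)(.*)("?)//gi;
# non-greedy: .*? is happy to match nothing at all
(my $lazy   = $tag) =~ s/ bgcolor=("?)(.*?)("?)//gi;
# negated class: the value stops at the first delimiter
(my $class  = $tag) =~ s/\s*bgcolor=("?)([^"\s>]+)("?)//gi;

print "greedy: $greedy\n";   # <body
print "lazy:   $lazy\n";     # <bodyred" other="fred">
print "class:  $class\n";    # <body other="fred">
```

Neither quantifier on '.' gives the wanted result here: the greedy form eats too much and the non-greedy form too little, which is why the negated class is the one that holds up.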
HTH...

ciao
drieux
---



Trying to block out BGCOLOR

2003-03-19 Thread WilliamGunther
I'm making something, and need to block out BGCOLOR attribute. The problem 
is, the BGCOLOR could be with or without quotation marks. This is the code I 
used:

$article =~ s/ bgcolor=("?)(.*?)("?)//gi

It doesn't work to my liking, and was hoping someone else had a better 
solution.

William