On Sat, Nov 29, 2008 at 20:50, Canol Gökel <[EMAIL PROTECTED]> wrote:
> Chas. Owens <chas.owens <at> gmail.com> writes:
>
>>
>> On Sat, Nov 29, 2008 at 20:02, Canol Gökel <canol <at> canol.info> wrote:
>
>> Can't be done*.  You need a parser.
>
> snip
>
> It can be done.
snip

Note that asterisk.

snip
>> Learn to write a parser instead.  This is like asking how to build a
>> bridge for people out of toothpicks.  It can be done (but I wouldn't
>> walk on it), but it is a waste of your time.
>
> snip
>
> I don't want to parse a language, I want to match just "a tag". Do you think,
> for example, forum scripts do language parsing for their bbcode?
snip

Yes, if they are any good.  I think you are overestimating the effort
of writing/using a parser.  If you had any interest in Perl as the
language I would point you to Parse::RecDescent, but I am sure
whatever language you are using will have something similar available
to it.  If you are trying to deal with an existing format I would
strongly advise you to use an existing parser.
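
To give a rough sense of the effort involved, here is a minimal sketch
of a stack-based parser for a bbcode-like syntax.  It is written in
Python purely for illustration.  The tag syntax (lowercase names, no
attributes) and the names TOKEN and parse_bbcode are assumptions made
for this example; this is not how any particular forum package does it.

```python
import re

# Hypothetical token pattern: [name] opens a tag, [/name] closes it.
TOKEN = re.compile(r"\[(/?)([a-z]+)\]")

def parse_bbcode(text):
    """Return a tree: plain strings for text, (tag, children) for tags."""
    root = []
    stack = [("", root)]  # (open tag name, list receiving its children)
    pos = 0
    for m in TOKEN.finditer(text):
        if m.start() > pos:
            # Text between tags belongs to the innermost open tag.
            stack[-1][1].append(text[pos:m.start()])
        closing, name = m.group(1), m.group(2)
        if not closing:
            children = []
            stack[-1][1].append((name, children))
            stack.append((name, children))
        elif len(stack) > 1 and stack[-1][0] == name:
            stack.pop()
        else:
            # A closing tag that matches nothing open stays literal text.
            stack[-1][1].append(m.group(0))
        pos = m.end()
    if pos < len(text):
        stack[-1][1].append(text[pos:])
    return root

tree = parse_bbcode("[b]bold [i]both[/i][/b] plain")
# tree == [("b", ["bold ", ("i", ["both"])]), " plain"]
```

The stack is what lets nesting work, and nesting is exactly the part a
single pattern match struggles with.  Tags left open at the end of the
input simply keep whatever children they collected.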

snip
>> And yet you are asking a Perl list.  Perhaps you would be better
>> served by asking this question on a list for whatever language you are
>> using?  Perl's regexes are not standard.  They contain many extensions
>> that make this sort of thing easier, but since you are using some
>> mystery language for this mystery project we cannot help you.
>>
>
> snip
>
> I could ask this question to any language list which has regular expressions
> as a feature. I guess parsing bbcode can be done with almost any programming
> language out there and I think the logic should not be that complicated which
> requires modules/extensions etc. right? Since Perl is the most powerful
> language with RegExps I decided to ask this in this list. Also, should I ask
> what "variables" are to Pascal list just because I use Pascal?

The most power comes from using Perl's variant of regexes.  Since
other languages don't implement those extensions, and you are planning
on using a language other than Perl, you are wasting your time here.
And yes, asking about what variables are on a different language's list
is foolish.  For instance, Perl's variables are dynamic and loosely
typed, whereas Pascal's are static and strongly typed.  If you asked
here about variables and then tried to use what you were told in
Pascal you would wonder why none of your code even compiled.

snip
>> * Not 100% true, but the amount of effort you will put into trying to
>> get a regex that matches even 80% of valid, expected HTML dwarfs the
>> amount of time it takes to write a parser.  Given the fact that high
>> quality parsers exist already, there is no reason to waste your time.
>>
>
> Who said that I want to parse the whole HTML standard? I just want to match 1
> single HTML tag with no attributes or something. Just <p></p> and that's all.

Sadly, this is what many people think.  It isn't that easy.  What if
the code looks like this

<p><img src="end_p_tag.jpg" alt="</p>"></p>

Now you have to understand image tags as well as paragraph tags, and
there are countless other examples.  The cavalier "I'll just throw some
regex at it" attitude is why we have things like XSS and injection
attacks.
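
To make the failure concrete, here is a small sketch in Python (chosen
only for illustration, since the target language was never stated)
that runs a naive pattern and the standard library's html.parser over
the line above.  The names SAMPLE, NAIVE_P, naive_paragraphs and
ParagraphCollector are made up for this example.

```python
import re
from html.parser import HTMLParser

SAMPLE = '<p><img src="end_p_tag.jpg" alt="</p>"></p>'

# Naive approach: everything between <p> and the first </p> it sees.
NAIVE_P = re.compile(r"<p>(.*?)</p>", re.DOTALL)

def naive_paragraphs(html):
    return NAIVE_P.findall(html)

class ParagraphCollector(HTMLParser):
    """Records the tags found inside each <p> element."""

    def __init__(self):
        super().__init__()
        self.in_p = False
        self.paragraphs = []  # one list of (tag, attrs) per <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True
            self.paragraphs.append([])
        elif self.in_p:
            self.paragraphs[-1].append((tag, dict(attrs)))

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

collector = ParagraphCollector()
collector.feed(SAMPLE)
collector.close()

print(naive_paragraphs(SAMPLE))  # stops inside the alt attribute
print(collector.paragraphs)      # sees one <img> with both attributes
```

The pattern returns the fragment '<img src="end_p_tag.jpg" alt="',
because it treats the "</p>" inside the attribute value as the closing
tag.  The parser knows it is inside a quoted attribute and reports a
single img tag whose alt value is "</p>".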

snip
> Please before answering a post with judgments inside, read it carefully and
> try to understand what this person is asking.
snip

I understand what you are asking for and I am trying to tell you that
you are looking in the wrong place (both on this list and in regard to
using a regex for this problem).

-- 
Chas. Owens
wonkden.net
The most important skill a programmer can have is the ability to read.
