On Sat, Nov 29, 2008 at 20:50, Canol Gökel <[EMAIL PROTECTED]> wrote:
> Chas. Owens <chas.owens <at> gmail.com> writes:
>
>>
>> On Sat, Nov 29, 2008 at 20:02, Canol Gökel <canol <at> canol.info> wrote:
>
>> Can't be done*.  You need a parser.
>
> snip
>
> It can be done.
snip

Note that asterisk.

snip
>> Learn to write a parser instead.  This is like asking how to build a
>> bridge for people out of toothpicks.  It can be done (but I wouldn't
>> walk on it), but it is a waste of your time.
>
> snip
>
> I don't want to parse a language, I want to match just "a tag". Do you think,
> for example, forum scripts do language parsing for their bbcode?
snip

Yes, if they are any good.  I think you are overestimating the effort
of writing/using a parser.  If you had any interest in Perl as the
language, I would point you to Parse::RecDescent, but I am sure whatever
language you are using will have something similar available to it.
If you are trying to deal with an existing format, I would strongly
advise you to use an existing parser.

snip
>> And yet you are asking a Perl list.  Perhaps you would be better
>> served by asking this question on a list for whatever language you are
>> using?  Perl's regexes are not standard.  They contain many extensions
>> that make this sort of thing easier, but since you are using some
>> mystery language for this mystery project we cannot help you.
>>
>
> snip
>
> I could ask this question to any language list which has regular expressions
> as a feature. I guess parsing bbcode can be done with almost any programming
> language out there and I think the logic should not be that complicated which
> requires modules/extensions etc. right? Since Perl is the most powerful
> language with RegExps I decided to ask this in this list. Also, should I ask
> what "variables" are to Pascal list just because I use Pascal?

Most of that power comes from Perl's own variant of regexes.  Since
other languages don't implement those extensions, and you are planning
on using a language other than Perl, you are wasting your time here.
And yes, asking what variables are on a different language's list is
foolish.  For instance, Perl's variables are dynamic and loosely typed,
whereas Pascal's are static and strongly typed.
If you asked here about variables and then tried to use what you were
told in Pascal, you would wonder why none of your code even compiled.

snip
>> * Not 100% true, but the amount of effort you will put into trying to
>> get regex that matches even 80% of valid, expected HTML dwarfs the
>> amount of time it takes to write a parser.  Given the fact that high
>> quality parsers exist already, there is no reason to waste your time.
>>
>
> Who said that I want to parse the whole HTML standard? I just want to match 1
> single HTML tag with no attributes or something. Just <p></p> and that's all.

Sadly, this is what many people think.  It isn't that easy.  What if
the code looks like this

<p><img src="end_p_tag.jpg" alt="</p>"></p>

Now you have to understand image tags as well as paragraph tags, and
there are countless other examples.  The cavalier "I'll just throw some
regex at it" attitude is why we have things like XSS and injection
attacks.

snip
> Please before answering a post with judgments inside, read it carefully and
> try to understand what this person is asking.
snip

I understand what you are asking for, and I am trying to tell you that
you are looking in the wrong place (both on this list and in regard to
using a regex for this problem).

--
Chas. Owens
wonkden.net
The most important skill a programmer can have is the ability to read.
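
To illustrate the <p>/<img> example above, here is a minimal sketch.
It is written in Python only because the thread never names the target
language, so that choice is an assumption.  The helper names
(naive_p_contents, PContents, parsed_p_tags) are hypothetical and exist
only for this illustration.  The parser is the standard library's
html.parser.HTMLParser.  It is a sketch of the failure mode, not a
recommended way to sanitize markup.

```python
import re
from html.parser import HTMLParser

# The sample from the message above: the alt attribute contains "</p>".
SAMPLE = '<p><img src="end_p_tag.jpg" alt="</p>"></p>'


def naive_p_contents(html):
    """Hypothetical helper: take whatever sits between <p> and the first </p>."""
    match = re.search(r"<p>(.*?)</p>", html, re.DOTALL)
    return match.group(1) if match else None


class PContents(HTMLParser):
    """Hypothetical helper: record the tags that appear inside a <p> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # number of <p> elements currently open
        self.inner_tags = []  # (tag, attributes) pairs seen inside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.depth += 1
        elif self.depth > 0:
            self.inner_tags.append((tag, dict(attrs)))

    def handle_endtag(self, tag):
        if tag == "p" and self.depth > 0:
            self.depth -= 1


def parsed_p_tags(html):
    """Run the parser over html and return the tags found inside <p>."""
    parser = PContents()
    parser.feed(html)
    parser.close()
    return parser.inner_tags


if __name__ == "__main__":
    print(naive_p_contents(SAMPLE))
    print(parsed_p_tags(SAMPLE))
```

The regex stops at the "</p>" inside the alt attribute and returns the
truncated fragment <img src="end_p_tag.jpg" alt=" as the paragraph's
contents.  The parser knows about quoted attribute values, so it reports
a single img tag whose alt attribute is the string "</p>".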