Steven D'Aprano wrote:
> On Tue, 25 Oct 2005 06:30:35 -0700, Iain King wrote:
>
> > Steven D'Aprano wrote:
> >> On Tue, 25 Oct 2005 05:17:52 -0700, Iain King wrote:
> >>
> >> > Fredrik Lundh wrote:
> >> >> Joerg Schuster wrote:
> >> >>
> >> >> > I just want to use more than 100 capturing groups.
> >> >>
> >> >> define "more" (101, 200, 1000, 100000, ... ?)
> >> >>
> >> >> </F>
> >> >
> >> > The Zero-One-Infinity Rule:
> >> >
> >> > http://www.catb.org/~esr/jargon/html/Z/Zero-One-Infinity-Rule.html
> >>
> >> Nice in principle, not always practical. Sometimes the choice is, "do you
> >> want it today with arbitrary limits, or six months from now with bugs
> >> but no limits?"
> >>
> >> If assigning arbitrary limits prevents worse problems, well, then go for
> >> the limit. For instance, anyone who has fought browser pops ups ("close
> >> one window, and ten more open") may have wished that the browser
> >> implemented an arbitrary limit of, say, ten pop ups. Or even zero :-)
> >>
> >
> > Well, exactly. Why limit to ten? The user is either going to want to
> > see pop-ups, or not. So either limit to 0, or to infinity (and indeed,
> > this is what most browsers do).
>
> I haven't been troubled by exponentially increasing numbers of pop up
> windows for a long, long time. But consider your question "why limit to
> ten?" in a wider context.
>
> Elevators always have a weight limit: the lift will operate up to N
> kilograms, and stop at N+1. This limit is, in a sense, quite arbitrary,
> since that value of N is well below the breaking point of the elevator
> cables. Lift engineers, I'm told, use a safety factor of 10 (if the
> cable will carry X kg without breaking, set N = X/10). This safety
> factor is obviously arbitrary: a more cautious engineer might use a
> factor of 100, or even 1000, while another might choose a factor of 5 or
> 2 or even 1.1. If engineers followed your advice, they would build lifts
> that either carried nothing at all, or accepted as much weight until the
> cable stretched and snapped.
>
> Perhaps computer programmers would have fewer buffer overflow security
> exploits if they took a leaf out of engineers' book and built in a few
> more arbitrary safety factors into their data-handling routines. We can
> argue whether 256 bytes is long enough for a URL or not, but I think we
> can all agree that 3 MB for a URL is more than any person needs.
>
> When you are creating an iterative solution to a problem, the ending
> condition is not always well-specified. Trig functions such as sine and
> cosine are an easy case: although they theoretically require an infinite
> number of terms to generate an exact answer, the terms will eventually
> underflow to zero allowing us to stop the calculation.
>
> But unfortunately that isn't the case for all mathematical calculations.
> Often, the terms of our sequence do not converge to zero, due to round-off
> error. Our answer cycles backwards and forwards between two or more
> floating point approximations, e.g. 1.276805 <-> 1.276804. The developer
> must make an arbitrary choice to stop after N iterations, if the answer
> has not converged. Zero iterations is clearly pointless. One is useless.
> And infinite iterations will simply never return an answer. So an
> arbitrary choice for N is the only sensible way out.
>
> In a database, we might like to associate (say) multiple phone numbers
> with a single account. Most good databases will allow you to do that, but
> there is still the question of how to collect that information: you have
> to provide some sort of user interface. Now, perhaps you are willing to
> build some sort of web-based front-end that allows the user to add new
> fields, put their phone number in the new field, with no limit. But
> perhaps you are also collecting data using paper forms. So you make an
> arbitrary choice: leave two (or three, or ten) boxes for phone numbers.
>
> There are many other reasons why you might decide rationally to impose an
> arbitrary limit on some process -- arbitrary does not necessarily mean
> "for no good reason". Just make sure that the reason is a good one.
>
>
> --
> Steven.
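
The iteration-cap point in the quoted text can be illustrated with a rough sketch. Everything here is an assumption for illustration only: the function name newton_sqrt, the choice of Newton's method for square roots, and the default cap of 50 are not from the thread. The loop stops when two successive guesses are identical, and otherwise gives up after max_iter steps, so it still returns if round-off makes the guesses flip between two neighbouring floats.

```python
def newton_sqrt(x, max_iter=50):
    """Approximate sqrt(x) for x > 0, stopping after at most max_iter steps.

    Returns (estimate, steps_taken, converged). The cap of 50 is a
    hypothetical default chosen for this sketch, not a recommended value.
    """
    guess = x if x >= 1 else 1.0
    for step in range(1, max_iter + 1):
        new_guess = 0.5 * (guess + x / guess)
        if new_guess == guess:
            # Successive guesses agree exactly: nothing more to gain.
            return new_guess, step, True
        guess = new_guess
    # Cap reached without exact agreement (for example, the guesses are
    # alternating between two adjacent floats); return the last estimate.
    return guess, max_iter, False


root, steps, converged = newton_sqrt(2.0)
print(root, steps, converged)
```

Calling newton_sqrt(2.0, max_iter=1) shows the cap in action: one step is taken and the function reports that it did not converge.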
I think we are arguing at cross-purposes, mainly because the term
'arbitrary' has snuck in. The actual rule:

    "Allow none of foo, one of foo, or any number of foo." A rule of
    thumb for software design, which instructs one to not place random
    limits on the number of instances of a given entity.

Firstly, 'for software design'. Not for field engineers servicing
elevators :) Secondly, it's [random], not [arbitrary]. I took your use
of arbitrary to mean much the same thing - a number picked without any
real judgement involved, simply because it was deemed larger than some
assumed maximum size. The rule does not apply to a number selected for
good reason.

I don't think I get your phone record example: surely you'd have the
client record in a one-to-many relationship with the phone number
records, so there would be (theoretically) no limit?

Your web interface rang a bell though - in GMail's contacts info page,
each contact has info stored in sections. Each of these sections stores
a heading, an address, and some fields. It defaults to two fields, with
an add field button. Hitting it a lot, I found this maxed out at 20
fields per section. You can also add more sections though - I got bored
hitting the add section button once I got to 51 sections with the button
still active. I assume there is some limit to the number of sections,
but I don't know what it is :) GMail is awesome.

Anyway, back to the OP: in this specific case, the cap of 100 groups in
a RE seems random to me, so I think the rule applies.

Also, see "C Programmer's Disease":
http://www.catb.org/~esr/jargon/html/C/C-Programmers-Disease.html

Iain

--
http://mail.python.org/mailman/listinfo/python-list
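
For the one-to-many phone number layout mentioned in the reply above, a minimal sketch using Python's standard sqlite3 module might look like the following. The table names (client, phone), the column names, and the sample numbers are all made up for illustration; nothing here comes from a real schema. Storing numbers as rows in a second table means the schema itself sets no fixed count per client.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one client row, any number of phone rows pointing at it.
conn.execute("CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE phone ("
    " id INTEGER PRIMARY KEY,"
    " client_id INTEGER NOT NULL REFERENCES client(id),"
    " number TEXT NOT NULL)"
)

conn.execute("INSERT INTO client (id, name) VALUES (1, 'Example Client')")

# Made-up sample numbers; the list could be any length.
numbers = ["555-0100", "555-0101", "555-0102"]
conn.executemany(
    "INSERT INTO phone (client_id, number) VALUES (1, ?)",
    [(n,) for n in numbers],
)

# Read the numbers back in insertion order.
rows = conn.execute(
    "SELECT number FROM phone WHERE client_id = 1 ORDER BY id"
).fetchall()
stored = [row[0] for row in rows]
print(stored)
```

Any cap on how many numbers a client may have would then be a decision made in the user interface or on the paper form, not something the storage layout forces.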