On Wednesday, 11 January 2012 12.49.27, Giuseppe D'Angelo wrote:
> Up?
> Nobody wants to discuss this?

It's above the bikeshed threshold :-)

First, you'll need to get a Nokian to import the PCRE sources. You cannot
submit them to Gerrit (not even to the commit you made) because that violates
the CLA: you're not the author. Please don't submit the PCRE code again --
just pretend it's there.

As for me, sorry I didn't review. It fell through the cracks of "weekend is
coming".

The API now looks a lot more digestible. There are still a few methods that I
will need the documentation for, as I can't guess what they do from their
names ("subject" is probably an RE term that I don't know).

The API around the captured texts may need a few more rounds of discussion.
The name "cap" appeared in Qt 3 and, if we're not able to keep source
compatibility with Qt 4 anyway, maybe it's time to fix it too.

The iterating methods, which are the cool thing about this API, seem to be
lost. I don't see how to get the contents of that match.

Specific questions:

> * QRegularExpressionMatch::captureCount returns actually the highest
> index of a capturing group that matched something. Ideas?
> (lastCapturedIndex?)

It seems that they are the same thing. captureCount looks fine if the other
methods also have "capture" in the name.

Does this return the number of named captures too? E.g. imagine I have two
named captures in my RE and nothing else. If they match, will that return 2?

If my RE has a capture that is optional and fails to match, how do I find
out? Imagine:

    rx = /(foo)?(bar)/
    rx =~ "bar"

In this case, the first capture failed to match anything. How do I know that
in the API?

> * What should QRegularExpressionMatch::subjectOffset return when one
> advances a match (f.i. by using
> QRegularExpressionMatch::operator++)? The offset at which
> "logically" the match is re-attempted (which is the ending position
> of the current match + 1) or the one at which it is REALLY
> attempted, which could be one or two characters ahead, if the old
> match matched an empty string? (Cf. the discussion in my last mail
> about attempting /g matches against patterns that can match an empty
> string)

I don't get the question because I don't know what a subject is, so I don't
know what a subject offset is supposed to be.

Still, think about the use-case: would someone need this offset? If so, why
do they need it? What do they need it for? Hopefully, that will help you
answer the question.

> * Should endPos(n) return the offset AT the end of the n-th capturing
> group, thus enforcing the invariant "matchedLength(n) = endPos(n) -
> startPos(n) + 1" and implying that a capturing group of length 0
> returns endPos(n) = startPos(n) - 1 (which could seem strange on a
> first look)? Or do you prefer endPos(n) to return the offset plus
> one (i.e. immediately after the end of substring captured by the
> n-th group), having then "matchedLength(n) = endPos(n) -
> startPos(n)"? (3rd option: remove endPos(n) entirely)

How is this even a problem? Under which circumstances does the triad start,
length, end not hold?

endPos should be one after the last character matched, so that in all
circumstances

    end = start + length

This holds for all containers, like QString, QByteArray, QVector, etc.

If this is difficult to visualise in the API, remove the "end" methods and
keep only start and length.

> Does a string exactly match a pattern?
>
> Version 1
> QString str("a string");
> bool matches = str.contains(QRegularExpression("\\Aa str\\w+\\z"));
> // matches == true

An uninitiated user like me might write "^a str\\w+$". I'd expect that to
work and, by default, ^ to be the beginning of the string and $ the end. Note
I did not set MultilineOption.

> for (QRegularExpressionMatch match = re.match(str);
>      match.hasMatch(); ++match)
>     substrings << match.cap();

This one mixes STL-style methods (operator++) with Java-style ones. Either we
do:

    for (match = re.match(); match != re.end(); ++match)

or we do:

    match = re.match();
    while (match.hasNext()) {
        /* whatever */
        match.next();
    }

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
     Intel Sweden AB - Registration Number: 556189-6027
     Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
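
As an illustration of two points raised in the message above (how to tell
that an optional capture did not match, and the "end = start + length"
convention), here is a minimal sketch. It uses std::regex from the C++
standard library as a stand-in, not the QRegularExpression API under review,
whose final shape is not known from this thread. The names CaptureSpan and
spanOf are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Half-open capture span [start, end): end is one past the last matched
// character, so end == start + length holds even for empty captures.
struct CaptureSpan {
    bool matched;          // false when the group did not take part in the match
    std::ptrdiff_t start;  // offset of the first captured character, -1 if unmatched
    std::ptrdiff_t length; // number of captured characters
    std::ptrdiff_t end() const { return start + length; }
};

// Hypothetical helper (not a Qt API): describe capturing group n of a match.
CaptureSpan spanOf(const std::smatch &m, std::size_t n)
{
    if (!m[n].matched)
        return {false, -1, 0};  // optional group that failed to match
    return {true, m.position(n), m.length(n)};
}
```

With the pattern (foo)?(bar) searched in "bar", spanOf(m, 1).matched would be
false while group 2 is reported normally, so an explicit "did this group
match" flag answers the optional-capture question without overloading the
offsets. An empty capture simply has end() equal to start, never start - 1.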