Thanks David...so it's looking at each character, storing it in /1, then comparing the "next" character with what is in /1. I guess the escape character (which is not needed in, say, Notepad++) threw me a bit.
On Sun, Jan 24, 2016 at 2:32 AM, David Rowley <david.row...@2ndquadrant.com> wrote: > On 24 January 2016 at 12:44, Govind Chettiar <rasha...@gmail.com> wrote: > > I have a simple table consisting of a bunch of English words. I am > trying > > to find words that have repeated characters in them, for example > > apple > > tattoo > > > > but not > > > > orange > > lemon > > > > I know that only a maximum of one repetition can occur > > > > I tried various options like > > SELECT word FROM public."SpellItWords" > > WHERE word ~ E'(.)\1{2,}' > > > > SELECT word FROM public."SpellItWords" > > WHERE word ~ E'([a-z])\1{2}' > > > > What finally worked was this > > SELECT word FROM public."SpellItWords" > > WHERE word ~ E'(.)\\1' > > > > But I don't really understand what this does...Can you explain? > > The ~ operator is a regular expression matching operator, and the > (.)\1 is a regular expression. More details here > http://www.postgresql.org/docs/current/static/functions-matching.html > > The regular expression . matches a single character, since that . is > wrapped in () the regex engine captures the match and stores it in a > variable, this is called a capture group. Since this is the first such > capture group in the regular expression, then the value matching the . > gets stored in the variable \1, so your regex basically says; "match a > single character which has the same single character to its immediate > right hand side". The extra \ is just an escape character. > > -- > David Rowley http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >