[GENERAL] substring and POSIX re's

2005-04-19 Thread Don Isgitt
Hi.
First: PG version 7.4 and 8.0.
I have a question regarding the following simplified query:
gds2=# select substring('NE NE SE 2310 FSL 330 FEL' from '^([A-Z][A-Z] )+');
substring
---
SE
(1 row)
The pg docs say that this form of substring uses POSIX re's, and my 
understanding of POSIX re's is they are always greedy. So, why do I get 
only SE instead of NE NE SE? Pilot error, probably, but would someone 
please enlighten me? Thank you very much.

Don
p.s. The target string can have from 1 to 6 of the 2 char strings, not 
just 3 as shown in this example.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [GENERAL] substring and POSIX re's

2005-04-19 Thread Tom Lane
Don Isgitt [EMAIL PROTECTED] writes:
 gds2=# select substring('NE NE SE 2310 FSL 330 FEL' from '^([A-Z][A-Z] )+');
  substring
 ---
  SE
 (1 row)

 The pg docs say that this form of substring uses POSIX re's, and my 
 understanding of POSIX re's is they are always greedy. So, why do I get 
 only SE instead of NE NE SE? Pilot error, probably, but would someone 
 please enlighten me? Thank you very much.

I think you want

regression=# select substring('NE NE SE 2310 FSL 330 FEL' from '^(([A-Z][A-Z] 
)+)');
 substring
---
 NE NE SE
(1 row)

ie, you need the + to be *inside* the capturing parentheses.  When
it's outside, I guess the engine chooses to consider the last match
of the parenthesized subexpression as the thing to return.  (I can't
recall if this choice is specified in the docs or not.)

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] substring and POSIX re's

2005-04-19 Thread Don Isgitt

Tom Lane wrote:
Don Isgitt [EMAIL PROTECTED] writes:
 

gds2=# select substring('NE NE SE 2310 FSL 330 FEL' from '^([A-Z][A-Z] )+');
substring
---
SE
(1 row)
   

 

The pg docs say that this form of substring uses POSIX re's, and my 
understanding of POSIX re's is they are always greedy. So, why do I get 
only SE instead of NE NE SE? Pilot error, probably, but would someone 
please enlighten me? Thank you very much.
   

I think you want
regression=# select substring('NE NE SE 2310 FSL 330 FEL' from '^(([A-Z][A-Z] 
)+)');
substring
---
NE NE SE
(1 row)
ie, you need the + to be *inside* the capturing parentheses.  When
it's outside, I guess the engine chooses to consider the last match
of the parenthesized subexpression as the thing to return.  (I can't
recall if this choice is specified in the docs or not.)
regards, tom lane
 

Thanks, Tom. Interestingly enough, neither my original query or your 
corrected one returns anything with pg 7.4--another good reason to 
upgrade to 8.*

Don
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [GENERAL] substring and POSIX re's

2005-04-19 Thread Michael Fuhr
On Tue, Apr 19, 2005 at 11:17:46AM -0500, Don Isgitt wrote:

 Thanks, Tom. Interestingly enough, neither my original query or your 
 corrected one returns anything with pg 7.4--another good reason to 
 upgrade to 8.*

Hmmm...for me both queries give the results shown if I run them in
7.3.9, 7.4.7, 8.0.2, or 8.1devel.  What's different about your 7.4
installation?

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [GENERAL] substring and POSIX re's

2005-04-19 Thread Alvaro Herrera
On Tue, Apr 19, 2005 at 10:03:45AM -0600, Michael Fuhr wrote:
 On Tue, Apr 19, 2005 at 11:17:46AM -0500, Don Isgitt wrote:
 
  Thanks, Tom. Interestingly enough, neither my original query or your 
  corrected one returns anything with pg 7.4--another good reason to 
  upgrade to 8.*
 
 Hmmm...for me both queries give the results shown if I run them in
 7.3.9, 7.4.7, 8.0.2, or 8.1devel.  What's different about your 7.4
 installation?

Maybe the regex_flavor setting?  (not sure of the exact name)

-- 
Alvaro Herrera ([EMAIL PROTECTED])
Maybe there's lots of data loss but the records of data loss are also lost.
(Lincoln Yeoh)

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] substring and POSIX re's

2005-04-19 Thread Michael Fuhr
On Tue, Apr 19, 2005 at 12:39:52PM -0400, Alvaro Herrera wrote:
 On Tue, Apr 19, 2005 at 10:03:45AM -0600, Michael Fuhr wrote:
  On Tue, Apr 19, 2005 at 11:17:46AM -0500, Don Isgitt wrote:
  
   Thanks, Tom. Interestingly enough, neither my original query or your 
   corrected one returns anything with pg 7.4--another good reason to 
   upgrade to 8.*
  
  Hmmm...for me both queries give the results shown if I run them in
  7.3.9, 7.4.7, 8.0.2, or 8.1devel.  What's different about your 7.4
  installation?
 
 Maybe the regex_flavor setting?  (not sure of the exact name)

Ah yes, I forgot about that

test= SET regex_flavor TO basic;
SET
test= SELECT substring('NE NE SE 2310 FSL 330 FEL' FROM '^(([A-Z][A-Z] )+)');
 substring 
---
 
(1 row)

test= SET regex_flavor TO advanced;
SET
test= SELECT substring('NE NE SE 2310 FSL 330 FEL' FROM '^(([A-Z][A-Z] )+)');
 substring 
---
 NE NE SE 
(1 row)

test= SELECT version();
  version  
---
 PostgreSQL 7.4.7 on sparc-sun-solaris2.9, compiled by GCC gcc (GCC) 3.4.2
(1 row)

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [GENERAL] substring and POSIX re's

2005-04-19 Thread Don Isgitt

Michael Fuhr wrote:
On Tue, Apr 19, 2005 at 12:39:52PM -0400, Alvaro Herrera wrote:
 

On Tue, Apr 19, 2005 at 10:03:45AM -0600, Michael Fuhr wrote:
   

On Tue, Apr 19, 2005 at 11:17:46AM -0500, Don Isgitt wrote:
 

Thanks, Tom. Interestingly enough, neither my original query or your 
corrected one returns anything with pg 7.4--another good reason to 
upgrade to 8.*
   

Hmmm...for me both queries give the results shown if I run them in
7.3.9, 7.4.7, 8.0.2, or 8.1devel.  What's different about your 7.4
installation?
 

Maybe the regex_flavor setting?  (not sure of the exact name)
   

Ah yes, I forgot about that
test= SET regex_flavor TO basic;
SET
test= SELECT substring('NE NE SE 2310 FSL 330 FEL' FROM '^(([A-Z][A-Z] )+)');
substring 
---

(1 row)
test= SET regex_flavor TO advanced;
SET
test= SELECT substring('NE NE SE 2310 FSL 330 FEL' FROM '^(([A-Z][A-Z] )+)');
substring 
---
NE NE SE 
(1 row)

test= SELECT version();
 version  
---
PostgreSQL 7.4.7 on sparc-sun-solaris2.9, compiled by GCC gcc (GCC) 3.4.2
(1 row)

 

Thank you, Alvaro and Michael,
The regex_flavor setting was the culprit; I never knew of such a 
creature in pg! Mystery solved. The members on this board are great.

Don
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])