[GENERAL] Pattern match against array elements?

2015-10-12 Thread Israel Brewster
Is there any way to do a pattern match against the elements of an array in 
postgresql (9.4 if the version makes a difference)? I have a grouped query 
that, among other things, returns an array of values, like:

SELECT lognum, array_agg(flightnum) as flightnums FROM logs GROUP BY lognum;

Where the flightnum field is a varchar containing either a text string or a 
three-or-four digit number. Now say I want to select all logs that have a 
flight number starting with an '8' (so '800' or '8000' series flights). My 
first thought was to do something like this:

SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs GROUP 
BY lognum) s1 WHERE '8%' like ANY(flightnums);

But while this doesn't give an error, it also doesn't return any results. I'm 
guessing that this is because the wildcard is on the left of the operator, and 
needs to be on the right. Of course, turning it around to be:

WHERE ANY(flightnum) like '8%'

gives me a syntax error. So is there any way I can run this query such that I 
get any rows containing a flight number that starts with an 8 (or whatever)?

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread dinesh kumar
On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster 
wrote:

> Is there any way to do a pattern match against the elements of an array in
> postgresql (9.4 if the version makes a difference)? I have a grouped query
> that, among other things, returns an array of values, like:
>
> SELECT lognum, array_agg(flightnum) as flightnums FROM logs GROUP BY
> lognum;
>
> Where the flightnum field is a varchar containing either a text string or
> a three-or-four digit number. Now say I want to select all logs that have a
> flight number starting with an '8' (so '800' or '8000' series flights). My
> first thought was to do something like this:
>
> SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs
> GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);
>
> But while this doesn't give an error, it also doesn't return any results.
> I'm guessing that this is because the wildcard is on the left of the
> operator, and needs to be on the right. Of course, turning it around to be:
>
> WHERE ANY(flightnum) like '8%'
>
> gives me a syntax error. So is there any way I can run this query such
> that I get any rows containing a flight number that starts with an 8 (or
> whatever)?
>
>
Are you looking for this ?

SELECT lognum array_agg(flightnum) FROM logs WHERE flightnum ~ '^8' GROUP
BY lognum;




> ---
> Israel Brewster
> Systems Analyst II
> Ravn Alaska
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7293 <#>
> ---
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 

Regards,
Dinesh
manojadinesh.blogspot.com


Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread Jeff Janes
On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster 
wrote:

> Is there any way to do a pattern match against the elements of an array in
> postgresql (9.4 if the version makes a difference)? I have a grouped query
> that, among other things, returns an array of values, like:
>
> SELECT lognum, array_agg(flightnum) as flightnums FROM logs GROUP BY
> lognum;
>
> Where the flightnum field is a varchar containing either a text string or
> a three-or-four digit number. Now say I want to select all logs that have a
> flight number starting with an '8' (so '800' or '8000' series flights). My
> first thought was to do something like this:
>
> SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs
> GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);
>
> But while this doesn't give an error, it also doesn't return any results.
> I'm guessing that this is because the wildcard is on the left of the
> operator, and needs to be on the right.


Right.  The LIKE operator does not have a commutator by default.  (And if
you created one for it, it could not use an index in this case.)


> Of course, turning it around to be:
>
> WHERE ANY(flightnum) like '8%'
>
> gives me a syntax error. So is there any way I can run this query such
> that I get any rows containing a flight number that starts with an 8 (or
> whatever)?
>

I think you're best bet is to do a subquery against the unaggregated table.

select * from aggregated a where exists
  (select 1 from unaggregated ua where a.lognum=ua.lognum and flightnum
like '8%')


This is a common problem.  If you find a better solution, I'd love to hear
it!

Cheers,

Jeff


Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread Tom Lane
Jeff Janes  writes:
> On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster 
> wrote:
>> My first thought was to do something like this:
>> 
>> SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs
>> GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);
>> 
>> But while this doesn't give an error, it also doesn't return any results.
>> I'm guessing that this is because the wildcard is on the left of the
>> operator, and needs to be on the right.

> Right.  The LIKE operator does not have a commutator by default.  (And if
> you created one for it, it could not use an index in this case.)

Well, it couldn't use an index anyway, given that the query as written
wants to collect groups if *any* member is LIKE '8%', rather than
restricting the data to such flightnums before aggregation occurs.

Personally I'd suggest building a commutator operator (just need a
one-liner SQL or plpgsql function as infrastructure) and away you go.

> I think you're best bet is to do a subquery against the unaggregated table.

> select * from aggregated a where exists
>   (select 1 from unaggregated ua where a.lognum=ua.lognum and flightnum
> like '8%')

That would work too, but not sure about performance relative to the other
way.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread Israel Brewster
In the general case, that might work, however in my actually use case the inner SQL query (and underlying table structure) is rather more complicated, making it so I really want the WHERE clause in an outside query, as in my (non-functional) example. Just to clarify, the actual structure of the query is more like this:SELECT 	FROM	(SELECT(SELECT			array_agg(flightnum)		FROM legdetails		WHERE logid=logs.id) as flightnums --this is where the array comes from that I want to filter on.	FROM logs		) s1WHERE		ORDER BY So the query is noticeably different than the one I original gave, but the end result is the same: an array in an inner query, and trying to filter based on the contents of the array. Sorry if I confused the issue by trying to simplify the concept too much.
---Israel BrewsterSystems Analyst IIRavn Alaska5245 Airport Industrial RdFairbanks, AK 99709(907) 450-7293---BEGIN:VCARD
VERSION:3.0
N:Brewster;Israel;;;
FN:Israel Brewster
ORG:Frontier Flying Service;MIS
TITLE:PC Support Tech II
EMAIL;type=INTERNET;type=WORK;type=pref:isr...@frontierflying.com
TEL;type=WORK;type=pref:907-450-7293
item1.ADR;type=WORK;type=pref:;;5245 Airport Industrial Wy;Fairbanks;AK;99701;
item1.X-ABADR:us
CATEGORIES:General
X-ABUID:36305438-95EA-4410-91AB-45D16CABCDDC\:ABPerson
END:VCARD


On Oct 12, 2015, at 10:14 AM, dinesh kumar  wrote:On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster  wrote:Is there any way to do a pattern match against the elements of an array in postgresql (9.4 if the version makes a difference)? I have a grouped query that, among other things, returns an array of values, like:

SELECT lognum, array_agg(flightnum) as flightnums FROM logs GROUP BY lognum;

Where the flightnum field is a varchar containing either a text string or a three-or-four digit number. Now say I want to select all logs that have a flight number starting with an '8' (so '800' or '8000' series flights). My first thought was to do something like this:

SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);

But while this doesn't give an error, it also doesn't return any results. I'm guessing that this is because the wildcard is on the left of the operator, and needs to be on the right. Of course, turning it around to be:

WHERE ANY(flightnum) like '8%'

gives me a syntax error. So is there any way I can run this query such that I get any rows containing a flight number that starts with an 8 (or whatever)?
 Are you looking for this ? SELECT lognum array_agg(flightnum) FROM logs WHERE flightnum ~ '^8' GROUP BY lognum;   
---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
-- Regards,Dineshmanojadinesh.blogspot.com



Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread Israel Brewster
On Oct 12, 2015, at 10:39 AM, Tom Lane  wrote:
> 
> Jeff Janes  writes:
>> On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster 
>> wrote:
>>> My first thought was to do something like this:
>>> 
>>> SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs
>>> GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);
>>> 
>>> But while this doesn't give an error, it also doesn't return any results.
>>> I'm guessing that this is because the wildcard is on the left of the
>>> operator, and needs to be on the right.
> 
>> Right.  The LIKE operator does not have a commutator by default.  (And if
>> you created one for it, it could not use an index in this case.)
> 
> Well, it couldn't use an index anyway, given that the query as written
> wants to collect groups if *any* member is LIKE '8%', rather than
> restricting the data to such flightnums before aggregation occurs.
> 
> Personally I'd suggest building a commutator operator (just need a
> one-liner SQL or plpgsql function as infrastructure) and away you go.

That could work. I'll look into that.

> 
>> I think you're best bet is to do a subquery against the unaggregated table.
> 
>> select * from aggregated a where exists
>>  (select 1 from unaggregated ua where a.lognum=ua.lognum and flightnum
>> like '8%')
> 
> That would work too, but not sure about performance relative to the other
> way.
> 
>   regards, tom lane


---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread Jeff Janes
On Mon, Oct 12, 2015 at 11:39 AM, Tom Lane  wrote:

> Jeff Janes  writes:
> > On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster  >
> > wrote:
> >> My first thought was to do something like this:
> >>
> >> SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM
> logs
> >> GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);
> >>
> >> But while this doesn't give an error, it also doesn't return any
> results.
> >> I'm guessing that this is because the wildcard is on the left of the
> >> operator, and needs to be on the right.
>
> > Right.  The LIKE operator does not have a commutator by default.  (And if
> > you created one for it, it could not use an index in this case.)
>
> Well, it couldn't use an index anyway, given that the query as written
> wants to collect groups if *any* member is LIKE '8%', rather than
> restricting the data to such flightnums before aggregation occurs.
>

I was jumping ahead a bit here.  I was thinking of the case where the
aggregation query was made a materialized view with a gin index on the
aggregated column.  It is not obvious that a "scalar operator
ANY(flightnums)" can't use an index on array_column, but it can't.  My
interest was more in the % operator from pg_trgm, but also the normal text
= operator would be nice to use here (as opposed to the much uglier <@ or
@> in which the scalar needs to be wrapped into a degenerate array.)



> Personally I'd suggest building a commutator operator (just need a
> one-liner SQL or plpgsql function as infrastructure) and away you go.
>

Right, something like:

create function like_rev (text, text) returns boolean as $$ select $2 like
$1 $$ language SQL;

create operator  (procedure = like_rev,  leftarg=text, rightarg=text);

You can explicitly specify the commutator but it doesn't seem to be
necessary to do so:

create operator  (procedure = like_rev,  leftarg=text,
rightarg=text,commutator = ~~ );



> > I think you're best bet is to do a subquery against the unaggregated
> table.
>
> > select * from aggregated a where exists
> >   (select 1 from unaggregated ua where a.lognum=ua.lognum and flightnum
> > like '8%')
>
> That would work too, but not sure about performance relative to the other
> way.
>

In my experience, if the subselect can use an index on the LIKE and if '8%'
is rare, then using the subselect will be vastly better.  And if it is
indexable or not rare, it is still likely to be better, or at least not
worse by much. Disaggregating every array for every row to do the ANY is
pretty inefficient.  Particularly if you are not using a materialized view,
and so have to first aggregate it.

Of course good enough is good enough, so if scalar  ANY(array)  is good
enough...

Cheers,

Jeff


Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread Israel Brewster

---Israel BrewsterSystems Analyst IIRavn Alaska5245 Airport Industrial RdFairbanks, AK 99709(907) 450-7293---BEGIN:VCARD
VERSION:3.0
N:Brewster;Israel;;;
FN:Israel Brewster
ORG:Frontier Flying Service;MIS
TITLE:PC Support Tech II
EMAIL;type=INTERNET;type=WORK;type=pref:isr...@frontierflying.com
TEL;type=WORK;type=pref:907-450-7293
item1.ADR;type=WORK;type=pref:;;5245 Airport Industrial Wy;Fairbanks;AK;99701;
item1.X-ABADR:us
CATEGORIES:General
X-ABUID:36305438-95EA-4410-91AB-45D16CABCDDC\:ABPerson
END:VCARD


On Oct 12, 2015, at 11:50 AM, Jeff Janes  wrote:On Mon, Oct 12, 2015 at 11:39 AM, Tom Lane  wrote:Jeff Janes  writes:
> On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster 
> wrote:
>> My first thought was to do something like this:
>>
>> SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs
>> GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);
>>
>> But while this doesn't give an error, it also doesn't return any results.
>> I'm guessing that this is because the wildcard is on the left of the
>> operator, and needs to be on the right.

> Right.  The LIKE operator does not have a commutator by default.  (And if
> you created one for it, it could not use an index in this case.)

Well, it couldn't use an index anyway, given that the query as written
wants to collect groups if *any* member is LIKE '8%', rather than
restricting the data to such flightnums before aggregation occurs.I was jumping ahead a bit here.  I was thinking of the case where the aggregation query was made a materialized view with a gin index on the aggregated column.  It is not obvious that a "scalar operator ANY(flightnums)" can't use an index on array_column, but it can't.  My interest was more in the % operator from pg_trgm, but also the normal text = operator would be nice to use here (as opposed to the much uglier <@ or @> in which the scalar needs to be wrapped into a degenerate array.) 

Personally I'd suggest building a commutator operator (just need a
one-liner SQL or plpgsql function as infrastructure) and away you go.Right, something like:create function like_rev (text, text) returns boolean as $$ select $2 like $1 $$ language SQL;create operator  (procedure = like_rev,  leftarg=text, rightarg=text);...which actually works perfectly for my use case. No, it doesn't use an index, however even the worst case scenario on my data, where that is the ONLY criteria given, "only" takes about 10 seconds. Yes, that is a "long" time, however a) 99% of the time there will be other criteria used as well, drastically reducing the result set and speeding the query, and b) the query is used as part of a report generator, for which there isn't really a problem if the user has to wait a few seconds. You can explicitly specify the commutator but it doesn't seem to be necessary to do so:create operator  (procedure = like_rev,  leftarg=text, rightarg=text,commutator = ~~ );
> I think you're best bet is to do a subquery against the unaggregated table.

> select * from aggregated a where exists
>   (select 1 from unaggregated ua where a.lognum=ua.lognum and flightnum
> like '8%')

That would work too, but not sure about performance relative to the other
way.In my experience, if the subselect can use an index on the LIKE and if '8%' is rare, then using the subselect will be vastly better.  And if it is indexable or not rare, it is still likely to be better, or at least not worse by much. Disaggregating every array for every row to do the ANY is pretty inefficient.  Particularly if you are not using a materialized view, and so have to first aggregate it.Of course good enough is good enough, so if scalar  ANY(array)  is good enough...Exactly. I think I could make the sub-select work, with some tweaking, and as it could well improve performance noticeably I may well spend some time on it, but the commutator operator "just works" and integrates quite nicely with my existing query structure.Thanks for the help! Cheers,Jeff