Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-09-09 Thread Dilip Kumar
On Fri, Sep 9, 2016 at 6:51 PM, Tom Lane  wrote:
> Pushed with cosmetic adjustments --- mostly, I thought we needed some
> comments about the topic.

Okay, Thanks.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-09-09 Thread Tom Lane
Dilip Kumar  writes:
> Seems like a better option, done it this way..
> attaching the modified patch..

Pushed with cosmetic adjustments --- mostly, I thought we needed some
comments about the topic.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-09-08 Thread Dilip Kumar
On Thu, Sep 8, 2016 at 7:21 PM, Tom Lane  wrote:
> We could make it work like that without breaking the ABI if we were
> to add a NOERROR bit to the allowed "flags".  However, after looking
> around a bit I'm no longer convinced what I said above is a good idea.
> In particular, if we put the responsibility of reporting the error on
> callers then we'll lose the ability to distinguish "no such pg_type OID"
> from "type exists but it's only a shell".  So I'm now thinking it's okay
> to promote lookup_type_cache's elog to a full ereport, especially since
> it's already using ereport for some of the related error cases such as
> the shell-type failure.

Okay..
>
> That still leaves us with what to do for domain_in.  A really simple
> fix would be to move the InitDomainConstraintRef call before the
> getBaseTypeAndTypmod call, because the former fetches the typcache
> entry and would then benefit from lookup_type_cache's ereport.
> But that seems a bit magic.
>
> I'm tempted to propose that domain_state_setup start with
>
> typentry = lookup_type_cache(domainType, TYPECACHE_DOMAIN_INFO);
> if (typentry->typtype != TYPTYPE_DOMAIN)
> ereport(ERROR,
> (errcode(ERRCODE_DATATYPE_MISMATCH),
>  errmsg("type %s is not a domain",
> format_type_be(domainType;
>
> removing the need to verify that after getBaseTypeAndTypmod.

Seems like a better option, done it this way..
attaching the modified patch..
>
> The cache loading is basically free because InitDomainConstraintRef
> would do it anyway; so the extra cost here is only one dynahash search.
> You could imagine buying back those cycles by teaching the typcache
> to be able to cache the result of getBaseTypeAndTypmod, but I'm doubtful
> that we really care.  This whole setup sequence only happens once per
> query anyway.

Agreed.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


cache_lookup_failure_v5.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-09-08 Thread Tom Lane
Dilip Kumar  writes:
> On Thu, Sep 8, 2016 at 2:23 AM, Tom Lane  wrote:
>> In the case of record_in, it seems like it'd be easy enough to use
>> lookup_rowtype_tupdesc_noerror() instead of lookup_rowtype_tupdesc()
>> and then throw a user-facing error right there.

> Actually when we are passing invalid type to "record_in", error is
> thrown in "lookup_type_cache" function, and this function is not
> taking "no_error" input (it's only limited to
> lookup_rowtype_tupdesc_internal).

Ah, should have dug a bit deeper.

> One option is to pass "no_error" parameter to "lookup_type_cache"
> function also, and throw error only in record_in function.
> But problem is, "lookup_type_cache" has lot of references. And also
> will it be good idea to throw one generic error instead of multiple
> specific errors. ?

We could make it work like that without breaking the ABI if we were
to add a NOERROR bit to the allowed "flags".  However, after looking
around a bit I'm no longer convinced what I said above is a good idea.
In particular, if we put the responsibility of reporting the error on
callers then we'll lose the ability to distinguish "no such pg_type OID"
from "type exists but it's only a shell".  So I'm now thinking it's okay
to promote lookup_type_cache's elog to a full ereport, especially since
it's already using ereport for some of the related error cases such as
the shell-type failure.

That still leaves us with what to do for domain_in.  A really simple
fix would be to move the InitDomainConstraintRef call before the
getBaseTypeAndTypmod call, because the former fetches the typcache
entry and would then benefit from lookup_type_cache's ereport.
But that seems a bit magic.

I'm tempted to propose that domain_state_setup start with

typentry = lookup_type_cache(domainType, TYPECACHE_DOMAIN_INFO);
if (typentry->typtype != TYPTYPE_DOMAIN)
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
 errmsg("type %s is not a domain",
format_type_be(domainType;

removing the need to verify that after getBaseTypeAndTypmod.

The cache loading is basically free because InitDomainConstraintRef
would do it anyway; so the extra cost here is only one dynahash search.
You could imagine buying back those cycles by teaching the typcache
to be able to cache the result of getBaseTypeAndTypmod, but I'm doubtful
that we really care.  This whole setup sequence only happens once per
query anyway.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-09-08 Thread Dilip Kumar
On Thu, Sep 8, 2016 at 2:23 AM, Tom Lane  wrote:
> I really don't like this solution.  Changing this one function, out of the
> dozens just like it in lsyscache.c, seems utterly random and arbitrary.
>
> I'm actually not especially convinced that passing domain_in a random
> OID needs to not produce an XX000 error.  That isn't a user-facing
> function and people shouldn't be calling it by hand at all.  The same
> goes for record_in.  But if we do want those cases to avoid this,
> I think it's appropriate to fix it in those functions, not decree
> that essentially-random other parts of the system should bear the
> responsibility.  (Thought experiment: if we changed the order of
> operations in domain_in so that getBaseTypeAndTypmod were no longer
> the first place to fail, would we undo this change and then change
> wherever the failure had moved to?  That's pretty messy.)
>
> In the case of record_in, it seems like it'd be easy enough to use
> lookup_rowtype_tupdesc_noerror() instead of lookup_rowtype_tupdesc()
> and then throw a user-facing error right there.

Actually when we are passing invalid type to "record_in", error is
thrown in "lookup_type_cache" function, and this function is not
taking "no_error" input (it's only limited to
lookup_rowtype_tupdesc_internal).

One option is to pass "no_error" parameter to "lookup_type_cache"
function also, and throw error only in record_in function.
But problem is, "lookup_type_cache" has lot of references. And also
will it be good idea to throw one generic error instead of multiple
specific errors. ?

Another option is to do same for "record_in" also what you have
suggested for domain_in (an extra syscache lookup). ?

>Not sure about a
> nice solution for domain_in.  We might have to resort to an extra
> syscache lookup there, but even if we did, it should happen only
> once per query so that's not very awful.
>
>> 3. CheckFunctionValidatorAccess: This is being called from all
>> language validator functions.
>
> This part seems reasonable, since the validator functions are documented
> as something users might call, and CheckFunctionValidatorAccess seems
> like an apropos place to handle it.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-09-07 Thread Tom Lane
Dilip Kumar  writes:
> Basically this patch changes error at three places.

> 1. getBaseTypeAndTypmod: This is being called from domain_in exposed
> function (domain_in->
> domain_state_setup-> getBaseTypeAndTypmod). Though this function is
> being called from many other places which are not exposed function,
> but I don't this will have any imapct, will this ?

I really don't like this solution.  Changing this one function, out of the
dozens just like it in lsyscache.c, seems utterly random and arbitrary.

I'm actually not especially convinced that passing domain_in a random
OID needs to not produce an XX000 error.  That isn't a user-facing
function and people shouldn't be calling it by hand at all.  The same
goes for record_in.  But if we do want those cases to avoid this,
I think it's appropriate to fix it in those functions, not decree
that essentially-random other parts of the system should bear the
responsibility.  (Thought experiment: if we changed the order of
operations in domain_in so that getBaseTypeAndTypmod were no longer
the first place to fail, would we undo this change and then change
wherever the failure had moved to?  That's pretty messy.)

In the case of record_in, it seems like it'd be easy enough to use
lookup_rowtype_tupdesc_noerror() instead of lookup_rowtype_tupdesc()
and then throw a user-facing error right there.  Not sure about a
nice solution for domain_in.  We might have to resort to an extra
syscache lookup there, but even if we did, it should happen only
once per query so that's not very awful.

> 3. CheckFunctionValidatorAccess: This is being called from all
> language validator functions.

This part seems reasonable, since the validator functions are documented
as something users might call, and CheckFunctionValidatorAccess seems
like an apropos place to handle it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-09-06 Thread Dilip Kumar
On Wed, Sep 7, 2016 at 8:52 AM, Haribabu Kommi  wrote:
> I reviewed and tested the patch. The changes are fine.
> This patch provides better error message compared to earlier.
>
> Marked the patch as "Ready for committer" in commit-fest.


Thanks for the review !

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-09-06 Thread Haribabu Kommi
On Fri, Aug 26, 2016 at 7:11 PM, Dilip Kumar  wrote:


> I have changed this in attached patch..
>

I reviewed and tested the patch. The changes are fine.
This patch provides better error message compared to earlier.

Marked the patch as "Ready for committer" in commit-fest.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-26 Thread Dilip Kumar
On Wed, Aug 17, 2016 at 6:10 PM, Robert Haas  wrote:
>> If you are making changes in plpgsql_validator(), then shouldn't we
>> make changes in plperl_validator() or plpython_validator()?  I see
>> that you have made changes to function CheckFunctionValidatorAccess()
>> which seems to be getting called from *_validator() (* indicates
>> plpgsql/plperl/plpython) functions.  Is there a reason for changing
>> the *_validator() function?

Yes you are right, making changes in CheckFunctionValidatorAccess is
sufficient, no need to make changes in *_validator (* indicates
plpgsql/plperl/plpython) functions.

I have changed this in attached patch..

>
> Yeah, when I glanced briefly at this patch, I found myself wondering
> whether all of these call sites were actually reachable (and why the
> patch contained no test cases to prove it).

I have also added test cases to cover all my changes.

Basically this patch changes error at three places.

1. getBaseTypeAndTypmod: This is being called from domain_in exposed
function (domain_in->
domain_state_setup-> getBaseTypeAndTypmod). Though this function is
being called from many other places which are not exposed function,
but I don't this will have any imapct, will this ?

2. lookup_type_cache: This is being called from record_in
(record_in->lookup_rowtype_tupdesc->
lookup_rowtype_tupdesc_internal->lookup_type_cache).

3. CheckFunctionValidatorAccess: This is being called from all
language validator functions.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


cache_lookup_failure_v4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-17 Thread Robert Haas
On Wed, Aug 17, 2016 at 7:25 AM, Amit Kapila  wrote:
> On Wed, Aug 10, 2016 at 11:27 AM, Dilip Kumar  wrote:
>> On Wed, Aug 10, 2016 at 10:04 AM, Dilip Kumar  wrote:
>>> This seems better, after checking at other places I found that for
>>> invalid type we are using ERRCODE_UNDEFINED_OBJECT and for invalid
>>> functions we are using ERRCODE_UNDEFINED_FUNCTION. So I have done the
>>> same way.
>>>
>>> Updated patch attached.
>>
>> I found some more places where we can get similar error and updated in
>> my v3 patch.
>>
>
> @@ -412,7 +412,9 @@ plpgsql_validator(PG_FUNCTION_ARGS)
>   /* Get the new function's pg_proc entry */
>   tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcoid));
>   if (!HeapTupleIsValid(tuple))
> - elog(ERROR, "cache lookup failed for function %u", funcoid);
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_FUNCTION),
> + errmsg("function with OID %u does not exist", funcoid)));
>
> If you are making changes in plpgsql_validator(), then shouldn't we
> make changes in plperl_validator() or plpython_validator()?  I see
> that you have made changes to function CheckFunctionValidatorAccess()
> which seems to be getting called from *_validator() (* indicates
> plpgsql/plperl/plpython) functions.  Is there a reason for changing
> the *_validator() function?

Yeah, when I glanced briefly at this patch, I found myself wondering
whether all of these call sites were actually reachable (and why the
patch contained no test cases to prove it).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-17 Thread Amit Kapila
On Wed, Aug 10, 2016 at 11:27 AM, Dilip Kumar  wrote:
> On Wed, Aug 10, 2016 at 10:04 AM, Dilip Kumar  wrote:
>> This seems better, after checking at other places I found that for
>> invalid type we are using ERRCODE_UNDEFINED_OBJECT and for invalid
>> functions we are using ERRCODE_UNDEFINED_FUNCTION. So I have done the
>> same way.
>>
>> Updated patch attached.
>
> I found some more places where we can get similar error and updated in
> my v3 patch.
>

@@ -412,7 +412,9 @@ plpgsql_validator(PG_FUNCTION_ARGS)
  /* Get the new function's pg_proc entry */
  tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcoid));
  if (!HeapTupleIsValid(tuple))
- elog(ERROR, "cache lookup failed for function %u", funcoid);
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_FUNCTION),
+ errmsg("function with OID %u does not exist", funcoid)));

If you are making changes in plpgsql_validator(), then shouldn't we
make changes in plperl_validator() or plpython_validator()?  I see
that you have made changes to function CheckFunctionValidatorAccess()
which seems to be getting called from *_validator() (* indicates
plpgsql/plperl/plpython) functions.  Is there a reason for changing
the *_validator() function?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-09 Thread Dilip Kumar
On Wed, Aug 10, 2016 at 10:04 AM, Dilip Kumar  wrote:
> This seems better, after checking at other places I found that for
> invalid type we are using ERRCODE_UNDEFINED_OBJECT and for invalid
> functions we are using ERRCODE_UNDEFINED_FUNCTION. So I have done the
> same way.
>
> Updated patch attached.

I found some more places where we can get similar error and updated in
my v3 patch.

I don't claim that it fixes all such kind of error, but at least it
fixes error reported by sqlsmith plus some additional what I found in
similar area.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


cache_lookup_failure_v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-09 Thread Dilip Kumar
On Tue, Aug 9, 2016 at 6:49 PM, Amit Kapila  wrote:
> Okay, then how about ERRCODE_UNDEFINED_OBJECT?  It seems to be used in
> almost similar cases.

This seems better, after checking at other places I found that for
invalid type we are using ERRCODE_UNDEFINED_OBJECT and for invalid
functions we are using ERRCODE_UNDEFINED_FUNCTION. So I have done the
same way.

Updated patch attached.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


cache_lookup_failure_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-09 Thread Amit Kapila
On Mon, Aug 8, 2016 at 6:00 PM, Dilip Kumar  wrote:
> On Mon, Aug 8, 2016 at 5:23 PM, Amit Kapila  wrote:
>>
>>
>> Did you consider to use ERRCODE_UNDEFINED_COLUMN with error messages
>> like: "type %u does not exit" or "type id %u does not exit"? Errorcode
>> ERRCODE_UNDEFINED_COLUMN seems to be used for syscache lookup failure
>> cases at certain places in code.
>
>
> But I think, OBJECT will be more appropriate than COLUMN at this place.
>

Okay, then how about ERRCODE_UNDEFINED_OBJECT?  It seems to be used in
almost similar cases.

> However I can change error message to "type id %u does not exit" if this
> seems better ?
>

I think that is better.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-08 Thread Dilip Kumar
On Mon, Aug 8, 2016 at 5:23 PM, Amit Kapila  wrote:

> >
> > Your other options and the option you choose are same.
> >
>

Sorry typo, I meant ERRCODE_INVALID_OBJECT_DEFINITION.



> Did you consider to use ERRCODE_UNDEFINED_COLUMN with error messages
> like: "type %u does not exit" or "type id %u does not exit"? Errorcode
> ERRCODE_UNDEFINED_COLUMN seems to be used for syscache lookup failure
> cases at certain places in code.


But I think, OBJECT will be more appropriate than COLUMN at this place.

However I can change error message to "type id %u does not exit" if this
seems better ?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-08 Thread Amit Kapila
On Mon, Aug 8, 2016 at 5:11 PM, Amit Kapila  wrote:
> On Mon, Aug 8, 2016 at 4:58 PM, Dilip Kumar  wrote:
>>
>> On Wed, Aug 3, 2016 at 5:35 AM, Robert Haas  wrote:
>>>
>>> I think that's just making life difficult.  If nothing else, sqlsmith
>>> hunts around for functions it can call that return internal errors,
>>> and if we refuse to fix all of them to return user-facing errors, then
>>> it's just crap for the people running sqlsmith to sift through and
>>> it's a judgment call whether to fix each particular case.  Even aside
>>> from that, I think it's much better to have a clear and unambiguous
>>> rule that elog is only for can't-happen things, not
>>> we-don't-recommend-it things.
>>
>>
>> I have changed for all these function to report more appropriate error with
>> ereport.
>>
>> I used ERRCODE_WRONG_OBJECT_TYPE error code for reporting such errors.
>> I think this is closest error code among all existing error codes, other
>> options can be (ERRCODE_WRONG_OBJECT_TYPE).
>>
>
> Your other options and the option you choose are same.
>

Did you consider to use ERRCODE_UNDEFINED_COLUMN with error messages
like: "type %u does not exit" or "type id %u does not exit"? Errorcode
ERRCODE_UNDEFINED_COLUMN seems to be used for syscache lookup failure
cases at certain places in code.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-08 Thread Amit Kapila
On Mon, Aug 8, 2016 at 4:58 PM, Dilip Kumar  wrote:
>
> On Wed, Aug 3, 2016 at 5:35 AM, Robert Haas  wrote:
>>
>> I think that's just making life difficult.  If nothing else, sqlsmith
>> hunts around for functions it can call that return internal errors,
>> and if we refuse to fix all of them to return user-facing errors, then
>> it's just crap for the people running sqlsmith to sift through and
>> it's a judgment call whether to fix each particular case.  Even aside
>> from that, I think it's much better to have a clear and unambiguous
>> rule that elog is only for can't-happen things, not
>> we-don't-recommend-it things.
>
>
> I have changed for all these function to report more appropriate error with
> ereport.
>
> I used ERRCODE_WRONG_OBJECT_TYPE error code for reporting such errors.
> I think this is closest error code among all existing error codes, other
> options can be (ERRCODE_WRONG_OBJECT_TYPE).
>

Your other options and the option you choose are same.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-08 Thread Dilip Kumar
On Wed, Aug 3, 2016 at 5:35 AM, Robert Haas  wrote:

> I think that's just making life difficult.  If nothing else, sqlsmith
> hunts around for functions it can call that return internal errors,
> and if we refuse to fix all of them to return user-facing errors, then
> it's just crap for the people running sqlsmith to sift through and
> it's a judgment call whether to fix each particular case.  Even aside
> from that, I think it's much better to have a clear and unambiguous
> rule that elog is only for can't-happen things, not
> we-don't-recommend-it things.
>

I have changed for all these function to report more appropriate error with
ereport.

I used ERRCODE_WRONG_OBJECT_TYPE error code for reporting such errors.
I think this is closest error code among all existing error codes, other
options can be (ERRCODE_WRONG_OBJECT_TYPE).

But I think ERRCODE_WRONG_OBJECT_TYPE is better option.

Patch attached for the same.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


cache_lookup_failure_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-02 Thread Peter Geoghegan
On Tue, Aug 2, 2016 at 5:05 PM, Robert Haas  wrote:
> I think that's just making life difficult.  If nothing else, sqlsmith
> hunts around for functions it can call that return internal errors,
> and if we refuse to fix all of them to return user-facing errors, then
> it's just crap for the people running sqlsmith to sift through and
> it's a judgment call whether to fix each particular case.  Even aside
> from that, I think it's much better to have a clear and unambiguous
> rule that elog is only for can't-happen things, not
> we-don't-recommend-it things.

+1.

This also has value in the context of automatically surfacing
situations where "can't happen" errors do in fact happen at scale.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-02 Thread Robert Haas
On Tue, Aug 2, 2016 at 7:16 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Aug 2, 2016 at 6:03 AM, Dilip Kumar  wrote:
>>> There are many more such exposed functions, which can throw cache lookup
>>> failure error if we pass wrong value.
>>>
>>> i.e.
>>> record_in
>>> domain_in
>>> fmgr_c_validator
>>> plpgsql_validator
>>> pg_procedure_is_visible
>>>
>>> Are we planning to change these also..
>
>> I think if they are SQL-callable functions that users can invoke from
>> a SQL prompt, they shouldn't throw "cache lookup failed" errors or,
>> for that matter, any other error that is reported with elog().
>
> FWIW, I would restrict that to "functions that users are *intended* to
> call from a SQL prompt".  If you are fooling around calling internal
> functions with garbage input, you should not be surprised to get an
> internal error back.  So of the above list, only pg_procedure_is_visible
> seems particularly worth changing to me.  And for it, returning NULL
> (ie, "unknown") seems reasonable enough.

I think that's just making life difficult.  If nothing else, sqlsmith
hunts around for functions it can call that return internal errors,
and if we refuse to fix all of them to return user-facing errors, then
it's just crap for the people running sqlsmith to sift through and
it's a judgment call whether to fix each particular case.  Even aside
from that, I think it's much better to have a clear and unambiguous
rule that elog is only for can't-happen things, not
we-don't-recommend-it things.

> There's no such animal as pg_procedure_is_visible.  I suspect that's an
> EDB-ism that hasn't caught up with commit 66bb74dbe8206a35.

Yep.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-02 Thread Tom Lane
Robert Haas  writes:
> On Tue, Aug 2, 2016 at 6:03 AM, Dilip Kumar  wrote:
>> There are many more such exposed functions, which can throw cache lookup
>> failure error if we pass wrong value.
>> 
>> i.e.
>> record_in
>> domain_in
>> fmgr_c_validator
>> plpgsql_validator
>> pg_procedure_is_visible
>> 
>> Are we planning to change these also..

> I think if they are SQL-callable functions that users can invoke from
> a SQL prompt, they shouldn't throw "cache lookup failed" errors or,
> for that matter, any other error that is reported with elog().

FWIW, I would restrict that to "functions that users are *intended* to
call from a SQL prompt".  If you are fooling around calling internal
functions with garbage input, you should not be surprised to get an
internal error back.  So of the above list, only pg_procedure_is_visible
seems particularly worth changing to me.  And for it, returning NULL
(ie, "unknown") seems reasonable enough.

Actually, it looks to me like we already fixed that for the built-in
is_visible functions:

# select pg_function_is_visible(0) is null;
 ?column? 
--
 t
(1 row)

There's no such animal as pg_procedure_is_visible.  I suspect that's an
EDB-ism that hasn't caught up with commit 66bb74dbe8206a35.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-02 Thread Robert Haas
On Tue, Aug 2, 2016 at 6:03 AM, Dilip Kumar  wrote:
> There are many more such exposed functions, which can throw cache lookup
> failure error if we pass wrong value.
>
> i.e.
> record_in
> domain_in
> fmgr_c_validator
> plpgsql_validator
> pg_procedure_is_visible
>
> Are we planning to change these also..

I think if they are SQL-callable functions that users can invoke from
a SQL prompt, they shouldn't throw "cache lookup failed" errors or,
for that matter, any other error that is reported with elog().  Now, I
don't necessarily think that these functions should return NULL in
that case the way we did with the others; that's not a sensible
behavior for a type-input function AFAIK.  But we should emit a better
error.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-02 Thread Dilip Kumar
On Tue, Aug 2, 2016 at 3:33 PM, Dilip Kumar  wrote:

> There are many more such exposed functions, which can throw cache lookup
> failure error if we pass wrong value.
>
> i.e.
> record_in
> domain_in
> fmgr_c_validator
> edb_get_objecttypeheaddef
> plpgsql_validator
> pg_procedure_is_visible
>
> Are we planning to change these also..
>
> below query is one example (from sqlsmith).
>
> ERROR:  cache lookup failed for procedure 0
>
> postgres=# select
>
> postgres-# pg_catalog.record_in(
>
> postgres(#   cast(pg_catalog.numerictypmodout(
>
> postgres(# cast(98 as integer)) as cstring),
>
> postgres(#   cast(1 as oid),
>
> postgres(#   cast(35 as integer));
>
> ERROR:  cache lookup failed for type 1
>
By mistake I mentioned edb_get_objecttypeheaddef function also, please
ignore that.

So actual list is as below..

record_in
domain_in
fmgr_c_validator
plpgsql_validator
pg_procedure_is_visible

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-02 Thread Dilip Kumar
On Sat, Jul 30, 2016 at 4:27 AM, Michael Paquier 
wrote:

> OK for me. Thanks for the commit.


There are many more such exposed functions, which can throw cache lookup
failure error if we pass wrong value.

i.e.
record_in
domain_in
fmgr_c_validator
edb_get_objecttypeheaddef
plpgsql_validator
pg_procedure_is_visible

Are we planning to change these also..

below query is one example (from sqlsmith).

ERROR:  cache lookup failed for procedure 0

postgres=# select

postgres-# pg_catalog.record_in(

postgres(#   cast(pg_catalog.numerictypmodout(

postgres(# cast(98 as integer)) as cstring),

postgres(#   cast(1 as oid),

postgres(#   cast(35 as integer));

ERROR:  cache lookup failed for type 1


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-07-29 Thread Michael Paquier
On Sat, Jul 30, 2016 at 1:17 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Jul 26, 2016 at 9:27 PM, Michael Paquier
>>  wrote:
>>> While looking at the series of functions pg_get_*, I have noticed as
>>> well that pg_get_userbyid() returns "unknown (OID=%u)" when it does
>>> not know a user. Perhaps we'd want to change that to NULL for
>>> consistency with the rest?
>
>> That's probably correct in theory, but it's a little less closely
>> related, and I'm not entirely sure how far we want to go with this.
>> Remember, the original purpose was to avoid having an internal error
>> (cache lookup failed, XX000) exposed as a user-visible error message.
>> Are we at risk from veering from actual bug-fixing off into useless
>> tinkering?  Not sure.
>
> I'd vote for leaving that one alone; yeah, it's a bit inconsistent
> now, but no one has complained about its behavior.

OK for me. Thanks for the commit.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-07-29 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jul 26, 2016 at 9:27 PM, Michael Paquier
>  wrote:
>> While looking at the series of functions pg_get_*, I have noticed as
>> well that pg_get_userbyid() returns "unknown (OID=%u)" when it does
>> not know a user. Perhaps we'd want to change that to NULL for
>> consistency with the rest?

> That's probably correct in theory, but it's a little less closely
> related, and I'm not entirely sure how far we want to go with this.
> Remember, the original purpose was to avoid having an internal error
> (cache lookup failed, XX000) exposed as a user-visible error message.
> Are we at risk from veering from actual bug-fixing off into useless
> tinkering?  Not sure.

I'd vote for leaving that one alone; yeah, it's a bit inconsistent
now, but no one has complained about its behavior.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-07-29 Thread Robert Haas
On Tue, Jul 26, 2016 at 9:27 PM, Michael Paquier
 wrote:
> On Wed, Jul 27, 2016 at 5:11 AM, Robert Haas  wrote:
>> Committed with minor kibitizing: you don't need an "else" after a
>> statement that transfers control out of the function.
>
> Thanks. Right, I forgot that.
>
>> Shouldn't
>> pg_get_function_arguments, pg_get_function_identity_arguments,
>> pg_get_function_result, and pg_get_function_arg_default get the same
>> treatment?
>
> Changing all of them make sense. Please see attached.

Committed.

> While looking at the series of functions pg_get_*, I have noticed as
> well that pg_get_userbyid() returns "unknown (OID=%u)" when it does
> not know a user. Perhaps we'd want to change that to NULL for
> consistency with the rest?

That's probably correct in theory, but it's a little less closely
related, and I'm not entirely sure how far we want to go with this.
Remember, the original purpose was to avoid having an internal error
(cache lookup failed, XX000) exposed as a user-visible error message.
Are we at risk from veering from actual bug-fixing off into useless
tinkering?  Not sure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-07-26 Thread Michael Paquier
On Wed, Jul 27, 2016 at 5:11 AM, Robert Haas  wrote:
> Committed with minor kibitizing: you don't need an "else" after a
> statement that transfers control out of the function.

Thanks. Right, I forgot that.

> Shouldn't
> pg_get_function_arguments, pg_get_function_identity_arguments,
> pg_get_function_result, and pg_get_function_arg_default get the same
> treatment?

Changing all of them make sense. Please see attached.

While looking at the series of functions pg_get_*, I have noticed as
well that pg_get_userbyid() returns "unknown (OID=%u)" when it does
not know a user. Perhaps we'd want to change that to NULL for
consistency with the rest?
-- 
Michael
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 2915e21..51d0c23 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -2238,11 +2238,11 @@ pg_get_function_arguments(PG_FUNCTION_ARGS)
 	StringInfoData buf;
 	HeapTuple	proctup;
 
-	initStringInfo();
-
 	proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
 	if (!HeapTupleIsValid(proctup))
-		elog(ERROR, "cache lookup failed for function %u", funcid);
+		PG_RETURN_NULL();
+
+	initStringInfo();
 
 	(void) print_function_arguments(, proctup, false, true);
 
@@ -2264,11 +2264,11 @@ pg_get_function_identity_arguments(PG_FUNCTION_ARGS)
 	StringInfoData buf;
 	HeapTuple	proctup;
 
-	initStringInfo();
-
 	proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
 	if (!HeapTupleIsValid(proctup))
-		elog(ERROR, "cache lookup failed for function %u", funcid);
+		PG_RETURN_NULL();
+
+	initStringInfo();
 
 	(void) print_function_arguments(, proctup, false, false);
 
@@ -2289,11 +2289,11 @@ pg_get_function_result(PG_FUNCTION_ARGS)
 	StringInfoData buf;
 	HeapTuple	proctup;
 
-	initStringInfo();
-
 	proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
 	if (!HeapTupleIsValid(proctup))
-		elog(ERROR, "cache lookup failed for function %u", funcid);
+		PG_RETURN_NULL();
+
+	initStringInfo();
 
 	print_function_rettype(, proctup);
 
@@ -2547,7 +2547,7 @@ pg_get_function_arg_default(PG_FUNCTION_ARGS)
 
 	proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
 	if (!HeapTupleIsValid(proctup))
-		elog(ERROR, "cache lookup failed for function %u", funcid);
+		PG_RETURN_NULL();
 
 	numargs = get_func_arg_info(proctup, , , );
 	if (nth_arg < 1 || nth_arg > numargs || !is_input_argument(nth_arg - 1, argmodes))
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 7c633ac..c5ff318 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -3094,3 +3094,33 @@ SELECT pg_get_viewdef(0);
  
 (1 row)
 
+SELECT pg_get_function_arguments(0);
+ pg_get_function_arguments 
+---
+ 
+(1 row)
+
+SELECT pg_get_function_identity_arguments(0);
+ pg_get_function_identity_arguments 
+
+ 
+(1 row)
+
+SELECT pg_get_function_result(0);
+ pg_get_function_result 
+
+ 
+(1 row)
+
+SELECT pg_get_function_arg_default(0, 0);
+ pg_get_function_arg_default 
+-
+ 
+(1 row)
+
+SELECT pg_get_function_arg_default('pg_class'::regclass, 0);
+ pg_get_function_arg_default 
+-
+ 
+(1 row)
+
diff --git a/src/test/regress/sql/rules.sql b/src/test/regress/sql/rules.sql
index 111c9ba..835945f 100644
--- a/src/test/regress/sql/rules.sql
+++ b/src/test/regress/sql/rules.sql
@@ -1152,3 +1152,8 @@ SELECT pg_get_indexdef(0);
 SELECT pg_get_ruledef(0);
 SELECT pg_get_triggerdef(0);
 SELECT pg_get_viewdef(0);
+SELECT pg_get_function_arguments(0);
+SELECT pg_get_function_identity_arguments(0);
+SELECT pg_get_function_result(0);
+SELECT pg_get_function_arg_default(0, 0);
+SELECT pg_get_function_arg_default('pg_class'::regclass, 0);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-07-26 Thread Robert Haas
On Mon, Jun 6, 2016 at 3:30 AM, Michael Paquier
 wrote:
> On Sun, Jun 5, 2016 at 11:16 PM, Tom Lane  wrote:
>> Michael Paquier  writes:
>>> Still, on back branches I think that it would be a good idea to have a
>>> better error message for the ones that already throw an error, like
>>> "trigger with OID %u does not exist". Switching from elog to ereport
>>> would be a good idea, but wouldn't it be a problem to change what is
>>> user-visible?
>>
>> If we're going to touch this behavior in the back branches, I would
>> vote for changing it the same as we do in HEAD.  I do not think that
>> making a user-visible behavior change in a minor release and then a
>> different change in the next major is good strategy.
>
> So, I have finished with the patch attached that changes all the
> *def() functions to return NULL when an object does not exist. Some
> callers of the index and constraint definitions still expect a cache
> lookup error if the object does not exist, and this patch is careful
> about that. I think that it would be nice to get that in 9.6, but I
> won't fight hard for it either. So I am parking it for now in the
> upcoming CF.
>
>> But, given the relative shortage of complaints from the field, I'd
>> be more inclined to just do nothing in back branches.  There might
>> be people out there with code depending on the current behavior,
>> and they'd be annoyed if we change it in a minor release.
>
> Sure. That's the safest position. Thinking about the existing behavior
> for some of the index and constraint callers, even just changing the
> error message does not bring much as some of them really expect to
> fail with a cache lookup error.

Committed with minor kibitizing: you don't need an "else" after a
statement that transfers control out of the function.  Shouldn't
pg_get_function_arguments, pg_get_function_identity_arguments,
pg_get_function_result, and pg_get_function_arg_default get the same
treatment?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-06-06 Thread Michael Paquier
On Sun, Jun 5, 2016 at 11:16 PM, Tom Lane  wrote:
> Michael Paquier  writes:
>> Still, on back branches I think that it would be a good idea to have a
>> better error message for the ones that already throw an error, like
>> "trigger with OID %u does not exist". Switching from elog to ereport
>> would be a good idea, but wouldn't it be a problem to change what is
>> user-visible?
>
> If we're going to touch this behavior in the back branches, I would
> vote for changing it the same as we do in HEAD.  I do not think that
> making a user-visible behavior change in a minor release and then a
> different change in the next major is good strategy.

So, I have finished with the patch attached that changes all the
*def() functions to return NULL when an object does not exist. Some
callers of the index and constraint definitions still expect a cache
lookup error if the object does not exist, and this patch is careful
about that. I think that it would be nice to get that in 9.6, but I
won't fight hard for it either. So I am parking it for now in the
upcoming CF.

> But, given the relative shortage of complaints from the field, I'd
> be more inclined to just do nothing in back branches.  There might
> be people out there with code depending on the current behavior,
> and they'd be annoyed if we change it in a minor release.

Sure. That's the safest position. Thinking about the existing behavior
for some of the index and constraint callers, even just changing the
error message does not bring much as some of them really expect to
fail with a cache lookup error.
-- 
Michael
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 8cb3075..0651cb2 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -314,9 +314,9 @@ static char *pg_get_ruledef_worker(Oid ruleoid, int prettyFlags);
 static char *pg_get_indexdef_worker(Oid indexrelid, int colno,
 	   const Oid *excludeOps,
 	   bool attrsOnly, bool showTblSpc,
-	   int prettyFlags);
+	   int prettyFlags, bool missing_ok);
 static char *pg_get_constraintdef_worker(Oid constraintId, bool fullCommand,
-			int prettyFlags);
+			int prettyFlags, bool missing_ok);
 static text *pg_get_expr_worker(text *expr, Oid relid, const char *relname,
    int prettyFlags);
 static int print_function_arguments(StringInfo buf, HeapTuple proctup,
@@ -466,9 +466,16 @@ pg_get_ruledef(PG_FUNCTION_ARGS)
 {
 	Oid			ruleoid = PG_GETARG_OID(0);
 	int			prettyFlags;
+	char	   *res;
 
 	prettyFlags = PRETTYFLAG_INDENT;
-	PG_RETURN_TEXT_P(string_to_text(pg_get_ruledef_worker(ruleoid, prettyFlags)));
+
+	res = pg_get_ruledef_worker(ruleoid, prettyFlags);
+
+	if (res == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_TEXT_P(string_to_text(res));
 }
 
 
@@ -478,9 +485,16 @@ pg_get_ruledef_ext(PG_FUNCTION_ARGS)
 	Oid			ruleoid = PG_GETARG_OID(0);
 	bool		pretty = PG_GETARG_BOOL(1);
 	int			prettyFlags;
+	char	   *res;
 
 	prettyFlags = pretty ? PRETTYFLAG_PAREN | PRETTYFLAG_INDENT : PRETTYFLAG_INDENT;
-	PG_RETURN_TEXT_P(string_to_text(pg_get_ruledef_worker(ruleoid, prettyFlags)));
+
+	res = pg_get_ruledef_worker(ruleoid, prettyFlags);
+
+	if (res == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_TEXT_P(string_to_text(res));
 }
 
 
@@ -532,7 +546,12 @@ pg_get_ruledef_worker(Oid ruleoid, int prettyFlags)
 	if (spirc != SPI_OK_SELECT)
 		elog(ERROR, "failed to get pg_rewrite tuple for rule %u", ruleoid);
 	if (SPI_processed != 1)
-		appendStringInfoChar(, '-');
+	{
+		/*
+		 * There is no tuple data available here, just keep the output buffer
+		 * empty.
+		 */
+	}
 	else
 	{
 		/*
@@ -549,6 +568,9 @@ pg_get_ruledef_worker(Oid ruleoid, int prettyFlags)
 	if (SPI_finish() != SPI_OK_FINISH)
 		elog(ERROR, "SPI_finish failed");
 
+	if (buf.len == 0)
+		return NULL;
+
 	return buf.data;
 }
 
@@ -564,9 +586,16 @@ pg_get_viewdef(PG_FUNCTION_ARGS)
 	/* By OID */
 	Oid			viewoid = PG_GETARG_OID(0);
 	int			prettyFlags;
+	char	   *res;
 
 	prettyFlags = PRETTYFLAG_INDENT;
-	PG_RETURN_TEXT_P(string_to_text(pg_get_viewdef_worker(viewoid, prettyFlags, WRAP_COLUMN_DEFAULT)));
+
+	res = pg_get_viewdef_worker(viewoid, prettyFlags, WRAP_COLUMN_DEFAULT);
+
+	if (res == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_TEXT_P(string_to_text(res));
 }
 
 
@@ -577,9 +606,16 @@ pg_get_viewdef_ext(PG_FUNCTION_ARGS)
 	Oid			viewoid = PG_GETARG_OID(0);
 	bool		pretty = PG_GETARG_BOOL(1);
 	int			prettyFlags;
+	char	   *res;
 
 	prettyFlags = pretty ? PRETTYFLAG_PAREN | PRETTYFLAG_INDENT : PRETTYFLAG_INDENT;
-	PG_RETURN_TEXT_P(string_to_text(pg_get_viewdef_worker(viewoid, prettyFlags, WRAP_COLUMN_DEFAULT)));
+
+	res = pg_get_viewdef_worker(viewoid, prettyFlags, WRAP_COLUMN_DEFAULT);
+
+	if (res == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_TEXT_P(string_to_text(res));
 }
 
 Datum
@@ -589,10 +625,17 @@ pg_get_viewdef_wrap(PG_FUNCTION_ARGS)
 	Oid			viewoid = PG_GETARG_OID(0);
 	int			wrap = PG_GETARG_INT32(1);
 	int	

Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-06-05 Thread Tom Lane
Michael Paquier  writes:
> Still, on back branches I think that it would be a good idea to have a
> better error message for the ones that already throw an error, like
> "trigger with OID %u does not exist". Switching from elog to ereport
> would be a good idea, but wouldn't it be a problem to change what is
> user-visible?

If we're going to touch this behavior in the back branches, I would
vote for changing it the same as we do in HEAD.  I do not think that
making a user-visible behavior change in a minor release and then a
different change in the next major is good strategy.

But, given the relative shortage of complaints from the field, I'd
be more inclined to just do nothing in back branches.  There might
be people out there with code depending on the current behavior,
and they'd be annoyed if we change it in a minor release.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-06-05 Thread Michael Paquier
On Sun, Jun 5, 2016 at 4:28 PM, Michael Paquier
 wrote:
> On Sun, Jun 5, 2016 at 12:29 AM, Tom Lane  wrote:
>> Michael Paquier  writes:
>>> This is still failing:
>>> =# select indexdef from pg_catalog.pg_indexes where indexdef is not NULL;
>>> ERROR:  XX000: cache lookup failed for index 2619
>>> LOCATION:  pg_get_indexdef_worker, ruleutils.c:1054
>>> Do we want to improve at least the error message?
>>
>> Maybe we should just make the function silently return NULL if the OID
>> doesn't refer to an index relation.  There's precedent for that sort
>> of behavior ...
>
> For views it is simply returned "Not a view", and for rules that's
> "-". All the others behave like similarly to indexes:
> =# select pg_get_constraintdef(0);
> ERROR:  XX000: cache lookup failed for constraint 0
> =# select pg_get_triggerdef(0);
> ERROR:  XX000: could not find tuple for trigger 0
> =# select pg_get_functiondef(0);
> ERROR:  XX000: cache lookup failed for function 0
> =# select pg_get_viewdef(0);
>  pg_get_viewdef
> 
>  Not a view
> (1 row)
> =# select pg_get_ruledef(0);
>  pg_get_ruledef
> 
>  -
> (1 row)
>
> So yes we could just return NULL for all of them, or make them throw
> an error. Anyway, my vote goes for a HEAD-only change, and just throw
> NULL... This way an application still has the option to fallback to
> something else with a CASE at SQL level without using plpgsql to catch
> the error.
>
> Also, I have not monitored the code much regarding some code paths
> that rely on the existing rules, but I would imagine that there are
> things depending on the current behavior as well.

Still, on back branches I think that it would be a good idea to have a
better error message for the ones that already throw an error, like
"trigger with OID %u does not exist". Switching from elog to ereport
would be a good idea, but wouldn't it be a problem to change what is
user-visible?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-06-05 Thread Michael Paquier
On Sun, Jun 5, 2016 at 12:29 AM, Tom Lane  wrote:
> Michael Paquier  writes:
>> This is still failing:
>> =# select indexdef from pg_catalog.pg_indexes where indexdef is not NULL;
>> ERROR:  XX000: cache lookup failed for index 2619
>> LOCATION:  pg_get_indexdef_worker, ruleutils.c:1054
>> Do we want to improve at least the error message?
>
> Maybe we should just make the function silently return NULL if the OID
> doesn't refer to an index relation.  There's precedent for that sort
> of behavior ...

For views it is simply returned "Not a view", and for rules that's
"-". All the others behave like similarly to indexes:
=# select pg_get_constraintdef(0);
ERROR:  XX000: cache lookup failed for constraint 0
=# select pg_get_triggerdef(0);
ERROR:  XX000: could not find tuple for trigger 0
=# select pg_get_functiondef(0);
ERROR:  XX000: cache lookup failed for function 0
=# select pg_get_viewdef(0);
 pg_get_viewdef

 Not a view
(1 row)
=# select pg_get_ruledef(0);
 pg_get_ruledef

 -
(1 row)

So yes we could just return NULL for all of them, or make them throw
an error. Anyway, my vote goes for a HEAD-only change, and just throw
NULL... This way an application still has the option to fallback to
something else with a CASE at SQL level without using plpgsql to catch
the error.

Also, I have not monitored the code much regarding some code paths
that rely on the existing rules, but I would imagine that there are
things depending on the current behavior as well.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-06-04 Thread Tom Lane
Michael Paquier  writes:
> This is still failing:
> =# select indexdef from pg_catalog.pg_indexes where indexdef is not NULL;
> ERROR:  XX000: cache lookup failed for index 2619
> LOCATION:  pg_get_indexdef_worker, ruleutils.c:1054
> Do we want to improve at least the error message?

Maybe we should just make the function silently return NULL if the OID
doesn't refer to an index relation.  There's precedent for that sort
of behavior ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-06-04 Thread Michael Paquier
On Fri, Aug 7, 2015 at 9:21 AM, Tom Lane  wrote:
> Andreas Seltenreich  writes:
>> Tom Lane writes:
>>> On 08/01/2015 05:59 PM, Tom Lane wrote:
 Well, I certainly think all of these represent bugs:
 1 | ERROR:  could not find pathkey item to sort
>
>>> Hmm ... I see no error with these queries as of today's HEAD or
>>> back-branch tips.  I surmise that this was triggered by one of the other
>>> recently-fixed bugs, though the connection isn't obvious offhand.
>
>> I still see this error in master as of b8cbe43, but the queries are
>> indeed a pita to reproduce.  The one below is the only one so far that
>> is robust against running ANALYZE on the regression db, and also
>> reproduces when I run it as an EXTRA_TEST with make check.
>
> Got that one, thanks!  But we've seen this error message before in all
> sorts of weird situations, so I wouldn't assume that all the cases you've
> come across had the same cause.

(resurrecting an old thread)

This is still failing:
=# select indexdef from pg_catalog.pg_indexes where indexdef is not NULL;
ERROR:  XX000: cache lookup failed for index 2619
LOCATION:  pg_get_indexdef_worker, ruleutils.c:1054
Do we want to improve at least the error message?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-06 Thread Tom Lane
Andreas Seltenreich seltenre...@gmx.de writes:
 Tom Lane writes:
 On 08/01/2015 05:59 PM, Tom Lane wrote:
 Well, I certainly think all of these represent bugs:
 1 | ERROR:  could not find pathkey item to sort

 Hmm ... I see no error with these queries as of today's HEAD or
 back-branch tips.  I surmise that this was triggered by one of the other
 recently-fixed bugs, though the connection isn't obvious offhand.

 I still see this error in master as of b8cbe43, but the queries are
 indeed a pita to reproduce.  The one below is the only one so far that
 is robust against running ANALYZE on the regression db, and also
 reproduces when I run it as an EXTRA_TEST with make check.

Got that one, thanks!  But we've seen this error message before in all
sorts of weird situations, so I wouldn't assume that all the cases you've
come across had the same cause.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-05 Thread Tom Lane
Piotr Stefaniak postg...@piotr-stefaniak.me writes:
 On 08/05/2015 02:24 AM, Tom Lane wrote:
 Piotr Stefaniak postg...@piotr-stefaniak.me writes:
 Set join_collapse_limit = 32 and you should see the error.

 Ah ... now I get that error on the smaller query, but the larger one
 (that you put in an attachment) still doesn't show any problem.
 Do you have any other nondefault settings?

 Sorry, that needs from_collapse_limit = 32 as well.

Yeah, I assumed as much, but it still doesn't happen for me.  Possibly
something platform-dependent in statistics, or something like that.

Anyway, I fixed the problem exposed by the smaller query; would you
check that the larger one is okay for you now?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-05 Thread Andreas Seltenreich
Tom Lane writes:

 On 08/01/2015 05:59 PM, Tom Lane wrote:
 Well, I certainly think all of these represent bugs:
 1 | ERROR:  could not find pathkey item to sort

 Hmm ... I see no error with these queries as of today's HEAD or
 back-branch tips.  I surmise that this was triggered by one of the other
 recently-fixed bugs, though the connection isn't obvious offhand.

I still see this error in master as of b8cbe43, but the queries are
indeed a pita to reproduce.  The one below is the only one so far that
is robust against running ANALYZE on the regression db, and also
reproduces when I run it as an EXTRA_TEST with make check.

regards,
Andreas

select
  rel_217088662.a as c0,
  rel_217088554.a as c1,
  rel_217088662.b as c2,
  subq_34235266.c0 as c3,
  rel_217088660.id2 as c4,
  rel_217088660.id2 as c5
from
  public.clstr_tst as rel_217088554
inner join (select
   rel_217088628.a as c0
 from
   public.rtest_vview3 as rel_217088628
 where (rel_217088628.b !~ rel_217088628.b)
   and rel_217088628.b ~~* rel_217088628.b)
 or (rel_217088628.b ~* rel_217088628.b))
   or (rel_217088628.b  rel_217088628.b))
 or (rel_217088628.b = rel_217088628.b))) as subq_34235266
 inner join public.num_exp_mul as rel_217088660
   inner join public.onek2 as rel_217088661
   on (rel_217088660.id1 = rel_217088661.unique1 )
 on (subq_34235266.c0 = rel_217088660.id1 )
  inner join public.main_table as rel_217088662
  on (rel_217088661.unique2 = rel_217088662.a )
on (rel_217088554.b = rel_217088660.id1 )
where rel_217088554.d = rel_217088554.c
fetch first 94 rows only;


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-04 Thread Tom Lane
Piotr Stefaniak postg...@piotr-stefaniak.me writes:
 On 08/03/2015 09:18 PM, Tom Lane wrote:
 ... but I can't reproduce it on HEAD with either of these queries.
 Not clear why you're getting different results.

 I'm terribly sorry, but I didn't notice that postgresql.conf was modified...
 Set join_collapse_limit = 32 and you should see the error.

Ah ... now I get that error on the smaller query, but the larger one
(that you put in an attachment) still doesn't show any problem.
Do you have any other nondefault settings?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-04 Thread Ewan Higgs
Hi there,I've been following the sqlsmith work and wanted to jump in and try it 
out. I took Peter's idea and tried building postgres with the flags suggested 
but it was hard to get anything working. 

I'm on commit 85e5e222b1dd02f135a8c3bf387d0d6d88e669bd (Tue Aug 4 14:55:32 2015 
-0400)
Configure arguments:./configure --prefix=$HOME/pkg CC=clang CFLAGS='-O1 -g 
-fsanitize=address -fno-omit-frame-pointer -fno-optimize-sibling-calls' 
--enable-cassert

I had to make a simple leak suppression file:
$ cat leak.supp
leak:save_ps_display_args
leak:__GI___strdup
$ export LSAN_OPTIONS=suppressions=leak.supp
And then I could run postgres. After 50,000 queries, I'm left with the 
following report:

queries: 50514
AST stats (avg): height = 7.29877 nodes = 37.8156
296    ERROR:  canceling statement due to statement timeout
166    ERROR:  invalid regular expression: quantifier operand invalid
26     ERROR:  could not determine which collation to use for string comparison
23     ERROR:  cannot compare arrays of different element types
12     ERROR:  invalid regular expression: brackets [] not balanced
5      ERROR:  cache lookup failed for index 2619
2      ERROR:  invalid regular expression: parentheses () not balanced
error rate: 0.0104921

AddressSanitizer didn't fire except for the suppressed leaks. The suppressed 
leaks were only hit at the beginning:
-
Suppressions used:
  count  bytes template
  1    520 save_ps_display_args
  1 10 __GI___strdup
-


sqlsmith is a cool little piece of kit and I see a lot of room for on going 
work (performance bumps for more queries per second; more db back ends; 
different fuzzers). 

Yours,Ewan Higgs

 

 From: Peter Geoghegan p...@heroku.com
 To: Andreas Seltenreich seltenre...@gmx.de 
Cc: Pg Hackers pgsql-hackers@postgresql.org 
 Sent: Sunday, 2 August 2015, 10:39
 Subject: Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
   
On Fri, Jul 31, 2015 at 5:56 PM, Andreas Seltenreich seltenre...@gmx.de wrote:
 sqlsmith triggered the following assertion in master (c188204).

Thanks for writing sqlsmith. It seems like a great tool.

I wonder, are you just running the tool with assertions enabled when
PostgreSQL is built? If so, it might make sense to make various
problems more readily detected. As you may know, Clang has a pretty
decent option called AddressSanitizer that can detect memory errors as
they occur with an overhead that is not excessive. One might use the
following configure arguments when building PostgreSQL to use
AddressSanitizer:

./configure CC=clang CFLAGS='-O1 -g -fsanitize=address
-fno-omit-frame-pointer -fno-optimize-sibling-calls' --enable-cassert

Of course, it remains to be seen if this pays for itself. Apparently
the tool has about a 2x overhead [1]. I'm really not sure that you'll
find any more bugs this way, but it's certainly possible that you'll
find a lot more. Given your success in finding bugs without using
AddressSanitizer, introducing it may be premature.

[1] http://clang.llvm.org/docs/AddressSanitizer.html#introduction
-- 
Peter Geoghegan




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  

Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-03 Thread Andreas Seltenreich
Peter Geoghegan writes:

 On Fri, Jul 31, 2015 at 5:56 PM, Andreas Seltenreich seltenre...@gmx.de 
 wrote:
 sqlsmith triggered the following assertion in master (c188204).

 Thanks for writing sqlsmith. It seems like a great tool.

 I wonder, are you just running the tool with assertions enabled when
 PostgreSQL is built?

Right.  I have to admit my testing setup is still more tailored towards
testing sqlsmith than postgres.

 If so, it might make sense to make various problems more readily
 detected.  As you may know, Clang has a pretty decent option called
 AddressSanitizer that can detect memory errors as they occur with an
 overhead that is not excessive.

I didn't known this clang feature yet, thanks for pointing it out.  I
considered running some instances under valgrind to detect these, but
the performance penalty seemed not worth it.

 One might use the following configure arguments when building
 PostgreSQL to use AddressSanitizer:

 ./configure CC=clang CFLAGS='-O1 -g -fsanitize=address
 -fno-omit-frame-pointer -fno-optimize-sibling-calls' --enable-cassert

A quick attempt to sneak these in made my ansible playbooks unhappy due
to make check failures and other generated noise.  I'll try to have an
instance with the AddressSanitizer active soon though.

 Of course, it remains to be seen if this pays for itself. Apparently
 the tool has about a 2x overhead [1]. I'm really not sure that you'll
 find any more bugs this way, but it's certainly possible that you'll
 find a lot more. Given your success in finding bugs without using
 AddressSanitizer, introducing it may be premature.

Piotr also suggested on IRC to run coverage tests w/ sqlsmith.  This
could yield valuable hints in which direction to extend sqlsmith's
grammar.

Thanks,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-03 Thread Tom Lane
Piotr Stefaniak postg...@piotr-stefaniak.me writes:
 How about this one?
 1 ERROR:  could not find RelOptInfo for given relids

That would be a bug, for sure ...

 It's triggered on 13bba02271dce865cd20b6f49224889c73fed4e7 by this query 
 and the attached one:

... but I can't reproduce it on HEAD with either of these queries.
Not clear why you're getting different results.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-02 Thread Peter Geoghegan
On Fri, Jul 31, 2015 at 5:56 PM, Andreas Seltenreich seltenre...@gmx.de wrote:
 sqlsmith triggered the following assertion in master (c188204).

Thanks for writing sqlsmith. It seems like a great tool.

I wonder, are you just running the tool with assertions enabled when
PostgreSQL is built? If so, it might make sense to make various
problems more readily detected. As you may know, Clang has a pretty
decent option called AddressSanitizer that can detect memory errors as
they occur with an overhead that is not excessive. One might use the
following configure arguments when building PostgreSQL to use
AddressSanitizer:

./configure CC=clang CFLAGS='-O1 -g -fsanitize=address
-fno-omit-frame-pointer -fno-optimize-sibling-calls' --enable-cassert

Of course, it remains to be seen if this pays for itself. Apparently
the tool has about a 2x overhead [1]. I'm really not sure that you'll
find any more bugs this way, but it's certainly possible that you'll
find a lot more. Given your success in finding bugs without using
AddressSanitizer, introducing it may be premature.

[1] http://clang.llvm.org/docs/AddressSanitizer.html#introduction
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-02 Thread Tom Lane
Piotr Stefaniak postg...@piotr-stefaniak.me writes:
 On 08/01/2015 05:59 PM, Tom Lane wrote:
 Well, I certainly think all of these represent bugs:
 1 | ERROR:  could not find pathkey item to sort

 sqlsmith was able to find these two queries that trigger the error on my 
 machine:

Hmm ... I see no error with these queries as of today's HEAD or
back-branch tips.  I surmise that this was triggered by one of the other
recently-fixed bugs, though the connection isn't obvious offhand.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-01 Thread Andreas Seltenreich
Tom Lane writes:

 Well, I certainly think all of these represent bugs:

 [...]

thanks for priorizing them.  I'll try to digest them somewhat before
posting.

 This one's pretty darn odd, because 2619 is pg_statistic and not an index
 at all:

  4 | ERROR:  cache lookup failed for index 2619

This is actually the one from the README :-).  Quoting to spare media
discontinuity:

--8---cut here---start-8---
Taking a closer look at it reveals that it happens when you query a
certain catalog view like this:

  self=# select indexdef from pg_catalog.pg_indexes where indexdef is not NULL;
  FEHLER:  cache lookup failed for index 2619

This is because the planner then puts pg_get_indexdef(oid) in a context
where it sees non-index-oids, which causes it to croak:

 QUERY PLAN

 Hash Join  (cost=17.60..30.65 rows=9 width=4)
   Hash Cond: (i.oid = x.indexrelid)
   -  Seq Scan on pg_class i  (cost=0.00..12.52 rows=114 width=8)
 Filter: ((pg_get_indexdef(oid) IS NOT NULL) AND (relkind = 
'i'::char))
   -  Hash  (cost=17.31..17.31 rows=23 width=4)
 -  Hash Join  (cost=12.52..17.31 rows=23 width=4)
   Hash Cond: (x.indrelid = c.oid)
   -  Seq Scan on pg_index x  (cost=0.00..4.13 rows=113 width=8)
   -  Hash  (cost=11.76..11.76 rows=61 width=8)
 -  Seq Scan on pg_class c  (cost=0.00..11.76 rows=61 
width=8)
   Filter: (relkind = ANY ('{r,m}'::char[]))
--8---cut here---end---8---

thanks,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-01 Thread Tom Lane
Andreas Seltenreich seltenre...@gmx.de writes:
 Tom Lane writes:
 What concerns me more is that what you're finding is only cases that trip
 an assertion sanity check.  It seems likely that you're also managing to
 trigger other bugs with less drastic consequences, such as could not
 devise a query plan failures or just plain wrong answers.

 Ja, some of these are logged as well[1], but most of them are really as
 undrastic as can get, and I was afraid reporting them would be more of a
 nuisance.

Well, I certainly think all of these represent bugs:

  3 | ERROR:  plan should not reference subplan's variable
  2 | ERROR:  failed to assign all NestLoopParams to plan nodes
  1 | ERROR:  could not find pathkey item to sort

This I'm not sure about; it could be that the query gave conflicting
collation specifiers, but on the other hand we've definitely had bugs
with people forgetting to run assign_query_collations on subexpressions:

   4646 | ERROR:  could not determine which collation to use for string 
 comparison

This one's pretty darn odd, because 2619 is pg_statistic and not an index
at all:

  4 | ERROR:  cache lookup failed for index 2619

These seem likely to be bugs as well, though maybe they are race
conditions during a DROP and not worth fixing:

   1171 | ERROR:  cache lookup failed for index 16862
172 | ERROR:  cache lookup failed for index 257148
 84 | ERROR:  could not find member 1(34520,34520) of opfamily 1976
 55 | ERROR:  missing support function 1(34516,34516) in opfamily 1976
 13 | ERROR:  could not find commutator for operator 34538
  2 | ERROR:  cache lookup failed for index 12322

I would say anything of the sort that is repeatable definitely deserves
investigation, because even if it's an expectable error condition, we
should be throwing a more user-friendly error message.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-01 Thread Andreas Seltenreich
Tom Lane writes:

 What concerns me more is that what you're finding is only cases that trip
 an assertion sanity check.  It seems likely that you're also managing to
 trigger other bugs with less drastic consequences, such as could not
 devise a query plan failures or just plain wrong answers.

Ja, some of these are logged as well[1], but most of them are really as
undrastic as can get, and I was afraid reporting them would be more of a
nuisance.  I analysed a couple of the cache lookup failures, and they
all had a similar severreness than the example in the README[2].  The
operator ones I analysed seem due to intentionally broken operators in
the regression db.  The NestLoopParams and subplan reference one sound
interesting though…

 I'm not sure how we could identify wrong answers automatically :-(

Csmith isn't doing this either.  They discuss differential testing
though in their papers, i.e., comparing the results of different
products.  Maybe a simple metric like numbers of rows returned might
already be valuable for correctness checks.

I also thought about doing some sampling on the data and simulating
relational operations and check for witness tuples, but it is probably
not appropriate to start implementing a mini-rdbms on the client side.

 but it might be worth checking for XX000 SQLSTATE responses, since
 generally that should be a can't-happen case.  (Or if it can happen,
 we need to change the errcode.)

The sqlstate is currently missing in the reports because libpqxx is not
putting it in it's exceptions :-/.

regards,
Andreas

Footnotes: 
[1]  smith=# select * from report24h;
 count |  error 
  
---+--
 43831 | ERROR:  unsupported XML feature
 39496 | ERROR:  invalid regular expression: quantifier operand invalid
 27261 | ERROR:  canceling statement due to statement timeout
 21386 | ERROR:  operator does not exist: point = point
  8580 | ERROR:  cannot compare arrays of different element types
  5019 | ERROR:  invalid regular expression: brackets [] not balanced
  4646 | ERROR:  could not determine which collation to use for string 
comparison
  2583 | ERROR:  invalid regular expression: nfa has too many states
  2248 | ERROR:  operator does not exist: xml = xml
  1198 | ERROR:  operator does not exist: polygon = polygon
  1171 | ERROR:  cache lookup failed for index 16862
   677 | ERROR:  invalid regular expression: parentheses () not balanced
   172 | ERROR:  cache lookup failed for index 257148
84 | ERROR:  could not find member 1(34520,34520) of opfamily 1976
55 | ERROR:  missing support function 1(34516,34516) in opfamily 1976
42 | ERROR:  operator does not exist: city_budget = city_budget
13 | ERROR:  could not find commutator for operator 34538
10 | ERROR:  could not identify a comparison function for type xid
 4 | Connection to database failed
 4 | ERROR:  cache lookup failed for index 2619
 3 | ERROR:  plan should not reference subplan's variable
 2 | ERROR:  cache lookup failed for index 12322
 2 | ERROR:  failed to assign all NestLoopParams to plan nodes
 2 | ERROR:  invalid regular expression: invalid character range
 1 | ERROR:  could not find pathkey item to sort
(25 rows)
Time: 1158,990 ms

[2]  https://github.com/anse1/sqlsmith/blob/master/README.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-07-31 Thread Tom Lane
Andreas Seltenreich seltenre...@gmx.de writes:
 sqlsmith triggered the following assertion in master (c188204).

 TRAP: FailedAssertion(!(!bms_overlap(joinrelids, sjinfo-min_lefthand)), 
 File: joinrels.c, Line: 500)

Cool, I'll take a look.

 As usual, the query is against the regression database.  It is rather
 unwieldy… I wonder if I should stop working on new grammar rules and
 instead work on some post-processing that prunes the AST as much as
 possible while maintaining the failure mode.

Probably not really worth the trouble; I find it's usually easy to
produce a minimized test case after the failure cause is understood.

What concerns me more is that what you're finding is only cases that trip
an assertion sanity check.  It seems likely that you're also managing to
trigger other bugs with less drastic consequences, such as could not
devise a query plan failures or just plain wrong answers.  I'm not sure
how we could identify wrong answers automatically :-( but it might be
worth checking for XX000 SQLSTATE responses, since generally that should
be a can't-happen case.  (Or if it can happen, we need to change the
errcode.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-07-31 Thread Andreas Seltenreich
Hi,

sqlsmith triggered the following assertion in master (c188204).

TRAP: FailedAssertion(!(!bms_overlap(joinrelids, sjinfo-min_lefthand)), 
File: joinrels.c, Line: 500)

As usual, the query is against the regression database.  It is rather
unwieldy… I wonder if I should stop working on new grammar rules and
instead work on some post-processing that prunes the AST as much as
possible while maintaining the failure mode.

regards,
andreas

select
  subq_647409.c0 as c0,
  subq_647409.c0 as c1
from
  public.customer as rel_4116461
  left join public.clstr_tst_s as rel_4116555
  left join information_schema.columns as rel_4116556
  on (rel_4116555.rf_a = rel_4116556.ordinal_position )
right join pg_catalog.pg_roles as rel_4116557
on (rel_4116556.maximum_cardinality = rel_4116557.rolconnlimit )
  on (rel_4116461.passwd = rel_4116557.rolpassword )
left join (select
subq_647410.c8 as c0
  from
public.char_tbl as rel_4116611,
lateral (select
  rel_4116612.name as c0,
  rel_4116612.comment as c1,
  rel_4116612.nslots as c2,
  rel_4116612.comment as c3,
  rel_4116612.nslots as c4,
  rel_4116612.nslots as c5,
  rel_4116612.comment as c6,
  rel_4116612.comment as c7,
  rel_4116612.nslots as c8,
  rel_4116612.nslots as c9,
  rel_4116612.nslots as c10
from
  public.hub as rel_4116612
where rel_4116612.comment ~=~ rel_4116612.comment
fetch first 116 rows only) as subq_647410
  where (subq_647410.c7 !~~* subq_647410.c7)
or ((subq_647410.c3 = subq_647410.c3)
  and ((subq_647410.c7 ~* subq_647410.c6)
and (subq_647410.c7 @@ subq_647410.c7)))
  fetch first 152 rows only) as subq_647409
  inner join public.int4_tbl as rel_4116661
  inner join public.shoe as rel_4116662
  on (rel_4116661.f1 = rel_4116662.sh_avail )
inner join public.rtest_vview3 as rel_4116663
on (rel_4116661.f1 = rel_4116663.a )
  on (subq_647409.c0 = rel_4116662.sh_avail )
on (rel_4116555.b = rel_4116661.f1 )
where ((rel_4116557.rolvaliduntil is NULL)
or (rel_4116663.b !~ rel_4116461.name))
  or (rel_4116661.f1 is not NULL)
fetch first 80 rows only;


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers