2013/11/25 Heikki Linnakangas <hei...@localhost.vmware.com>

> On 07.10.2013 17:00, Pavel Stehule wrote:
>
>> Hello
>>
>> I fixed patch - there was a missing cast to domain when it was used (and
>> all regress tests are ok now).
>>
>
> This doesn't work correctly for varlen datatypes. I modified your
> quicksort example to work with text[], and got this:
>

> postgres=#  SELECT array_upper(quicksort(1,20000,array_agg((g::text))),1)
> FROM generate_series(1,20000) g;
> ERROR:  invalid input syntax for integer: " "
> CONTEXT:  PL/pgSQL function quicksort(integer,integer,text[]) line 10 at
> assignment
> PL/pgSQL function quicksort(integer,integer,text[]) line 16 at assignment
>
> The handling of array domains is also wrong. You put the new value into
> the array and only check the domain constraint after that. If the
> constraint is violated, it's too late to roll that back. For example:
>

Both are bugs and should be fixed.


>
> postgres=# create domain posarray as int[] check (value[1] > 0);
> CREATE DOMAIN
> postgres=# do $$
> declare
>   iarr posarray;
> begin
>   begin
>     iarr[1] := 1;
>     iarr[1] := -1;
>   exception when others then raise notice 'got error: %', sqlerrm; end;
>   raise notice 'value: %', iarr[1];
> end;
> $$;
> NOTICE:  got error: value for domain posarray violates check constraint
> "posarray_check"
> NOTICE:  value: -1
> DO
>
> Without the patch, it prints 1, but with the patch, -1.
>
>
> In general, I'm not convinced this patch is worth the trouble. The speedup
> isn't all that great; manipulating large arrays in PL/pgSQL is still so
> slow that if you care about performance you'll want to rewrite your
> function in some other language or use temporary tables. And you only get a
> gain with arrays of fixed-length element type with no NULLs.
>

I understand your objection, but I disagree that a partial solution is not
enough. Updating an array of a fixed-length type should be fast - there is
no workaround for this task in PL/pgSQL, so using an array as a fast cache
for a calculation carries hidden (and sometimes very high) costs. Very
advanced users can switch to a different PL or write a C extension, but
for common users this cost is hidden and unsolvable.
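
To illustrate the pattern I mean, here is a minimal, purely hypothetical
example of an integer array used as a working cache; every element-wise
assignment in the loop currently rebuilds the whole array, so the loop
behaves quadratically:

do $$
declare
  cache int[] := array_fill(0, array[100000]);
begin
  for i in 1..100000 loop
    -- each update copies the whole array today; this copy is
    -- exactly the hidden cost discussed above
    cache[i] := cache[i] + i;
  end loop;
end;
$$;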


>
> So I think we should drop this patch in its current form. If we want to
> make array manipulation in PL/pgSQL faster, I think we'll have to do
> something similar to how we handle "row" variables, or something else
> entirely. But that's a much bigger patch, and I'm not sure it's worth the
> trouble either.
>

Sorry, I disagree again - using the same mechanism for arrays of
fixed-length types and arrays of varlena types would be inefficient. The
slow part of this area is the manipulation of large memory blocks - there
we are fighting the physical limits of the CPU and of memory access.

Now, I know of a few significant PL/pgSQL issues related to the handling
of variables:

* repeated detoasting (see the sketch below)
* missing preallocation
* inefficient manipulation of large arrays
* inefficient casting (I/O casts are used instead of binary casts)
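
As a minimal, purely illustrative sketch of the repeated-detoast issue
(the function and its body are hypothetical, not a real-world case): when
"doc" holds a toasted value, every expression referencing it inside the
loop may have to detoast it again.

create or replace function count_chars(doc text, c text)
returns int language plpgsql as $$
declare
  n int := 0;
begin
  for i in 1..length(doc) loop
    -- substr() receives the (possibly toasted) datum on every
    -- iteration, so the value may be detoasted over and over
    if substr(doc, i, 1) = c then
      n := n + 1;
    end if;
  end loop;
  return n;
end;
$$;

The effect shows up when the function is called with a value that was
fetched from a table and stored compressed or out of line.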

A final solution to these issues requires a redesign of how PL/pgSQL works
with variables - perhaps something in the Perl style. That would be nice,
but I don't know of anybody who will work on it any time soon.

I would like to concentrate on integrating plpgsql_check into core - and
maybe later I can work on this topic (perhaps in two years).

So any partial solution (it does not have to be mine) can serve well for
two or three years. This code is 100% hidden from the user-facing level,
and we can change it safely later, so a partial solution can be enough.

Regards

Pavel


> Looking at the perf profile and the functions involved, though, I see
> some low-hanging fruit: in array_set, the new array is allocated with
> palloc0, but we only need to zero out a few alignment bytes where the new
> value is copied. Replacing it with palloc() will save some cycles, per
> the attached patch. Nowhere near as much as your patch, but this is
> pretty uncontroversial.
>
> - Heikki
>
>
