On Thu, Dec 7, 2023 at 1:05 PM Sutou Kouhei <k...@clear-code.com> wrote:
>
> Hi,
>
> In <CAEG8a3LSRhK601Bn50u71BgfNWm4q3kv-o-KEq=hrbylby_...@mail.gmail.com>
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Wed, 6 Dec 2023 22:07:51 +0800,
>   Junwang Zhao <zhjw...@gmail.com> wrote:
>
> > Should we extract both *copy to* and *copy from* for the first step, in that
> > case we can add the pg_copy_handler catalog smoothly later.
>
> I don't object it (mixing TO/FROM changes to one patch) but
> it may make review difficult. Is it acceptable?
>
> FYI: I planed that I implement TO part, and then FROM part,
> and then unify TO/FROM parts if needed. [1]

I'm fine with step by step refactoring, let's just wait for more
suggestions.

>
> > Attached V4 adds 'extract copy from' and it passed the cirrus ci,
> > please take a look.
>
> Thanks. Here are my comments:
>
> > +             /*
> > +                     * Error is relevant to a particular line.
> > +                     *
> > +                     * If line_buf still contains the correct line, print 
> > it.
> > +                     */
> > +             if (cstate->line_buf_valid)
>
> We need to fix the indentation.
>
> > +CopyFromFormatBinaryStart(CopyFromState cstate, TupleDesc tupDesc)
> > +{
> > +     FmgrInfo   *in_functions;
> > +     Oid                *typioparams;
> > +     Oid                     in_func_oid;
> > +     AttrNumber      num_phys_attrs;
> > +
> > +     /*
> > +      * Pick up the required catalog information for each attribute in the
> > +      * relation, including the input function, the element type (to pass 
> > to
> > +      * the input function), and info about defaults and constraints. 
> > (Which
> > +      * input function we use depends on text/binary format choice.)
> > +      */
> > +     num_phys_attrs = tupDesc->natts;
> > +     in_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo));
> > +     typioparams = (Oid *) palloc(num_phys_attrs * sizeof(Oid));
>
> We need to update the comment because defaults and
> constraints aren't picked up here.
>
> > +CopyFromFormatTextStart(CopyFromState cstate, TupleDesc tupDesc)
> ...
> > +     /*
> > +      * Pick up the required catalog information for each attribute in the
> > +      * relation, including the input function, the element type (to pass 
> > to
> > +      * the input function), and info about defaults and constraints. 
> > (Which
> > +      * input function we use depends on text/binary format choice.)
> > +      */
> > +     in_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo));
> > +     typioparams = (Oid *) palloc(num_phys_attrs * sizeof(Oid));
>
> ditto.
>
>
> > @@ -1716,15 +1776,6 @@ BeginCopyFrom(ParseState *pstate,
> >               ReceiveCopyBinaryHeader(cstate);
> >       }
>
> I think that this block should be moved to
> CopyFromFormatBinaryStart() too. But we need to run it after
> we setup inputs such as data_source_cb, pipe and filename...
>
> +/* Routines for a COPY HANDLER implementation. */
> +typedef struct CopyHandlerOps
> +{
> +       /* Called when COPY TO is started. This will send a header. */
> +       void            (*copy_to_start) (CopyToState cstate, TupleDesc 
> tupDesc);
> +
> +       /* Copy one row for COPY TO. */
> +       void            (*copy_to_one_row) (CopyToState cstate, 
> TupleTableSlot *slot);
> +
> +       /* Called when COPY TO is ended. This will send a trailer. */
> +       void            (*copy_to_end) (CopyToState cstate);
> +
> +       void            (*copy_from_start) (CopyFromState cstate, TupleDesc 
> tupDesc);
> +       bool            (*copy_from_next) (CopyFromState cstate, ExprContext 
> *econtext,
> +                                                                  Datum 
> *values, bool *nulls);
> +       void            (*copy_from_error_callback) (CopyFromState cstate);
> +       void            (*copy_from_end) (CopyFromState cstate);
> +} CopyHandlerOps;
>
> It seems that "copy_" prefix is redundant. Should we use
> "to_start" instead of "copy_to_start" and so on?
>
> BTW, it seems that "COPY FROM (FORMAT json)" may not be implemented. [2]
> We may need to care about NULL copy_from_* cases.
>
>
> > I added a hook *copy_from_end* but this might be removed later if not used.
>
> It may be useful to clean up resources for COPY FROM but the
> patch doesn't call the copy_from_end. How about removing it
> for now? We can add it and call it from EndCopyFrom() later?
> Because it's not needed for now.
>
> I think that we should focus on refactoring instead of
> adding a new feature in this patch.
>
>
> [1]: 
> https://www.postgresql.org/message-id/20231204.153548.2126325458835528809.kou%40clear-code.com
> [2]: 
> https://www.postgresql.org/message-id/flat/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com
>
>
> Thanks,
> --
> kou



-- 
Regards
Junwang Zhao


Reply via email to