SPI_cursor_fetch Memory overrun

2018-08-22 Thread Wu Ivy
Hi all,

I’m creating a C extension to write data from the Postgres  to a text file . 
Since the table can be very large, I only want to process one chunk of data per 
time to keep the memory stable. To achieve this, I use SPI  
cursor(https://www.postgresql.org/docs/current/static/spi-spi-cursor-fetch.html 
) to 
fetch row chunks from the table. Following is my program:

char* command;
uint64 proc = 1;

// Prepare a statement without executing it yet
SPI_connect();
command = psprintf("SELECT * FROM %s", tableName);
SPIPlanPtr SPIplan = SPI_prepare_cursor(command, 0, NULL, 0);
pfree(command);

// Set up a cursor using a statement created with SPI_prepare_cursor
Portal cursor= SPI_cursor_open(NULL, SPIplan, NULL, NULL, true);
if (SPIplan == NULL){
elog(ERROR,"couldn't create cursor via SPI");
}

// Writing to local Desktop
clock_t begin = clock();
int count = 0;
FILE *fp = fopen("/home/ubuntu/test.txt", "w");
if(fp == NULL){
return psprintf("Error when open file");
}

while(proc > 0){
// Fetch 20 rows from a cursor
SPI_cursor_fetch(cursor, true, 20);
proc = SPI_processed; // number of row returned

if (proc != 0){
TupleDesc tupdesc = SPI_tuptable->tupdesc;
uint64 j;

for (j = 0; j < proc; j++){
HeapTuple tuple = tuptable->vals[j];
int i;
count++;
for (i = 1; i <= tupdesc->natts; i++){ //natts -> number of 
columns
 fprintf(fp, (i == tupdesc->natts) ? "%s\n" : "%s," , 
SPI_getvalue(tuple, tupdesc, i)); 
}
}
}else{
break;
}

SPI_freetuptable(SPI_tuptable);
elog(INFO, “freed tuptable”);
sleep(5);
}


From my understanding, cursor points at the entire result rows on heap. After 
fetching certain number of rows, SPI_tuptable points to the allocated row set. 
After finishing writing the row set to text file,  I freed the memory of it by 
calling SPI_freetuptable( ) before next fetch.  I expected the memory to stay 
around a constant value through out the program, However, the actual memory 
kept increasing when I run the program.  Also, I only observed memory drop 
after SPI_freetuptable() of first chunk. After that memory kept going up even  
SPI_freetuptable()  is executed.  Any clue of why is it happening?  

Thanks in advance!

Best,
Ivy

Re: Getting NOT NULL constraint from pg_attribute

2018-08-21 Thread Wu Ivy
Thanks for the response. Really appreciate it!

Regards,
Ivy

2018-08-20 10:40 GMT-07:00 David G. Johnston :

> On Monday, August 20, 2018, Wu Ivy  wrote:
>>
>> Thanks for the quick respond.
>> Why are SELECT query never marked nullable? For nullable columns, when I
>> call SPI_getvalue(), the result (in char*) is NULL. I don’t think I’m too
>> clear on the definition of *attnotnull*. Can you give me a example in
>> which the tupleTable is can be marked nullable?
>> Also, is there any other ways to get nullability of each column while
>> getting the data from SPI_cursor_fetch? The only way I can think is to call
>> another separate command to query the table schema, but it will be in a
>> separate transaction in that case.
>>
>
> Basically the nullability property is used by the planner for optimization
> during the joining of physical tables.  As soon as you try outputting
> columns the ability to enforce not null goes away because of, in
> particular, outer joins.  While some changes could maybe be made the
> cost-benefit to do so doesn't seem favorable.
>
> David J.
>
>


Re: Getting NOT NULL constraint from pg_attribute

2018-08-20 Thread Wu Ivy
Hi tom,

Thanks for the quick respond.
Why are SELECT query never marked nullable? For nullable columns, when I call 
SPI_getvalue(), the result (in char*) is NULL. I don’t think I’m too clear on 
the definition of attnotnull. Can you give me a example in which the tupleTable 
is can be marked nullable?
Also, is there any other ways to get nullability of each column while getting 
the data from SPI_cursor_fetch? The only way I can think is to call another 
separate command to query the table schema, but it will be in a separate 
transaction in that case.

Thank you again!
Best,
Ivy
> On Aug 17, 2018, at 6:02 PM, Tom Lane  wrote:
> 
> Wu Ivy  writes:
>> I’m currently building a Postgres C extension that fetch data from a 
>> Postgres table.
>> Since the table can be large, in order to prevent memory overrun, I use 
>> SPI_cursor_fetch to fetch chunks of data. The result rows are saved in 
>> SPITupleTable* SPI_tuptable and attributes are saved in 
>> SPI_tuptable->tupdesc. 
>> In order to process my data, I need to get information of column nullability 
>> (whether column has NOT NULL constrain). I can get this information by 
>> calling:
> 
>>  TupleDesc tupdesc = SPI_tuptable->tupdesc;
>>  bool is_nullable = TupleDescAttr(tupdesc, column_num - 1) -> attnotnull;
>> However, the result (is_nullable) is always 0, meaning the column does not 
>> have NOT NULLl constraint, even for columns that do have the NOT NULL 
>> constraint.
> 
> The output columns of a SELECT query are never marked nullable, regardless
> of what the source data was.
> 
>   regards, tom lane



Getting NOT NULL constraint from pg_attribute

2018-08-17 Thread Wu Ivy
Hi developers,

I’m currently building a Postgres C extension that fetch data from a Postgres 
table.
Since the table can be large, in order to prevent memory overrun, I use 
SPI_cursor_fetch to fetch chunks of data. The result rows are saved in 
SPITupleTable* SPI_tuptable and attributes are saved in SPI_tuptable->tupdesc. 
In order to process my data, I need to get information of column nullability 
(whether column has NOT NULL constrain). I can get this information by calling:

TupleDesc tupdesc = SPI_tuptable->tupdesc;
bool is_nullable = TupleDescAttr(tupdesc, column_num - 1) -> attnotnull;
However, the result (is_nullable) is always 0, meaning the column does not have 
NOT NULLl constraint, even for columns that do have the NOT NULL constraint.

Any idea of why is it happening? 
Thanks in advance!

Best,
Ivy