Re: [HACKERS] How to get the 'ctid' from a record type?

2017-03-11 Thread Eric Ridge
On Sat, Mar 11, 2017 at 2:14 PM Andres Freund  wrote:

> On 2017-03-11 04:31:16 +, Eric Ridge wrote:
> > Well shoot.  That kinda spoils my plans.
>
> I think you should elaborate on what you're trying to achieve -
> otherwise our advice will be affected by the recent, widely reported,
> crystal ball scarcity.
>

What I'm trying to do is port https://github.com/zombodb/zombodb to
Postgres 9.6+.  It's an Access Method that stores full rows, encoded as
JSON, in Elasticsearch instead of in local storage.  It was fairly
straightforward to do the mechanical work to convert it to use 9.6's new AM
API (which is very nice, btw!), but the fact that 9.6 also disallows
including system columns (specifically ctid) has me turned upside down.

With <9.6, I was able to cook-up a scheme where it was able to answer
queries from the remote Elasticsearch index even when Postgres decided to
plan a sequential scan.  That hinged, mostly, on being able to create a
multi-column index where the first column was a function call that included
as an argument (among other things) the ctid system column.

The ability to answer sequential scans (and filters) using the remote ES
index is pretty important as the knowledge of how to do that exists in
Elasticsearch, not my custom operator function in Postgres.

Anyways, I've been trying to find a way to intuit the ctid system column
value with 9.6 and it's clear now that that just isn't possible.  The
closest I got was digging through
ActivePortal->queryDesc->estate->es_tuple, but that only works when it's a
real tuple, not one that's virtual or minimal.

I'm pretty sure that I need to be implementing a Custom Scan Provider
instead, and I've been spending time with that API too.  There's a pretty
steep learning curve for me, but I'll eventually get over that hump.

I could probably bore you with greater detail but basically, I want to take:
   CREATE INDEX idxfoo ON table USING zombodb (zdb(table),
zdb_to_json(table)) WITH (url='http://remote.ip.addr:9200/');
   SELECT * FROM table WHERE zdb(table) ==> 'some full text query' OR id =
42;

And have the "zdb(table) ==> 'some full text query'" bit be answered by my
extension, regardless of how PG wants to plan the query.  While I was able
to hack something together for <9.6, I think that means a Custom Scan
Provider now?

eric


Re: [HACKERS] How to get the 'ctid' from a record type?

2017-03-11 Thread Andres Freund
On 2017-03-11 04:31:16 +, Eric Ridge wrote:
> Well shoot.  That kinda spoils my plans.

I think you should elaborate on what you're trying to achieve -
otherwise our advice will be affected by the recent, widely reported,
crystal ball scarcity.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to get the 'ctid' from a record type?

2017-03-11 Thread Jim Nasby

On 3/10/17 10:31 PM, Eric Ridge wrote:

What about this?  Is the tuple currently being evaluated (I suppose in
the case of a sequential scan) available in the context of a function call?


AFAIK that *very* specific case would work, because the executor would 
be handing you the raw tuple. Not a great bet to make though. Also, 
there should be a macro somewhere that will tell you whether you have a 
full tuple or not. You'd want to make sure to check that an throw an 
error if you weren't handed a full tuple.

--
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to get the 'ctid' from a record type?

2017-03-10 Thread Eric Ridge
>
> I suspect the tuple at (0,1) has been the subject of a failed update.
>

Yep.


> Your problem here is that you're mistaking the t_ctid field of a tuple
> header for the tuple's address.  It is not that; it's really just garbage
> normally, and is only useful to link forward to the next version of the
> row from an outdated tuple.  I think we do initialize it to the tuple's
> own address during an INSERT, but either a completed or failed UPDATE
> would change it.
>

Thanks.  That helps clarify the comments in htup_details.h, actually.


> I do not think there is any way to get the true address of a heap tuple
> out of a composite Datum manufactured from the tuple.  Most of the other
> system columns can't be gotten from a composite Datum either, because of
> the field overlay in HeapTupleHeaderData's union t_choice.


Well shoot.  That kinda spoils my plans.

What about this?  Is the tuple currently being evaluated (I suppose in the
case of a sequential scan) available in the context of a function call?

Thanks again for your time!  It's much appreciated.

eric


Re: [HACKERS] How to get the 'ctid' from a record type?

2017-03-10 Thread Tom Lane
Eric Ridge  writes:
> What I'm seeing is that the ctid returned from this function isn't always
> correct:

> # select ctid, foo(table) from table limit 10;
>  ctid  |foo
> ---+---
>  (0,1) | (19195,1)-- not correct!
>  (0,2) | (0,2)
>  (0,3) | (0,3)

I suspect the tuple at (0,1) has been the subject of a failed update.

Your problem here is that you're mistaking the t_ctid field of a tuple
header for the tuple's address.  It is not that; it's really just garbage
normally, and is only useful to link forward to the next version of the
row from an outdated tuple.  I think we do initialize it to the tuple's
own address during an INSERT, but either a completed or failed UPDATE
would change it.

I do not think there is any way to get the true address of a heap tuple
out of a composite Datum manufactured from the tuple.  Most of the other
system columns can't be gotten from a composite Datum either, because of
the field overlay in HeapTupleHeaderData's union t_choice.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] How to get the 'ctid' from a record type?

2017-03-10 Thread Eric Ridge
This is about Postgres 9.6...

I have a very simple 1-arg function, in C, that I want to return the ctid
of the record passed in.  Example:

CREATE OR REPLACE FUNCTION foo(record) RETURNS tid LANGUAGE c IMMUTABLE
STRICT AS 'my_extension';

Its implementation is simply:

Datum foo(PG_FUNCTION_ARGS) {
HeapTupleHeader td = PG_GETARG_HEAPTUPLEHEADER(0);

PG_RETURN_POINTER(>t_ctid);
}

What I'm seeing is that the ctid returned from this function isn't always
correct:

# select ctid, foo(table) from table limit 10;
 ctid  |foo
---+---
 (0,1) | (19195,1)-- not correct!
 (0,2) | (0,2)
 (0,3) | (0,3)
 (0,4) | (0,4)
 (0,5) | (0,5)
 (0,6) | (0,6)
 (0,7) | (0,7)
 (1,1) | (1,1)
 (1,2) | (1,2)
 (1,3) | (1,3)
(10 rows)

I've spent hours tracing through the PG sources trying to figure out why
and/or find a different way to do this, but I've got nothing.

Of the various examples in the PG sources that use
PG_GETARG_HEAPTUPLEHEADER, or otherwise deal with a HeapTupleHeader, I
can't find any that want to access a system column, so I'm starting to
think I'm just doing it wrong entirely.

Can anyone point me in the right direction?

Thanks for your time!

eric