Re: [GENERAL] When inserting from a SELECT with an ORDER BY, are the inserts (and associated triggers) applied in order?

2017-06-26 Thread Jim Fulton
On Mon, Jun 26, 2017 at 5:43 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Jim Fulton <j...@jimfulton.info> writes:
> > When inserting from a SELECT with an ORDER BY, are the inserts (and
> > associated triggers) applied in order?
>
> Yeah, I'd expect so.  I'm not sure we'd promise that that will always
> remain true, but I can't think why it would be violated at the moment.
>
> > It looks like inserts aren't applied in order, and I'm wondering if this
> is
> > something I should expect.
>
> Hard to comment on that without seeing your test case.
>

Yup.  This works as I'd expect in my test case.  It's in the wild that I'm
having trouble. :(

At the risk of TMI (don't feel obliged to follow), I have a database with a
JSONB column that represents object data (http://newtdb.org). I have an
application in which the data are hierarchically organized.  At the
(almost) top level are "communities". I want to be able to search by
community and I want the search to be indexed on community id.  In this
application, ids for ancestor objects are always lower than ids of
descendents.  While objects may rarely move around in the hierarchy, their
communities never change.  Objects are sometimes created in the same
transaction as their parents. I use a trigger to find and copy community
ids into the JSONB data records and then index the JSONB properties.

A test Python script that simulates this:
https://gist.github.com/jimfulton/317e36e6f74c309ee9198f453c41ab59. Note
that objects are initially copied to a staging table and then copied in
mass to the data table. If the test script is run, all of the records have
"cid" properties set properly. If I remove the "order by" on line 80, then
some record end up without "cid" properties.

The non-test case is a bit more complicated.  If you're curious:

   - the insert logic:
   https://github.com/newtdb/db/blob/master/src/newt/db/_adapter.py#L67
   It uses upsert rather than deletion+insert to do updates.
   - The trigger and function for finding community ids:
   https://github.com/karlproject/karl/blob/master/karl/scripts/pgevolve.py#L327

Jim

-- 
Jim Fulton
http://jimfulton.info


[GENERAL] When inserting from a SELECT with an ORDER BY, are the inserts (and associated triggers) applied in order?

2017-06-26 Thread Jim Fulton
When inserting from a SELECT with an ORDER BY, are the inserts (and
associated triggers) applied in order?

It looks like inserts aren't applied in order, and I'm wondering if this is
something I should expect.

Jim

-- 
Jim Fulton
http://jimfulton.info


Re: [GENERAL] Trigger problems/questions

2017-06-15 Thread Jim Fulton
On Thu, Jun 15, 2017 at 1:56 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Jim Fulton <j...@jimfulton.info> writes:
> > I have an object database that's mirrored to a table with data in a JSONB
> > column.  Data are organized into "communities".  Community ids aren't
> > stored directly in content but can be found by recursively following
> > __parent__ properties. I want to be able to index content records on
> their
> > community ids.
>
> > (I originally tried to index functions that got ids, but apparently lying
> > about immutability is a bad idea and I suffered the consequences. :-])
>
> > I tried creating a trigger to populate a community_zoid property with a
> > community id when a record is inserted or updated.  The trigger calls a
> > recursive functions to get the community id.
> > ...
> > This scheme succeeds most of the time, but occasionally, it fails.
>
> Since your original idea failed, I suppose that the parent relationships
> are changeable?


Good question.

A few kinds of objects can, rarely, move in the hierarchy, and, they never
move between communities, so their community id never changes.

IDK WTF my indexing attempt.  I could build the index, then add an object
to the tree and it wouldn't be indexed.  This was in a staging database
where there were no other changes.


> What mechanism have you got in place to propagate a
> relationship change back down to the child records?
>

This is a non-issue, at least WRT community ids.  If I were, for example,
to index paths, it would be an issue for some objects, but I'm not at that
point yet.


> Also, this looks to have a race condition: if you search for a record's
> community id at about the same time that someone else is changing the
> parent linkage, you may get the old answer, but by the time you commit the
> record update that answer may be obsolete.  This is a problem because even
> if you had another trigger that was trying (in the someone else's session)
> to propagate new community ids back to affected records, it wouldn't think
> that the record you're working on needs a change, because it would also
> see the old version of that record.
>
> Solutions to the race problem usually involve either SELECT FOR UPDATE
> to lock rows involved in identifying the target record's community ID,
> or use of SERIALIZABLE to cause the whole transaction to fail if its
> results might be inconsistent.  Either one will add some complexity
> to your application code.
>

There's a global lock around all of the updates to the table.  (This isn't
as unreasonable as it sounds :), but anyway, that's outside the scope of
this discussion.)

Even if there was some kind of race, I'd still get a community id set, it
might be wrong, but it would be set.

regards, tom lane
>

Thanks.

Jim

-- 
Jim Fulton
http://jimfulton.info


Re: [GENERAL] Trigger problems/questions

2017-06-15 Thread Jim Fulton
On Thu, Jun 15, 2017 at 1:44 PM, Adrian Klaver <adrian.kla...@aklaver.com>
wrote:

> On 06/15/2017 10:23 AM, Jim Fulton wrote:
>

...


> I do not pretend to fully understand what the the triggers/functions are
> really doing,

but I did notice this:
>
> create or replace function populate_community_zoid_triggerf()
>
> ...
>
> new_zoid := NEW.state ->> 'community_zoid';
> zoid := find_community_zoid(
> NEW.zoid, NEW.class_name, NEW.state)::text;
> if zoid is null then
> if new_zoid is not null then
> NEW.state := NEW.state - 'community_zoid';
> end if;
> else
> if new_zoid is null or zoid != new_zoid then
> NEW.state :=
> NEW.state || ('{"community_zoid": ' || zoid || '}')::jsonb;
> end if;
> end if;
>
> ...
>
>
> Not sure what happens if zoid is null and new_zoid is null also?
>

Then then no change is needed and nothing happens, which is correct.


> Or if it matters?


Nope.

Thanks for looking.

Jim

-- 
Jim Fulton
http://jimfulton.info


[GENERAL] Trigger problems/questions

2017-06-15 Thread Jim Fulton
I have an object database that's mirrored to a table with data in a JSONB
column.  Data are organized into "communities".  Community ids aren't
stored directly in content but can be found by recursively following
__parent__ properties. I want to be able to index content records on their
community ids.

(I originally tried to index functions that got ids, but apparently lying
about immutability is a bad idea and I suffered the consequences. :-])

I tried creating a trigger to populate a community_zoid property with a
community id when a record is inserted or updated.  The trigger calls a
recursive functions to get the community id.

Trigger: https://github.com/karlproject/karl/blob/master/
karl/scripts/pgevolve.py#L261

Trigger procedure: https://github.com/karlproject/karl/blob/master/
karl/scripts/pgevolve.py#L236

Function to find a community id: https://github.com/
karlproject/karl/blob/master/karl/scripts/pgevolve.py#L209

This scheme succeeds most of the time, but occasionally, it fails.

I can find records where it has failed with a query like:

  select zoid
  from newt
  where find_community_zoid(zoid, class_name, state) is not null
and not state ? 'community_zoid';

If I update the records where it has failed:

  update newt set class_name=class_name
  where find_community_zoid(zoid, class_name, state) is not null
 and not state ? 'community_zoid';

Then retry the query above, I get 0 rows back. This would seem to indicate
that the trigger is logically correct.

Questions:

   - Am I doing it wrong?  For example, is there some limitation on trigger
   procedures that I'm violating?
   - If a trigger procedure errors, is the transaction aborted? (I would
   hope so.)
   - Should I expect triggers to be reliable and rely on them for database
   integrity?
   - Any suggestions on how to debug this?

Jim

-- 
Jim Fulton
http://jimfulton.info