Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-23 Thread Peter Tanski
])) {
      leaf_split = false;
    }
    if (v == NULL) {
      elog(ERROR, "entry %d is invalid", i);
    }
    raw_vec[j] = v;
    vec_ixs[j++] = i;
  }
  if (n_entries > j) {
    elog(WARNING, "[%s:%s:%d]: " SIZE_T_FMT " bad entries",
         __FILE__, __func__, __LINE__, n_entries - j);
    n_entries = j;
  } else if (n_entries < j) {
    elog(ERROR, "skipping %d entries", j - n_entries);
  }

So I know the number of entries sent to Picksplit() is 4, for 6 calls to 
decompress.

Note that Decompress() returns the input unchanged and entries are untoasted in 
the deserialize_fprint() function, which malloc's each value:

Datum fprint_decompress(PG_FUNCTION_ARGS) {
  GISTENTRY* entry = (GISTENTRY*)PG_GETARG_POINTER(0);

  FPDEBUG("entered decompress");

  if (!entry) {
    elog(ERROR, "fprint_decompress: entry is NULL");
  }

  // cut out here -- we handle the memory
  PG_RETURN_POINTER(entry);
}
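
(deserialize_fprint() itself is not shown here. As a point of reference only, a
minimal sketch of the detoast-then-copy pattern described above -- with the
FPrint struct layout left hypothetical -- could look like this:)

/* Sketch: untoast the stored key, then copy it into malloc'd memory so
 * the caller owns the value independently of any palloc memory context. */
static FPrint *
deserialize_fprint(Datum key)
{
  bytea  *raw = DatumGetByteaP(key);      /* detoasts if necessary */
  Size    len = VARSIZE(raw) - VARHDRSZ;
  FPrint *fp  = (FPrint *) malloc(len);

  if (fp == NULL)
    elog(ERROR, "deserialize_fprint: out of memory");
  memcpy(fp, VARDATA(raw), len);

  /* free the detoasted copy, if one was made */
  if ((Pointer) raw != DatumGetPointer(key))
    pfree(raw);
  return fp;
}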

I'll put together a test case and send that on.

On Nov 23, 2010, at 2:29 AM, Heikki Linnakangas wrote:

 On 22.11.2010 23:18, Peter Tanski wrote:
 Whatever test I use for Same(), Penalty() and Consistent() does not seem
 to affect the problem significantly. For now I am only using
 Consistent() as a check for retrieval.
 
 I believe it's not possible to lose leaf tuples with incorrectly defined gist 
 support functions. You might get completely bogus results, but the tuples 
 should be there when you look at gist_tree() output. So this sounds like a 
 gist bug to me.
 
 Note that there are only 133 leaf tuples -- for 500 rows. If the index
 process were operating correctly, there should have been 500 leaf tuples
 there. If I REINDEX the table the number of leaf tuples may change
 slightly but not by much.
 
 One idea for debugging is to insert the rows to the table one by one, and run 
 the query after each insertion. When do the leaf tuples disappear?
 
 If you can put together a small self-contained test case and post it to the 
 list, I can take a look.
 
 -- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com




Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-23 Thread Peter Tanski
I should correct what I just wrote: the first and last entries in  
entryvec->vector are invalid.


On Nov 23, 2010, at 11:39 AM, Peter Tanski wrote:

Picksplit() seems to be an exceptional case here: the first and last  
numbers in entryvec are invalid so


entryvec->vector[entryvec->n - 1]

is invalid.  All the other GiST code Picksplit() functions use the  
same convention.  For example, see the btree_gist picksplit  
function, at

http://doxygen.postgresql.org/btree__utils__num_8c-source.html#l00241

OffsetNumber i,
 maxoff = entryvec->n - 1;


On Nov 23, 2010, at 10:22 AM, Alvaro Herrera wrote:

Excerpts from Peter Tanski's message of mar nov 23 12:00:52 -0300  
2010:


There are checks inside the Picksplit() function for the number of  
entries:


OffsetNumber maxoff = entryvec->n - 1;
int n_entries, j;
n_entries = Max(maxoff, 1) - 1;

j = 0;
for (i = FirstOffsetNumber; i < maxoff; i = OffsetNumberNext(i)) {
  FPrint* v = deserialize_fprint(entv[i].key);


Isn't this off by one?  Offset numbers are 1-based, so the maxoff
computation is wrong.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support







Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-23 Thread Peter Tanski
Picksplit() seems to be an exceptional case here: the first and last  
numbers in entryvec are invalid so


entryvec->vector[entryvec->n - 1]

is invalid.  All the other GiST code Picksplit() functions use the  
same convention.  For example, see the btree_gist picksplit function, at

http://doxygen.postgresql.org/btree__utils__num_8c-source.html#l00241

OffsetNumber i,
  maxoff = entryvec->n - 1;


On Nov 23, 2010, at 10:22 AM, Alvaro Herrera wrote:

Excerpts from Peter Tanski's message of mar nov 23 12:00:52 -0300  
2010:


There are checks inside the Picksplit() function for the number of  
entries:


 OffsetNumber maxoff = entryvec->n - 1;
 int n_entries, j;
 n_entries = Max(maxoff, 1) - 1;

 j = 0;
 for (i = FirstOffsetNumber; i < maxoff; i = OffsetNumberNext(i)) {
   FPrint* v = deserialize_fprint(entv[i].key);


Isn't this off by one?  Offset numbers are 1-based, so the maxoff
computation is wrong.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support





Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-23 Thread Peter Tanski
On Nov 23, 2010, at 1:37 PM, Yeb Havinga wrote:
 j = 0;
 for (i = FirstOffsetNumber; i < maxoff; i = OffsetNumberNext(i)) {
   FPrint* v = deserialize_fprint(entv[i].key);
 
 Isn't this off by one?  Offset numbers are 1-based, so the maxoff
 computation is wrong.
 The first for loop of all the others compares with i <= maxoff instead of i < 
 maxoff.

You are right: I am missing the last one there.  (During a memory-debugging 
phase entv[entryvec->n - 1] was always invalid, probably due to a memory-overwrite 
error, but I fixed that later and never changed it back.)
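
For concreteness, a sketch of the corrected iteration under that reading 
(1-based offsets, maxoff inclusive, so entryvec->n - 1 entries in total; 
raw_vec and vec_ixs are the working arrays from the earlier excerpt):

OffsetNumber i;
OffsetNumber maxoff    = entryvec->n - 1;  /* last valid offset */
int          n_entries = maxoff;           /* not Max(maxoff, 1) - 1 */
int          j = 0;

for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i)) {
  FPrint* v = deserialize_fprint(entv[i].key);
  raw_vec[j] = v;
  vec_ixs[j++] = i;
}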

On the other hand, there are two problems:

1. the maximum size on a GiST page is 4240 bytes, so I cannot add a full-size 
Datum using this kind of hash-key setup (the base Datum size is 4230 bytes on a 
64-bit machine).  The example test cases I used were smaller in order to get 
around that issue: they are 2326 bytes base size.  

2. Even after fixing the Picksplit() loop, the dropped-leaf problem still 
manifests itself:

postgres=# set enable_seqscan=false;
SET
postgres=# set enable_indexscan=true;
SET
postgres=# create table fps2 (id serial, soid character(24) not null, 
fingerprint fprint not null);
NOTICE:  CREATE TABLE will create implicit sequence "fps2_id_seq" for serial 
column "fps2.id"
CREATE TABLE
postgres=# create index fps2_fingerprint_ix on fps2 using gist (fingerprint 
fprint_gist_ops);
CREATE INDEX
postgres=# \i xaa
psql:xaa:1: NOTICE:  [pgfprint.c:fprint_compress:379] entered compress
INSERT 0 1
postgres=# \i xab
psql:xab:1: NOTICE:  [pgfprint.c:fprint_compress:379] entered compress
INSERT 0 1
postgres=# \i xac
psql:xac:1: NOTICE:  [pgfprint.c:fprint_compress:379] entered compress
INSERT 0 1
postgres=# \i xad
psql:xad:1: NOTICE:  [pgfprint.c:fprint_compress:379] entered compress
INSERT 0 1
postgres=# select gist_stat('fps2_fingerprint_ix');
   gist_stat   
---
 Number of levels:  1 +
 Number of pages:   1 +
 Number of leaf pages:  1 +
 Number of tuples:  4 +
 Number of invalid tuples:  0 +
 Number of leaf tuples: 4 +
 Total size of tuples:  5628 bytes+
 Total size of leaf tuples: 5628 bytes+
 Total size of index:   8192 bytes+

postgres=# \i xae
psql:xae:1: NOTICE:  [pgfprint.c:fprint_compress:379] entered compress
INSERT 0 1
postgres=# select gist_stat('fps2_fingerprint_ix');
   gist_stat   
---
 Number of levels:  1 +
 Number of pages:   1 +
 Number of leaf pages:  1 +
 Number of tuples:  5 +
 Number of invalid tuples:  0 +
 Number of leaf tuples: 5 +
 Total size of tuples:  7032 bytes+
 Total size of leaf tuples: 7032 bytes+
 Total size of index:   8192 bytes+

postgres=# \i xaf
psql:xaf:1: NOTICE:  [pgfprint.c:fprint_compress:379] entered compress
psql:xaf:1: NOTICE:  [pgfprint.c:fprint_decompress:419] entered decompress
psql:xaf:1: NOTICE:  [pgfprint.c:fprint_decompress:419] entered decompress
psql:xaf:1: NOTICE:  [pgfprint.c:fprint_decompress:419] entered decompress
psql:xaf:1: NOTICE:  [pgfprint.c:fprint_decompress:419] entered decompress
psql:xaf:1: NOTICE:  [pgfprint.c:fprint_decompress:419] entered decompress
psql:xaf:1: NOTICE:  [pgfprint.c:fprint_decompress:419] entered decompress
psql:xaf:1: NOTICE:  [pgfprint.c:fprint_picksplit:659] entered picksplit
psql:xaf:1: NOTICE:  [pgfprint.c:fprint_picksplit:838] split: 3 left, 2 right
psql:xaf:1: NOTICE:  [pgfprint.c:fprint_compress:379] entered compress
psql:xaf:1: NOTICE:  [pgfprint.c:fprint_compress:379] entered compress
INSERT 0 1
postgres=# select gist_stat('fps2_fingerprint_ix');
   gist_stat

 Number of levels:  2  +
 Number of pages:   3  +
 Number of leaf pages:  2  +
 Number of tuples:  7  +
 Number of invalid tuples:  0  +
 Number of leaf tuples: 5  +
 Total size of tuples:  9864 bytes +
 Total size of leaf tuples: 7044 bytes +
 Total size of index:   24576 bytes+

postgres=# select id, soid from fps2;
 id |   soid   
+--
  1 | 4c65a39d4d9bca2c3382
  2 | 4c65a39d4d9bca2c338a
  3 | 4c65a39d4d9bca2c3390
  4 | 4c65a39d4d9bca2c3399
  5 | 4c65a39d4d9bca2c33a5
  6 | 4c65a39d4d9bca2c33a8

postgres=# select f1.id, f2.id, fprint_cmp(f1.fingerprint, f2.fingerprint) from 
fps2 f1 join fps2 f2 on f1.fingerprint=f2.fingerprint;
 id | id |fprint_cmp
++--
  1 |  1 | 1.00031467691569
  2 |  2 | 1.00031467691569
  4 |  4 | 1.00031467691569
  5 |  5 | 1.00031467691569
  6 |  6 | 1.00031467691569
 
So GiST does not include a tuple for row 3, one of the old (pre-split) tuples.

After inserting a few more rows to trigger another Picksplit():

postgres=# \i xag

Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-23 Thread Peter Tanski
I found another off-by-one error in my Picksplit() algorithm and the GiST index 
contains one leaf tuple for each row in the table now.  The error was to start 
from 1 instead of 0 when assigning the entries.  Thanks to everyone for your 
help.
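
For anyone hitting the same thing: the fix amounts to keeping the 0-based 
working arrays and the 1-based OffsetNumbers strictly separate when filling 
the GIST_SPLITVEC. A minimal sketch, where goes_left() stands in for the 
actual placement test and vec_ixs[] records the original offset of each 
deserialized entry:

v->spl_left  = (OffsetNumber *) palloc(n_entries * sizeof(OffsetNumber));
v->spl_right = (OffsetNumber *) palloc(n_entries * sizeof(OffsetNumber));
v->spl_nleft = v->spl_nright = 0;

for (j = 0; j < n_entries; j++) {          /* working index starts at 0 */
  if (goes_left(raw_vec[j]))               /* hypothetical placement test */
    v->spl_left[v->spl_nleft++] = vec_ixs[j];
  else
    v->spl_right[v->spl_nright++] = vec_ixs[j];
}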

For the record, this is the only GiST index I know of where the keys are over 
2000 bytes in size.  So GiST definitely handles large keys.  Perhaps the 
maximum size for intarray could be increased.

On Nov 23, 2010, at 4:01 PM, Yeb Havinga wrote:

 On 2010-11-23 20:54, Peter Tanski wrote:
 On Nov 23, 2010, at 1:37 PM, Yeb Havinga wrote:
 j = 0;
 for (i = FirstOffsetNumber; i < maxoff; i = OffsetNumberNext(i)) {
   FPrint* v = deserialize_fprint(entv[i].key);
 Isn't this off by one?  Offset numbers are 1-based, so the maxoff
 computation is wrong.
 The first for loop of all the others compares with i <= maxoff instead of i < 
 maxoff.
 You are right: I am missing the last one there.  (During a memory-debugging 
 phase entv[entryvec->n - 1] was always invalid, probably due to a memory-overwrite 
 error, but I fixed that later and never changed it back.)
 
 On the other hand, there are two problems:
 
 1. the maximum size on a GiST page is 4240 bytes, so I cannot add a 
 full-size Datum using this kind of hash-key setup (the base Datum size is 
 4230 bytes on a 64-bit machine).  The example test cases I used were smaller 
 in order to get around that issue: they are 2326 bytes base size.
 
 2. Even after fixing the Picksplit() loop, the dropped-leaf problem still 
 manifests itself:
 I noticed an n_entries initialization in one of your earlier mails that might 
 also be a source of trouble. I was under the impression that GistEntryVectors 
 have n-1 entries (not n-2 as you say), because the first element (0 / 
 InvalidOffsetNumber) must be skipped. E.g. entryvec->n = 5. This means that 
 there are 4 entries, which are in array positions 1,2,3,4.
 
 btw: interesting topic, audio fingerprinting!
 
 regards,
 Yeb Havinga
 




[HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-22 Thread Peter Tanski

I have been working on a plugin for GiST that has some unusual features:

* The data type for both Node and Leaf keys is large (typically 4222  
bytes on 32-bit; 4230 bytes on 64-bit).


* Due to the large size the storage class is EXTENDED (MAIN would only  
degrade to EXTENDED in any case).  Tests using EXTERNAL show the same  
results, so I do not believe compression is an issue.


* This is a hash-type index: the values are large hashes for an audio  
fingerprinting application and the sole relationship between keys is a  
float match probability between two values.  For example, if A matches  
B with a score of 0.18 but A matches C with a score of 0.61, A is  
closer to C than to B.  The distance metric is calculated in reverse  
(1.0 - match_score) so the penalty for inserting A under the same Node  
as B would be 0.82 (scaled by 1000 to 820).


* The Leaf Keys contain the same data as the rows themselves.

* The Node (union) Keys are:

  - a bitwise OR of the lower-level Leaf Keys

  - a penalty or consistency comparison of a Leaf or Query (L) value  
against a Node union-key (N) is effectively a scaled hamming distance  
of:

  L ^ (L & N)
so if L is under N the check will always return 1.0 (true for  
consistency; 0.0 for penalty);
by the same operation, newly inserted values compare closer to  
Node (union) keys where the corresponding Leaf keys are closer  
(a sketch follows this list)


  - Comparisons between two Nodes, N1 and N2 (used in Same(), for  
example)  have used a:

-- Tanimoto bit distance popcount(N1 & N2) / popcount(N1 | N2);
-- the same scaled-hamming distance check used for a Leaf against  
another Leaf;

-- simple memcmp for identity.
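
A minimal sketch of the containment test described above, assuming the keys 
are stored as arrays of uint64 words (NWORDS and the word layout are 
hypothetical; the GCC/Clang popcount builtin is used for brevity):

/* Bits of L not covered by N are L ^ (L & N), i.e. L & ~N.  Returns 1.0
 * when L is fully contained in the union key N; Penalty() would then
 * report (1.0 - score), scaled by 1000. */
static double
containment_score(const uint64 *L, const uint64 *N)
{
  int w, uncovered = 0, total = 0;

  for (w = 0; w < NWORDS; w++) {
    uncovered += __builtin_popcountll(L[w] & ~N[w]);
    total     += __builtin_popcountll(L[w]);
  }
  return (total > 0) ? 1.0 - ((double) uncovered / total) : 1.0;
}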

Whatever test I use for Same(),  Penalty() and Consistent() does not  
seem to affect the problem significantly.  For now I am only using  
Consistent() as a check for retrieval.
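
(A sketch of how Consistent() might wrap that score, under the same 
assumptions as the containment_score() sketch above; FPRINT_MATCH_THRESHOLD 
and the bits accessor are hypothetical, and the 8.4/9.0 five-argument 
consistent signature is assumed:)

Datum
fprint_consistent(PG_FUNCTION_ARGS)
{
  GISTENTRY *entry   = (GISTENTRY *) PG_GETARG_POINTER(0);
  bool      *recheck = (bool *) PG_GETARG_POINTER(4);
  FPrint    *query   = deserialize_fprint(PG_GETARG_DATUM(1));
  FPrint    *key     = deserialize_fprint(entry->key);
  bool       match;

  /* On an internal page the union key over-approximates its subtree, so
   * containment of the query means the subtree may hold matches. */
  match = containment_score(query->bits, key->bits) > FPRINT_MATCH_THRESHOLD;

  free(key);
  free(query);
  *recheck = false;    /* the score test is treated as exact here */
  PG_RETURN_BOOL(match);
}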


I have developed this under debug source-builds of postgresql 8.4.5  
and 9.0.1 on Mac OS X 10.6 (Snow Leopard, x86 64-bit) and Ubuntu Linux  
10.04 (Lucid, x86 64-bit and 32-bit).  There is no difference between  
the platforms or architectures.  I am using the standard PostgreSQL  
Makefile build setup so compiler flags are the same as used by  
PostgreSQL.


The problem may be my own understanding of Picksplit(), Penalty() and  
Same().  Using gevel, on a table with 500 newly-inserted rows:


postgres=# \d fps2
             Table "public.fps2"
   Column    |     Type      | Modifiers
-------------+---------------+-----------
 soid        | character(24) | not null
 fingerprint | fprint        | not null
Indexes:
    "fps2_fingerprint_ix" gist (fingerprint)

postgres=# select gist_stat('fps2_fingerprint_ix');
gist_stat
-
 Number of levels:  4   +
 Number of pages:   61  +
 Number of leaf pages:  42  +
 Number of tuples:  193 +
 Number of invalid tuples:  0   +
 Number of leaf tuples: 133 +
 Total size of tuples:  271704 bytes+
 Total size of leaf tuples: 187236 bytes+
 Total size of index:   499712 bytes+


Note that there are only 133 leaf tuples -- for 500 rows.  If the  
index process were operating correctly, there should have been 500  
leaf tuples there.  If I REINDEX the table the number of leaf tuples  
may change slightly but not by much.  This closely corresponds to a  
query:


postgres=# select count(*) from fps2 f1 join fps2 f2 on f1.fingerprint  
= f2.fingerprint;

 count
---
   133

where = is a match operator that returns rows where the Leaf key  
comparison is > .98 (on the scaled probability score pretty much  
exactly equal).  The above query is using the index:


postgres=# explain select count(*) from fps2 f1 join fps2 f2 on  
f1.fingerprint ~= f2.fingerprint;

   QUERY PLAN

 Aggregate  (cost=1173.67..1173.68 rows=1 width=0)
   ->  Nested Loop  (cost=0.00..1170.54 rows=1250 width=0)
         ->  Seq Scan on fps2 f1  (cost=0.00..105.00 rows=500 width=32)
         ->  Index Scan using fps2_fingerprint_ix on fps2 f2  (cost=0.00..2.11 rows=2 width=32)
               Index Cond: (f1.fingerprint ~= f2.fingerprint)


If I use a table scan instead of the index, the query would return:

postgres=# select count(*) from fps2 f1 join fps2 f2 on  
fprint_cmp(f1.fingerprint, f2.fingerprint) > .98;

 count
---
   500


The GiST tree looks like:

postgres=# select gist_tree('fps2_fingerprint_ix');
  gist_tree
--
 0(l:0) blk: 0 numTuple: 4 free: 2532b(68.97%) rightlink:4294967295  
(InvalidBlockNumber) +
 1(l:1) blk: 37 numTuple: 2 free: 5340b(34.56%) rightlink:105  
(OK)   +
 1(l:2) blk: 90 numTuple: 3 free: 3936b(51.76%) 

Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-22 Thread Peter Tanski

One minor correction:

postgres=# explain select count(*) from fps2 f1 join fps2 f2 on  
f1.fingerprint = f2.fingerprint;

   QUERY PLAN

 Aggregate  (cost=1173.67..1173.68 rows=1 width=0)
   ->  Nested Loop  (cost=0.00..1170.54 rows=1250 width=0)
         ->  Seq Scan on fps2 f1  (cost=0.00..105.00 rows=500 width=32)
         ->  Index Scan using fps2_fingerprint_ix on fps2 f2  (cost=0.00..2.11 rows=2 width=32)
               Index Cond: (f1.fingerprint = f2.fingerprint)

(The previous query example used the ~= operator, which was defined to  
match at > .5, but in this case there are no matches in the table so ~= is  
the same as =.)


On Nov 22, 2010, at 4:18 PM, Peter Tanski wrote:

I have been working on a plugin for GiST that has some unusual  
features:


* The data type for both Node and Leaf keys is large (typically 4222  
bytes on 32-bit; 4230 bytes on 64-bit).


* Due to the large size the storage class is EXTENDED (MAIN would  
only degrade to EXTENDED in any case).  Tests using EXTERNAL show  
the same results, so I do not believe compression is an issue.


* This is a hash-type index: the values are large hashes for an  
audio fingerprinting application and the sole relationship between  
keys is a float match probability between two values.  For example,  
if A matches B with a score of 0.18 but A matches C with a score of  
0.61, A is closer to C than to B.  The distance metric is calculated  
in reverse (1.0 - match_score) so the penalty for inserting A under  
the same Node as B would be 0.82 (scaled by 1000 to 820).


* The Leaf Keys contain the same data as the rows themselves.

* The Node (union) Keys are:

 - a bitwise OR of the lower-level Leaf Keys

 - a penalty or consistency comparison of Leaf or Query (L) value  
against a Node union-key (N) is effectively a scaled hamming  
distance of:

 L ^ (L & N)
   so if L is under N the check will always return 1.0 (true for  
consistency; 0.0 for penalty);
   by the same operation, newly inserted values compare closer to  
Node (union) keys where the corresponding Leaf keys are closer


 - Comparisons between two Nodes, N1 and N2 (used in Same(), for  
example)  have used a:

   -- Tanimoto bit distance popcount(N1 & N2) / popcount(N1 | N2);
   -- the same scaled-hamming distance check used for a Leaf against  
another Leaf;

   -- simple memcmp for identity.

Whatever test I use for Same(),  Penalty() and Consistent() does not  
seem to affect the problem significantly.  For now I am only using  
Consistent() as a check for retrieval.


I have developed this under debug source-builds of postgresql 8.4.5  
and 9.0.1 on Mac OS X 10.6 (Snow Leopard, x86 64-bit) and Ubuntu  
Linux 10.04 (Lucid, x86 64-bit and 32-bit).  There is no difference  
between the platforms or architectures.  I am using the standard  
PostgreSQL Makefile build setup so compiler flags are the same as  
used by PostgreSQL.


The problem may be my own understanding of Picksplit(), Penalty()  
and Same().  Using gevel, on a table with 500 newly-inserted rows:


postgres=# \d fps2
             Table "public.fps2"
   Column    |     Type      | Modifiers
-------------+---------------+-----------
 soid        | character(24) | not null
 fingerprint | fprint        | not null
Indexes:
    "fps2_fingerprint_ix" gist (fingerprint)

postgres=# select gist_stat('fps2_fingerprint_ix');
   gist_stat
-
Number of levels:  4   +
Number of pages:   61  +
Number of leaf pages:  42  +
Number of tuples:  193 +
Number of invalid tuples:  0   +
Number of leaf tuples: 133 +
Total size of tuples:  271704 bytes+
Total size of leaf tuples: 187236 bytes+
Total size of index:   499712 bytes+


Note that there are only 133 leaf tuples -- for 500 rows.  If the  
index process were operating correctly, there should have been 500  
leaf tuples there.  If I REINDEX the table the number of leaf tuples  
may change slightly but not by much.  This closely corresponds to a  
query:


postgres=# select count(*) from fps2 f1 join fps2 f2 on  
f1.fingerprint = f2.fingerprint;

count
---
  133

where = is a match operator that returns rows where the Leaf key  
comparison is > .98 (on the scaled probability score pretty much  
exactly equal).  The above query is using the index:


postgres=# explain select count(*) from fps2 f1 join fps2 f2 on  
f1.fingerprint ~= f2.fingerprint;

  QUERY PLAN

Aggregate  (cost=1173.67..1173.68 rows=1 width=0)
  ->  Nested Loop  (cost=0.00..1170.54 rows=1250 width=0)
        ->  Seq Scan on fps2 f1  (cost=0.00..105.00 rows=500 width=32)
        ->  Index Scan using fps2_fingerprint_ix