Re: [HACKERS] -head build error report
Joshua D. Drake wrote:
> Linux jd-laptop 2.6.24-19-generic #1 SMP Wed Jun 4 16:35:01 UTC 2008 i686 GNU/Linux
> Using built-in specs.
> Target: i486-linux-gnu
> Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr --enable-targets=all --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
> Thread model: posix
> gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
>
> de -D_GNU_SOURCE -c -o gistget.o gistget.c
> In file included from gistget.c:21:
> ../../../../src/include/pgstat.h:15:36: error: portability/instr_time.h: No such file or directory
> In file included from gistget.c:21:
> ../../../../src/include/pgstat.h:326: error: expected specifier-qualifier-list before ‘instr_time’
> ../../../../src/include/pgstat.h:566: error: expected specifier-qualifier-list before ‘instr_time’
> make[4]: *** [gistget.o] Error 1
> make[4]: Leaving directory `/home/jd/repos/pgsql/src/backend/access/gist'
> make[3]: *** [gist-recursive] Error 2
> make[3]: Leaving directory `/home/jd/repos/pgsql/src/backend/access'
> make[2]: *** [access-recursive] Error 2
> make[2]: Leaving directory `/home/jd/repos/pgsql/src/backend'
> make[1]: *** [all] Error 2
> make[1]: Leaving directory `/home/jd/repos/pgsql/src'
> make: *** [all] Error 2

Looks like you do not have the right CVS flags set. You need to use -d when you do a cvs update, or you won't pick up new directories. You should really have this set in your .cvsrc file.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
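For anyone bitten by the same thing: a minimal .cvsrc along the lines Andrew describes lists default options per CVS subcommand, one per line (the -P flags here are a common companion, not something Andrew specifically asked for):

```
# ~/.cvsrc -- default options applied to each cvs subcommand
update -dP      # -d: create newly added directories; -P: prune empty ones
checkout -P
diff -u
```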
Re: [HACKERS] Not valid dump [8.2.9, 8.3.1]
On Fri, Jun 20, 2008 at 4:37 PM, Tom Lane [EMAIL PROTECTED] wrote:
> Gaetano Mendola [EMAIL PROTECTED] writes:
>> We have lately faced dumps that are not valid; the bug can be replicated using an 8.2.9 or an 8.3.1 server. These are the steps to create the database that will generate a not-valid dump:
>
> This is a bug in your function: it will not work if the search path doesn't contain the public schema. You'd be best advised to make it qualify the reference to t_public explicitly.

Yes, that's the way we are fixing it. Still, it leaves a bitter taste that I was able to create a working database instance that doesn't generate a valid dump.

> (Of course you realize that referencing any table at all in an immutable function is probably a mortal sin...)

Yes Tom, I know. In our case that table is a lookup table: no one updates, deletes, or inserts data in it, so from my point of view it is as if I had declared a static array inside the function body.

--
cpp-today.blogspot.com
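Tom's suggested fix, sketched here on a made-up function (the thread names the table t_public but never shows the function, so the name, signature, and columns below are invented for illustration):

```sql
-- Sketch only: schema-qualifying the lookup-table reference removes the
-- dependency on search_path containing "public".
CREATE OR REPLACE FUNCTION lookup_value(k integer) RETURNS integer AS $$
    SELECT val FROM public.t_public WHERE id = $1;   -- qualified reference
$$ LANGUAGE sql IMMUTABLE;
```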
Re: [HACKERS] -head build error report
On Sat, 2008-06-21 at 07:53 -0400, Andrew Dunstan wrote:
> Looks like you do not have the right CVS flags set. You need to use -d when you do a cvs update or you won't pick up new directories. You should really have this set in your .cvsrc file.

Sorry, this is the only project I use dead software for :P (I didn't even know there was such a thing as a .cvsrc). Thanks for the tip :)

Joshua D. Drake
Re: [HACKERS] [GENERAL] Fragments in tsearch2 headline
I have attached an updated patch with the following changes:

1. Respects ShortWord and MinWords.
2. Uses hlCover instead of Cover.
3. Does not store norm (or lexeme) for headline marking.
4. Removes ts_rank.h.
5. Earlier it was counting even NONWORDTOKEN in the headline; now it only counts the actual words and excludes spaces etc.

I have also changed the NumFragments option to MaxFragments, as there may not be enough covers to display NumFragments.

Another change I was considering: right now, if the cover size exceeds max_words, I just cut the trailing words. Instead, we could split the cover into more fragments such that each fragment contains a few query words. Each fragment would then not contain all the query words, but the headline would show more occurrences of query words. I would like to know your opinion on this.

-Sushant.

On Thu, 2008-06-05 at 20:21 +0400, Teodor Sigaev wrote:
> > A couple of caveats:
> > 1. ts_headline testing was done with current cvs head whereas headline_with_fragments was done with postgres 8.3.1.
> > 2. For headline_with_fragments, TSVector for the document was obtained by joining with another table.
>
> Are these differences understandable? That is a possible situation, because ts_headline has several criteria for the 'best' covers - length, number of words from the query, good words at the beginning and at the end of the headline - while your fragment algorithm takes care only of the total number of words in all covers. It's not very good, but it's acceptable, I think. Headline (and ranking too) has no formal rules defining whether it is good or bad - just people's opinions.
>
> Another possible reason: the original algorithm looked at all covers trying to find the best one, while your algorithm tries to find just the shortest covers to fill a headline. But it's very desirable to use ShortWord - it's not very comfortable for the user if one option produces an unobvious side effect with another one.
If you think these caveats are the reasons, or there is something I am missing, then I can repeat the entire experiment under exactly the same conditions.

> An interesting test for me is comparing hlCover with Cover in your patch, i.e. develop a patch which uses hlCover instead of Cover and compare the old patch with the new one.

Index: src/backend/tsearch/wparser_def.c
===================================================================
RCS file: /home/sushant/devel/pgsql-cvs/pgsql/src/backend/tsearch/wparser_def.c,v
retrieving revision 1.14
diff -c -r1.14 wparser_def.c
*** src/backend/tsearch/wparser_def.c	1 Jan 2008 19:45:52 -	1.14
--- src/backend/tsearch/wparser_def.c	21 Jun 2008 07:59:02 -
***************
*** 1684,1701 ****
  	return false;
  }
  
! Datum
! prsd_headline(PG_FUNCTION_ARGS)
  {
! 	HeadlineParsedText *prs = (HeadlineParsedText *) PG_GETARG_POINTER(0);
! 	List	   *prsoptions = (List *) PG_GETARG_POINTER(1);
! 	TSQuery		query = PG_GETARG_TSQUERY(2);
! 	/* from opt + start and end tag */
! 	int			min_words = 15;
! 	int			max_words = 35;
! 	int			shortword = 3;
  	int			p = 0,
  				q = 0;
  	int			bestb = -1,
--- 1684,1891 ----
  	return false;
  }
  
! static void
! mark_fragment(HeadlineParsedText *prs, int highlight, int startpos, int endpos)
  {
! 	int			i;
! 	char	   *coversep = ...;
! 	int			coverlen = strlen(coversep);
! 
! 	for (i = startpos; i <= endpos; i++)
! 	{
! 		if (prs->words[i].item)
! 			prs->words[i].selected = 1;
! 		if (highlight == 0)
! 		{
! 			if (HLIDIGNORE(prs->words[i].type))
! 				prs->words[i].replace = 1;
! 		}
! 		else
! 		{
! 			if (XMLHLIDIGNORE(prs->words[i].type))
! 				prs->words[i].replace = 1;
! 		}
! 
! 		prs->words[i].in = (prs->words[i].repeated) ? 0 : 1;
! 	}
! 	/* add cover separators if needed */
! 	if (startpos > 0 && strncmp(prs->words[startpos-1].word, coversep,
! 								prs->words[startpos-1].len) != 0)
! 	{
! 		prs->words[startpos-1].word = repalloc(prs->words[startpos-1].word, sizeof(char) * coverlen);
! 		prs->words[startpos-1].in = 1;
! 		prs->words[startpos-1].len = coverlen;
! 		memcpy(prs->words[startpos-1].word, coversep, coverlen);
! 	}
! 	if (endpos-1 < prs->curwords && strncmp(prs->words[startpos-1].word, coversep,
! 								prs->words[startpos-1].len) != 0)
! 	{
! 		prs->words[endpos+1].word = repalloc(prs->words[endpos+1].word, sizeof(char) * coverlen);
! 		prs->words[endpos+1].in = 1;
! 		memcpy(prs->words[endpos+1].word, coversep, coverlen);
! 	}
! }
! 
! typedef struct
! {
! 	int4		startpos;
! 	int4		endpos;
! 	int2		in;
! 	int2		excluded;
! } CoverPos;
! 
! static void
! mark_hl_fragments(HeadlineParsedText *prs, TSQuery query, int highlight,
! 				  int shortword, int min_words,
! 				  int max_words, int max_fragments)
! {
! 	int4		curlen, coverlen, i, f, num_f;
! 	int4		stretch, maxstretch;
! 
! 	int4		startpos = 0,
! 				endpos = 0,
! 				p = 0,
! 				q = 0;
! 
! 	int4		numcovers = 0,
! 				maxcovers = 32;
! 
! 	int4
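For context, the options discussed above all travel through ts_headline's options string. With this patch applied, a call might look like the following sketch (the table "docs" and column "body" are invented for illustration; MaxFragments is the option proposed by this patch, while MinWords, MaxWords, and ShortWord already exist in 8.3):

```sql
-- Hypothetical invocation assuming the MaxFragments patch is applied.
SELECT ts_headline('english', body,
                   to_tsquery('english', 'fragment & cover'),
                   'MaxFragments=3, MinWords=15, MaxWords=35, ShortWord=3')
FROM docs;
```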
Re: [HACKERS] -head build error report
Joshua D. Drake wrote:
> On Sat, 2008-06-21 at 07:53 -0400, Andrew Dunstan wrote:
>> Looks like you do not have the right CVS flags set. You need to use -d when you do a cvs update or you won't pick up new directories. You should really have this set in your .cvsrc file.
>
> Sorry, this is the only project I use dead software for :P (I didn't even know there was such a thing as a .cvsrc). Thanks for the tip :)

http://wiki.postgresql.org/wiki/Working_with_CVS has even more :P

Stefan
Re: [HACKERS] Not valid dump [8.2.9, 8.3.1]
Gaetano Mendola [EMAIL PROTECTED] writes:
> On Fri, Jun 20, 2008 at 4:37 PM, Tom Lane [EMAIL PROTECTED] wrote:
>> (Of course you realize that referencing any table at all in an immutable function is probably a mortal sin...)
>
> Yes Tom, I know. In our case that table is a lookup table: no one updates, deletes, or inserts data in it, so from my point of view it is as if I had declared a static array inside the function body.

No, you'd like to imagine that it is a static array, but that technique is just a foot-gun waiting to bite you. As an example, since pg_dump has no idea that the function has any dependency on the lookup table, there is nothing to stop it from trying to create the index before it has populated the lookup table. (I think it probably works for you at the moment because pg_dump tends to fill all the tables before creating any indexes, but the planned changes to support multi-threaded restores will certainly break your case.)

			regards, tom lane
Re: [HACKERS] Not valid dump [8.2.9, 8.3.1]
Tom Lane wrote:
> Gaetano Mendola [EMAIL PROTECTED] writes:
>> On Fri, Jun 20, 2008 at 4:37 PM, Tom Lane [EMAIL PROTECTED] wrote:
>>> (Of course you realize that referencing any table at all in an immutable function is probably a mortal sin...)
>>
>> Yes Tom, I know. In our case that table is a lookup table: no one updates, deletes, or inserts data in it, so from my point of view it is as if I had declared a static array inside the function body.
>
> No, you'd like to imagine that it is a static array, but that technique is just a foot-gun waiting to bite you. As an example, since pg_dump has no idea that the function has any dependency on the lookup table, there is nothing to stop it from trying to create the index before it has populated the lookup table. (I think it probably works for you at the moment because pg_dump tends to fill all the tables before creating any indexes, but the planned changes to support multi-threaded restores will certainly break your case.)

Purely static lookup tables can also often be replaced by enum types, often with significant efficiency gains.

cheers

andrew
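Andrew's enum suggestion, sketched on an invented example (none of these names come from the thread):

```sql
-- Hypothetical: a never-modified lookup table of statuses replaced by an
-- enum type, avoiding both the join and the pg_dump dependency hazard.
CREATE TYPE order_status AS ENUM ('pending', 'shipped', 'delivered');

CREATE TABLE orders (
    id      serial PRIMARY KEY,
    status  order_status NOT NULL    -- stored inline, no lookup table needed
);
```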
Re: [HACKERS] Hash index build patch has *worse* performance at small table sizes
Did we ever do anything about this?

---------------------------------------------------------------------------

Tom Lane wrote:
> I've been reviewing the hash index build patch submitted here:
> http://archives.postgresql.org/pgsql-patches/2007-10/msg00154.php
>
> Although it definitely helps on large indexes, it's actually counterproductive on not-so-large ones. The test case I'm using is random integers generated like this:
>
> 	create table foo as select (random() * N)::int as f1
> 	  from generate_series(1,N);
> 	select count(*) from foo;	-- force hint bit updates
> 	checkpoint;
>
> then timing
>
> 	create index fooi on foo using hash(f1);
>
> Using all-default configuration settings on some not-very-new hardware, at N = 1E6 I see
>
> 	8.3.1:						30 sec
> 	With pre-expansion of index (CVS HEAD):		24 sec
> 	With sorting:					72 sec
> 	To build a btree index on same data:		34 sec
>
> Now this isn't amazingly surprising, because the original argument for doing sorting was to improve locality of access to the index during the build, and that only matters if you've got an index significantly bigger than memory. If the index fits in RAM then the sort is pure overhead.
>
> The obvious response to this is to use the sorting approach only when the estimated index size exceeds some threshold. One possible choice of threshold would be shared_buffers (or temp_buffers for a temp index), but I think that is probably too conservative, because in most scenarios the kernel's disk cache is available too. Plus you can't tweak that setting without a postmaster restart. I'm tempted to use effective_cache_size, which attempts to measure an appropriate number and can be set locally within the session doing the CREATE INDEX if necessary. Or we could invent a new GUC parameter, but that is probably overkill.
>
> Comments?
> 			regards, tom lane

--
  Bruce Momjian  [EMAIL PROTECTED]	http://momjian.us
  EnterpriseDB				http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Hash index build patch has *worse* performance at small table sizes
Bruce Momjian [EMAIL PROTECTED] writes:
> Did we ever do anything about this?

Seems to be in there in CVS HEAD:

    /*
     * If we just insert the tuples into the index in scan order, then
     * (assuming their hash codes are pretty random) there will be no locality
     * of access to the index, and if the index is bigger than available RAM
     * then we'll thrash horribly.  To prevent that scenario, we can sort the
     * tuples by (expected) bucket number.  However, such a sort is useless
     * overhead when the index does fit in RAM.  We choose to sort if the
     * initial index size exceeds effective_cache_size.
     *
     * NOTE: this test will need adjustment if a bucket is ever different
     * from one page.
     */
    if (num_buckets >= (uint32) effective_cache_size)
        buildstate.spool = _h_spoolinit(index, num_buckets);
    else
        buildstate.spool = NULL;

			regards, tom lane
[HACKERS] PG Pool Party (formerly MomjiCon) date set
Bruce Momjian is having a pool party and barbecue at his house on Saturday, August 9, for Postgres community members in the Philadelphia area and nearby states. Families are welcome. Food and drink will be provided, and, of course, swimming is encouraged. Please come anytime between 2pm and 7pm.

For directions see:

	http://momjian.us/main/directions.html

If you are thinking of attending, please email [EMAIL PROTECTED]

--
  Bruce Momjian  [EMAIL PROTECTED]	http://momjian.us
  EnterpriseDB				http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +