Re: [HACKERS] -head build error report

2008-06-21 Thread Andrew Dunstan




Joshua D. Drake wrote:

Linux jd-laptop 2.6.24-19-generic #1 SMP Wed Jun 4 16:35:01 UTC 2008
i686 GNU/Linux

Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --enable-nls
--with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2
--enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc
--enable-mpfr --enable-targets=all --enable-checking=release
--build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)



de -D_GNU_SOURCE   -c -o gistget.o gistget.c
In file included from gistget.c:21:
../../../../src/include/pgstat.h:15:36: error: portability/instr_time.h:
No such file or directory
In file included from gistget.c:21:
../../../../src/include/pgstat.h:326: error: expected
specifier-qualifier-list before ‘instr_time’
../../../../src/include/pgstat.h:566: error: expected
specifier-qualifier-list before ‘instr_time’
make[4]: *** [gistget.o] Error 1
make[4]: Leaving directory
`/home/jd/repos/pgsql/src/backend/access/gist'
make[3]: *** [gist-recursive] Error 2
make[3]: Leaving directory `/home/jd/repos/pgsql/src/backend/access'
make[2]: *** [access-recursive] Error 2
make[2]: Leaving directory `/home/jd/repos/pgsql/src/backend'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/home/jd/repos/pgsql/src'
make: *** [all] Error 2

  



Looks like you do not have the right CVS flags set. You need to use -d 
when you do a cvs update or you won't pick up new directories.


You should really have this set in your .cvsrc file.
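
For example, a minimal ~/.cvsrc along these lines (each line supplies default
options for the named subcommand; -d picks up new directories, -P prunes empty
ones, -u selects unified diffs):

    update -d -P
    checkout -P
    diff -u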

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Not valid dump [8.2.9, 8.3.1]

2008-06-21 Thread Gaetano Mendola
On Fri, Jun 20, 2008 at 4:37 PM, Tom Lane [EMAIL PROTECTED] wrote:

 Gaetano Mendola [EMAIL PROTECTED] writes:
  we have lately been getting invalid dumps; the bug can be reproduced
  using an 8.2.9 or an 8.3.1 server.

  These are the steps to create the database that will generate an
  invalid dump:

 This is a bug in your function: it will not work if the search path
 doesn't contain the public schema.  You'd be best advised to make it
 qualify the reference to t_public explicitly.


Yes, that's the way we are fixing it. Still, it leaves a bitter taste that it
is possible to create a working database instance that doesn't generate a
valid dump.
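
For anyone following along, a minimal sketch of the explicit qualification Tom
suggests (the function name and column names here are hypothetical; the point
is the schema-qualified reference to public.t_public):

    CREATE OR REPLACE FUNCTION lookup_val(k text) RETURNS integer AS $$
        SELECT val FROM public.t_public WHERE key = $1;
    $$ LANGUAGE sql IMMUTABLE;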

(Of course you realize that referencing any table at all in an
 immutable function is probably a mortal sin...)


Yes, Tom, I know. In our case that table is a lookup table; no one updates,
deletes, or inserts data in it, so from my point of view it is as if I had
declared a static array inside the function.

-- 
cpp-today.blogspot.com


Re: [HACKERS] -head build error report

2008-06-21 Thread Joshua D. Drake


On Sat, 2008-06-21 at 07:53 -0400, Andrew Dunstan wrote:

 
 
 Looks like you do not have the right CVS flags set. You need to use -d 
 when you do a cvs update or you won't pick up new directories.
 
 You should really have this set in your .cvsrc file.

Sorry, this is the only project I use dead software for :P (I didn't
even know there was such a thing as a .cvsrc). 

Thanks for the tip :)

Joshua D. Drake

 
 cheers
 
 andrew
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Fragments in tsearch2 headline

2008-06-21 Thread Sushant Sinha
I have attached an updated patch with the following changes:

1. Respects ShortWord and MinWords
2. Uses hlCover instead of Cover
3. Does not store norm (or lexeme) for headline marking
4. Removes ts_rank.h
5. Earlier it counted even NONWORDTOKEN items toward the headline; now it
only counts the actual words and excludes spaces etc.

I have also changed the NumFragments option to MaxFragments, since there may not
be enough covers to display NumFragments.
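
For reference, a usage sketch of how the option would be passed (the column
name is hypothetical, and the exact option spelling is whatever the final
patch accepts; the MinWords/MaxWords/ShortWord defaults follow the existing code):

    SELECT ts_headline('english', body,
                       to_tsquery('english', 'query & words'),
                       'MaxFragments=3, MinWords=15, MaxWords=35, ShortWord=3');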

Another change that I was thinking:

Right now, if the cover size > max_words then I just cut the trailing words.
Instead I was thinking that we should split the cover into more fragments such
that each fragment contains a few query words. Then each fragment will not
contain all the query words, but the headline will show more occurrences of
the query words. I would like to know your opinion on this.

-Sushant.

On Thu, 2008-06-05 at 20:21 +0400, Teodor Sigaev wrote:
  A couple of caveats: 
  
  1. ts_headline testing was done with current CVS head, whereas
  headline_with_fragments was done with Postgres 8.3.1.
  2. For headline_with_fragments, the TSVector for the document was obtained
  by joining with another table.
  Are these differences understandable?
 
 That is a possible explanation, because ts_headline has several criteria for
 the 'best' covers - length, number of words from the query, good words at the
 beginning and at the end of the headline - while your fragment algorithm only
 takes the total number of words in all covers into account. It's not very
 good, but it's acceptable, I think. Headlines (and ranking too) have no formal
 rules that define what is good or bad - just people's opinions.
 
 Another possible reason: the original algorithm looks at all covers trying to
 find the best one, while your algorithm just tries to find the shortest covers
 that fill a headline.
 
 But it's very desirable to honor ShortWord - it's not comfortable for the user
 if one option produces a non-obvious side effect on another one.
 
  If you think these caveats are the reasons, or there is something I am
  missing, then I can repeat the entire experiment under exactly the same
  conditions.
 
 The interesting test for me would be to compare hlCover with Cover in your
 patch, i.e. develop a version of the patch which uses hlCover instead of
 Cover and compare the old patch with the new one.
Index: src/backend/tsearch/wparser_def.c
===
RCS file: /home/sushant/devel/pgsql-cvs/pgsql/src/backend/tsearch/wparser_def.c,v
retrieving revision 1.14
diff -c -r1.14 wparser_def.c
*** src/backend/tsearch/wparser_def.c	1 Jan 2008 19:45:52 -	1.14
--- src/backend/tsearch/wparser_def.c	21 Jun 2008 07:59:02 -
***
*** 1684,1701 
  	return false;
  }
  
! Datum
! prsd_headline(PG_FUNCTION_ARGS)
  {
! 	HeadlineParsedText *prs = (HeadlineParsedText *) PG_GETARG_POINTER(0);
! 	List	   *prsoptions = (List *) PG_GETARG_POINTER(1);
! 	TSQuery		query = PG_GETARG_TSQUERY(2);
  
! 	/* from opt + start and and tag */
! 	int			min_words = 15;
! 	int			max_words = 35;
! 	int			shortword = 3;
  
  	int			p = 0,
  q = 0;
  	int			bestb = -1,
--- 1684,1891 
  	return false;
  }
  
! static void 
! mark_fragment(HeadlineParsedText *prs, int highlight, int startpos, int endpos)
  {
! 	int   i;
! 	char *coversep = "...";
! 	int   coverlen = strlen(coversep);
  
! 	for (i = startpos; i <= endpos; i++)
! 	{
! 		if (prs->words[i].item)
! 			prs->words[i].selected = 1;
! 		if (highlight == 0)
! 		{
! 			if (HLIDIGNORE(prs->words[i].type))
! 				prs->words[i].replace = 1;
! 		}
! 		else
! 		{
! 			if (XMLHLIDIGNORE(prs->words[i].type))
! 				prs->words[i].replace = 1;
! 		}
! 
! 		prs->words[i].in = (prs->words[i].repeated) ? 0 : 1;
! 	}
! 	/* add cover separators if needed */ 
! 	if (startpos > 0 && strncmp(prs->words[startpos-1].word, coversep, 
! 		prs->words[startpos-1].len) != 0)
! 	{
! 		prs->words[startpos-1].word = repalloc(prs->words[startpos-1].word, sizeof(char) * coverlen);
! 		prs->words[startpos-1].in   = 1;
! 		prs->words[startpos-1].len  = coverlen;
! 		memcpy(prs->words[startpos-1].word, coversep, coverlen);
! 	}
! 	if (endpos-1 < prs->curwords && strncmp(prs->words[startpos-1].word, coversep,
! 		prs->words[startpos-1].len) != 0)
! 	{
! 		prs->words[endpos+1].word = repalloc(prs->words[endpos+1].word, sizeof(char) * coverlen);
! 		prs->words[endpos+1].in   = 1;
! 		memcpy(prs->words[endpos+1].word, coversep, coverlen);
! 	}
! }
! 
! typedef struct 
! {
! 	int4 startpos;
! 	int4 endpos;
! 	int2 in;
! 	int2 excluded;
! } CoverPos;
! 
! 
! static void
! mark_hl_fragments(HeadlineParsedText *prs, TSQuery query, int highlight,
! int shortword, int min_words, 
! 			int max_words, int max_fragments)
! {
! 	int4   	curlen, coverlen, i, f, num_f;
! 	int4		stretch, maxstretch;
! 
! 	int4   	startpos = 0, 
!  			endpos   = 0,
! 			p= 0,
! 			q= 0;
! 
! 	int4		numcovers = 0, 
! 			maxcovers = 32;
! 
! 	int4

Re: [HACKERS] -head build error report

2008-06-21 Thread Stefan Kaltenbrunner

Joshua D. Drake wrote:


On Sat, 2008-06-21 at 07:53 -0400, Andrew Dunstan wrote:
   



Looks like you do not have the right CVS flags set. You need to use -d 
when you do a cvs update or you won't pick up new directories.


You should really have this set in your .cvsrc file.


Sorry, this is the only project I use dead software for :P (I didn't
even know there was such a thing as a .cvsrc). 


Thanks for the tip :)


http://wiki.postgresql.org/wiki/Working_with_CVS has even more :P


Stefan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Not valid dump [8.2.9, 8.3.1]

2008-06-21 Thread Tom Lane
Gaetano Mendola [EMAIL PROTECTED] writes:
 On Fri, Jun 20, 2008 at 4:37 PM, Tom Lane [EMAIL PROTECTED] wrote:
 (Of course you realize that referencing any table at all in an
 immutable function is probably a mortal sin...)

 Yes, Tom, I know. In our case that table is a lookup table; no one updates,
 deletes, or inserts data in it, so from my point of view it is as if I had
 declared a static array inside the function.

No, you'd like to imagine that it is a static array, but that technique
is just a foot-gun waiting to bite you.  As an example, since pg_dump
has no idea that that function has any dependency on the lookup table,
there is nothing to stop it from trying to create the index before it's
populated the lookup table.

(I think it probably works for you at the moment because pg_dump tends
to fill all the tables before creating any indexes, but the planned
changes to support multi-threaded restores will certainly break your
case.)
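
To make that concrete, a hypothetical sketch of the pattern under discussion
(names made up): if the schema contains something like

    CREATE INDEX some_table_idx ON some_table (lookup_val(key_col));

where lookup_val() is an immutable function that reads public.t_public, then
nothing in the dump records that building some_table_idx requires t_public to
be populated first.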

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Not valid dump [8.2.9, 8.3.1]

2008-06-21 Thread Andrew Dunstan



Tom Lane wrote:

Gaetano Mendola [EMAIL PROTECTED] writes:
  

On Fri, Jun 20, 2008 at 4:37 PM, Tom Lane [EMAIL PROTECTED] wrote:
(Of course you realize that referencing any table at all in an


immutable function is probably a mortal sin...)
  


  

Yes, Tom, I know. In our case that table is a lookup table; no one updates,
deletes, or inserts data in it, so from my point of view it is as if I had
declared a static array inside the function.



No, you'd like to imagine that it is a static array, but that technique
is just a foot-gun waiting to bite you.  As an example, since pg_dump
has no idea that that function has any dependency on the lookup table,
there is nothing to stop it from trying to create the index before it's
populated the lookup table.

(I think it probably works for you at the moment because pg_dump tends
to fill all the tables before creating any indexes, but the planned
changes to support multi-threaded restores will certainly break your
case.)


  


Purely static lookup tables can also often be replaced by enum types, usually 
with significant efficiency gains.
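
For example, a minimal sketch using 8.3's enum support (the type and table
names here are made up for illustration):

    CREATE TYPE order_status AS ENUM ('pending', 'shipped', 'delivered');

    CREATE TABLE orders (
        id      integer PRIMARY KEY,
        status  order_status NOT NULL
    );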


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash index build patch has *worse* performance at small table sizes

2008-06-21 Thread Bruce Momjian

Did we ever do anything about this?

---

Tom Lane wrote:
 I've been reviewing the hash index build patch submitted here:
 http://archives.postgresql.org/pgsql-patches/2007-10/msg00154.php
 
 Although it definitely helps on large indexes, it's actually
 counterproductive on not-so-large ones.  The test case I'm using
 is random integers generated like this:
   create table foo as select (random() * N)::int as f1
 from generate_series(1,N);
   select count(*) from foo; -- force hint bit updates
   checkpoint;
 then timing
   create index fooi on foo using hash(f1);
 
 Using all-default configuration settings on some not-very-new hardware,
 at N = 1E6 I see
 
 8.3.1:                                  30 sec
 With pre-expansion of index (CVS HEAD): 24 sec
 With sorting:                           72 sec
 To build a btree index on same data:    34 sec
 
 Now this isn't amazingly surprising, because the original argument for
 doing sorting was to improve locality of access to the index during
 the build, and that only matters if you've got an index significantly
 bigger than memory.  If the index fits in RAM then the sort is pure
 overhead.
 
 The obvious response to this is to use the sorting approach only when
 the estimated index size exceeds some threshold.  One possible choice of
 threshold would be shared_buffers (or temp_buffers for a temp index)
 but I think that is probably too conservative, because in most scenarios
 the kernel's disk cache is available too.  Plus you can't tweak that
 setting without a postmaster restart.  I'm tempted to use
 effective_cache_size, which attempts to measure an appropriate number
 and can be set locally within the session doing the CREATE INDEX if
 necessary.  Or we could invent a new GUC parameter, but that is probably
 overkill.
 
 Comments?
 
   regards, tom lane
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash index build patch has *worse* performance at small table sizes

2008-06-21 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Did we ever do anything about this?

Seems to be in there in CVS HEAD:

/*
 * If we just insert the tuples into the index in scan order, then
 * (assuming their hash codes are pretty random) there will be no locality
 * of access to the index, and if the index is bigger than available RAM
 * then we'll thrash horribly.  To prevent that scenario, we can sort the
 * tuples by (expected) bucket number.  However, such a sort is useless
 * overhead when the index does fit in RAM.  We choose to sort if the
 * initial index size exceeds effective_cache_size.
 *
 * NOTE: this test will need adjustment if a bucket is ever different
 * from one page.
 */
if (num_buckets >= (uint32) effective_cache_size)
buildstate.spool = _h_spoolinit(index, num_buckets);
else
buildstate.spool = NULL;
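
For context, a usage sketch (the value is hypothetical): since
effective_cache_size is an ordinary GUC, it can be raised or lowered just for
the session doing the build to steer it toward or away from the sorted path:

    SET effective_cache_size = '2GB';
    CREATE INDEX fooi ON foo USING hash (f1);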


regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PG Pool Party (formerly MomjiCon) date set

2008-06-21 Thread Bruce Momjian
Bruce Momjian is having a pool party and barbecue at his house on
Saturday, August 9, for Postgres community members in the Philadelphia
area and nearby states. Families are welcome. Food and drink will be
provided, and, of course, swimming is encouraged.  Please come anytime
between 2pm and 7pm.

For directions see:

http://momjian.us/main/directions.html

If you are thinking of attending, please email [EMAIL PROTECTED]

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers