Re: [PATCHES] psql slash# command

2008-04-10 Thread Sibte Abbas
On Thu, Apr 3, 2008 at 7:54 PM, Sibte Abbas <[EMAIL PROTECTED]> wrote:

> On Sun, Mar 30, 2008 at 3:09 PM, Tom Lane <[EMAIL PROTECTED]> 
> wrote:
>
>
> > I wonder whether it wouldn't be safer and more convenient if we defined
> > '\# n' as pulling command n into the edit buffer, rather than
> > immediately executing it.  Actual execution is only a  away,
> > but this definition would allow you to edit the command a bit more
> > before you execute it --- including \e to use an editor.  It also
> > closes the loop in terms of providing some confidence that you typed
> > the number you should have typed.
> >
>
> This makes more sense and also appears to be much safer. I will start
> modifying the patch as per this approach now.
>
>
Based on your feedback I have modified the attached patch as follows:

1) \# n opens command n into the edit buffer.

2) A new psql variable; SHOW_LINE_NO is added, which is consulted by the \s
command.
If it is set, \s prefixs each line of its output with an incrementing
line number.

regards,
-- Sibte Abbas
? GNUmakefile
? config.log
? config.status
? mydiff.diff
? pgsql.log
? psql_slash#_v2.patch
? psql_slash#_v3.patch
? src/Makefile.global
? src/cscope.out
? src/backend/postgres
? src/backend/access/objfiles.txt
? src/backend/access/common/objfiles.txt
? src/backend/access/gin/objfiles.txt
? src/backend/access/gist/objfiles.txt
? src/backend/access/hash/objfiles.txt
? src/backend/access/heap/objfiles.txt
? src/backend/access/index/objfiles.txt
? src/backend/access/nbtree/objfiles.txt
? src/backend/access/transam/objfiles.txt
? src/backend/bootstrap/objfiles.txt
? src/backend/catalog/objfiles.txt
? src/backend/catalog/postgres.bki
? src/backend/catalog/postgres.description
? src/backend/catalog/postgres.shdescription
? src/backend/commands/objfiles.txt
? src/backend/executor/objfiles.txt
? src/backend/lib/objfiles.txt
? src/backend/libpq/objfiles.txt
? src/backend/main/objfiles.txt
? src/backend/nodes/objfiles.txt
? src/backend/optimizer/objfiles.txt
? src/backend/optimizer/geqo/objfiles.txt
? src/backend/optimizer/path/objfiles.txt
? src/backend/optimizer/plan/objfiles.txt
? src/backend/optimizer/prep/objfiles.txt
? src/backend/optimizer/util/objfiles.txt
? src/backend/parser/objfiles.txt
? src/backend/port/objfiles.txt
? src/backend/postmaster/objfiles.txt
? src/backend/regex/objfiles.txt
? src/backend/rewrite/objfiles.txt
? src/backend/snowball/libdict_snowball.so.0.0
? src/backend/snowball/snowball_create.sql
? src/backend/storage/objfiles.txt
? src/backend/storage/buffer/objfiles.txt
? src/backend/storage/file/objfiles.txt
? src/backend/storage/freespace/objfiles.txt
? src/backend/storage/ipc/objfiles.txt
? src/backend/storage/large_object/objfiles.txt
? src/backend/storage/lmgr/objfiles.txt
? src/backend/storage/page/objfiles.txt
? src/backend/storage/smgr/objfiles.txt
? src/backend/tcop/objfiles.txt
? src/backend/tsearch/objfiles.txt
? src/backend/utils/objfiles.txt
? src/backend/utils/probes.h
? src/backend/utils/adt/objfiles.txt
? src/backend/utils/cache/objfiles.txt
? src/backend/utils/error/objfiles.txt
? src/backend/utils/fmgr/objfiles.txt
? src/backend/utils/hash/objfiles.txt
? src/backend/utils/init/objfiles.txt
? src/backend/utils/mb/objfiles.txt
? src/backend/utils/mb/conversion_procs/conversion_create.sql
? src/backend/utils/mb/conversion_procs/ascii_and_mic/libascii_and_mic.so.0.0
? src/backend/utils/mb/conversion_procs/cyrillic_and_mic/libcyrillic_and_mic.so.0.0
? src/backend/utils/mb/conversion_procs/euc_cn_and_mic/libeuc_cn_and_mic.so.0.0
? src/backend/utils/mb/conversion_procs/euc_jis_2004_and_shift_jis_2004/libeuc_jis_2004_and_shift_jis_2004.so.0.0
? src/backend/utils/mb/conversion_procs/euc_jp_and_sjis/libeuc_jp_and_sjis.so.0.0
? src/backend/utils/mb/conversion_procs/euc_kr_and_mic/libeuc_kr_and_mic.so.0.0
? src/backend/utils/mb/conversion_procs/euc_tw_and_big5/libeuc_tw_and_big5.so.0.0
? src/backend/utils/mb/conversion_procs/latin2_and_win1250/liblatin2_and_win1250.so.0.0
? src/backend/utils/mb/conversion_procs/latin_and_mic/liblatin_and_mic.so.0.0
? src/backend/utils/mb/conversion_procs/utf8_and_ascii/libutf8_and_ascii.so.0.0
? src/backend/utils/mb/conversion_procs/utf8_and_big5/libutf8_and_big5.so.0.0
? src/backend/utils/mb/conversion_procs/utf8_and_cyrillic/libutf8_and_cyrillic.so.0.0
? src/backend/utils/mb/conversion_procs/utf8_and_euc_cn/libutf8_and_euc_cn.so.0.0
? src/backend/utils/mb/conversion_procs/utf8_and_euc_jis_2004/libutf8_and_euc_jis_2004.so.0.0
? src/backend/utils/mb/conversion_procs/utf8_and_euc_jp/libutf8_and_euc_jp.so.0.0
? src/backend/utils/mb/conversion_procs/utf8_and_euc_kr/libutf8_and_euc_kr.so.0.0
? src/backend/utils/mb/conversion_procs/utf8_and_euc_tw/libutf8_and_euc_tw.so.0.0
? src/backend/utils/mb/conversion_procs/utf8_and_gb18030/libutf8_and_gb18030.so.0.0
? src/backend/utils/mb/conversion_procs/utf8_and_gbk/libutf8_and_gbk.so.0.0
? src/backend/utils/mb/con

Re: [PATCHES] Remove FATAL from pg_lzdecompress

2008-04-10 Thread Zdenek Kotala

Tom Lane napsal(a):

Zdenek Kotala <[EMAIL PROTECTED]> writes:
I attach patch which adds boundaries check and memory overwriting 
protection when compressed data are corrupted.



Good point. Is there plan to applied also on other branch?


I wasn't planning to back-patch it.  Given the lack of field reports
of compressed-data problems, it seemed to me that the risk of breaking
something was larger than the chance of helping someone.  We could
reconsider this after the code has been in HEAD awhile, perhaps.


Tom,

one of our customer with 3TB table it uses now in production (8.2) awhile (2 
weeks) and it works pretty well. He had a corrupted data in TOASTed table and 
now his system is stable without random crashes. I plan to use this patch in 
official Solaris build, but I prefer do not have differences between main stream 
and solaris binaries. Would be possible to backported this patch?


Thanks Zdenek

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Including Snapshot Info with Indexes

2008-04-10 Thread Bruce Momjian

I have added URLs to your patch to the TODO list:

* Allow data to be pulled directly from indexes

---

Gokulakannan Somasundaram wrote:
> Hi,
> I would like to present the first patch. It currently has the following
> restrictions
> a) It does not support any functional indexes.
> b) It supports queries like select count(1) from table where (restrictions
> from indexed columns), but it does not support select count(1) from table.
> 
> The Syntax to create this type of index is
> 
> create thick index idx on dd(n1,n2)
> 
> here idx- index name and dd- table name and n1 and n2 are column names.
> 
> I have created a extra column in pg_index called indhassnapshot.
> 
> I have also enabled the display of Logical Reads. In order to see that, set
> log_statement_stats on.
> 
> The thick index is clearly on the front, if you issue queries like
> 
> select n2 from dd where n1>1000 and n2<1500;
> 
> As already said, if the update is not incurring any extra cost, except if
> the indexed columns are updated. Deletes are costly, making it ideal for
> partitioned tables.
> 
> In order to update the thick indexes, i have accessed the ps_ExprContext in
> PlanState to get the oldtuple. But if we have a outer plan and inner plan,
> then i have set the ps_ExprContext of  innerplan to the outerplan. I don't
> know whether there will be instances where the ps_ExprContext of outerplan
> node will have some use in update queries.
> 
> Right now, it passes the regression test suite. I had slight trouble with
> pg_indent, so i think it has not got applied properly. But i have tried to
> remove all the whitespace differences. Please be kind to me in case i have
> missed any whitespace differences. :)
> 
> Please review the patch and provide your comments.
> 
> Thanks,
> Gokul.
> CertoSQL Project,
> Allied Solution Groups.
> (www.alliedgroups.com)
> 
> On 10/23/07, Hannu Krosing <[EMAIL PROTECTED]> wrote:
> >
> > ?hel kenal p?eval, L, 2007-10-20 kell 10:19, kirjutas Luke Lonergan:
> > > Hi Hannu,
> > >
> > > On 10/14/07 12:58 AM, "Hannu Krosing" <[EMAIL PROTECTED]> wrote:
> > >
> > > > What has happened in reality, is that the speed difference between
> > CPU,
> > > > RAM and disk speeds has _increased_ tremendously
> > >
> > > Yes.
> > >
> > > > which makes it even
> > > > more important to _decrease_ the size of stored data if you want good
> > > > performance
> > >
> > > Or bring the cpu processing closer to the data it's using (or both).
> > >
> > > By default, the trend you mention first will continue in an unending way
> > -
> > > the consequence is that the "distance" between a processor and it's
> > target
> > > data will continue to increase ad-infinitum.
> >
> > the emergence of solid-state (flash) disks may help a little here, but
> > in general it is true.
> >
> > > By contrast, you can only decrease the data volume so much - so in the
> > end
> > > you'll be left with the same problem - the data needs to be closer to
> > the
> > > processing.  This is the essence of parallel / shared nothing
> > architecture.
> > >
> > > Note that we've done this at Greenplum.  We're also implementing a
> > DSM-like
> > > capability and are investigating a couple of different hybrid row /
> > column
> > > store approaches.
> >
> > Have you tried moving the whole visibility part of tuples out to a
> > separate heap ?
> >
> > Especially in OLAP/ETL scenarios the distribution of tuples loaded in
> > one transaction should be very good for visibility-info compression.
> >
> > I'd suspect that you could crush hundreds of pages worth of visibility
> > into single RLE encoding unit (xmin=N, xmax=no_yet, start_ctid = X,
> > end_ctid=Y), and it will stay in L1 cache most of the time you process
> > the corresponding relation. and the relation itself will be smaller, and
> > index-only (actually index-only + lookup inside L1 cache) access can
> > happen, and so on .
> >
> > OTOH, if you load it in millions of small transactions, you can run
> > VACUUM FREEZE _on_ the visibility heap only, which will make all
> > visibility infoe look similar and thus RLE-compressable and again make
> > it fit in L1 cache, if you dont have lots of failed loads interleaved
> > with successful ones.
> >
> > > Bitmap index with index-only access does provide nearly all of the
> > > advantages of a column store from a speed standpoint BTW.  Even though
> > > Vertica is touting speed advantages - our parallel engine plus bitmap
> > index
> > > will crush them in benchmarks when they show up with real code.
> > >
> > > Meanwhile they're moving on to new ideas - I kid you not "Horizontica"
> > is
> > > Dr. Stonebraker's new idea :-)
> >
> > Sounds like a result of a marketroid brainstorming session :P
> >
> > > So - bottom line - some ideas from column store make sense, but it's not
> > a
> > > cure-all.
> > >
> > > > There is also a MonetDB/X100 project, which tries to make MonetOD
>

Re: [PATCHES] Fix for win32 stat() problems

2008-04-10 Thread Magnus Hagander
Magnus Hagander wrote:
> Magnus Hagander wrote:
> > Attached is a patch that attempts to fix the issues with stat() not
> > properly updating st_size on win32, as reported in this thread:
> > http://archives.postgresql.org/pgsql-hackers/2008-03/msg01181.php
> > 
> > It has to have a chance to affect things beyond just the
> > pg_relation_size() function, so this patch fixes all cases where we
> > do stat() and use the st_size member. It doesn't change all
> > occurances of stat() since I didn't want to incur the double
> > filesystem lookups unnecessary cases.
> > 
> > Any objections?
> 
> Updated version. Seems the problem was that the includes came in the
> wrong order - figured that out just after I went to bed :) Here's an
> updated patch.
> 
> A whole lot simpler patch :-)

Appled with comment adjustment, and backpatched to 8.2 and 8.3.

//Magnus

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Fix for win32 stat() problems

2008-04-10 Thread Magnus Hagander
Tom Lane wrote:
> Magnus Hagander <[EMAIL PROTECTED]> writes:
> > A whole lot simpler patch :-)
> 
> Seems like you no longer need the !defined(_DIRMOD_C) bit here?

Correct. That wasn't the actual error, that was me misdiagnosing the
situation.

> Also please include a comment about why  has to be
> forcibly included.

Will do.


> A more general question: can't we get rid of most of the #ifdef WIN32
> cruft in include/port.h, and put it in include/port/win32.h instead?
> Seems cleaner that way, at least for things where there's just an
> #ifdef WIN32 hunk and not two cases for Win and not-Win.

Yeah, that one has been on my TODO for a while. 

We may not be able to move everything because one file is included
early in the pass of c.h and one much later, but the majority of the
stuff we have there shouldn't care about which order it goes in at.

//Magnus

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Fix for win32 stat() problems

2008-04-10 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Andrew Dunstan wrote:
>> How about #defining safe_stat to be pg_win32_safe_stat on Windows and  
>> simply stat elsewhere? Then use safe_stat at the places you consider  
>> critical.

> I would couple this with a pgwin32_unsafe_stat on Windows, which changes
> the size value to 0, so that if anyone gets it wrong it's immediately
> obvious.

It's only worth having two versions if someone can show that there's
actually going to be a performance problem from the extra syscall.
I don't believe we use stat() in any place where it's really gonna
matter much ...

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Fix for win32 stat() problems

2008-04-10 Thread Tom Lane
Magnus Hagander <[EMAIL PROTECTED]> writes:
> A whole lot simpler patch :-)

Seems like you no longer need the !defined(_DIRMOD_C) bit here?
Also please include a comment about why  has to be
forcibly included.

A more general question: can't we get rid of most of the #ifdef WIN32
cruft in include/port.h, and put it in include/port/win32.h instead?
Seems cleaner that way, at least for things where there's just an
#ifdef WIN32 hunk and not two cases for Win and not-Win.

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Fix for win32 stat() problems

2008-04-10 Thread Magnus Hagander
Magnus Hagander wrote:
> Attached is a patch that attempts to fix the issues with stat() not
> properly updating st_size on win32, as reported in this thread:
> http://archives.postgresql.org/pgsql-hackers/2008-03/msg01181.php
> 
> It has to have a chance to affect things beyond just the
> pg_relation_size() function, so this patch fixes all cases where we do
> stat() and use the st_size member. It doesn't change all occurances of
> stat() since I didn't want to incur the double filesystem lookups
> unnecessary cases.
> 
> Any objections?

Updated version. Seems the problem was that the includes came in the
wrong order - figured that out just after I went to bed :) Here's an
updated patch.

A whole lot simpler patch :-)

//MagnusIndex: include/port.h
===
RCS file: /projects/cvsroot/pgsql/src/include/port.h,v
retrieving revision 1.118
diff -c -r1.118 port.h
*** include/port.h	29 Feb 2008 15:31:33 -	1.118
--- include/port.h	10 Apr 2008 11:49:23 -
***
*** 298,303 
--- 298,310 
  #define popen(a,b) _popen(a,b)
  #define pclose(a) _pclose(a)
  
+ /* stat() is not guaranteed to update the size of the file */
+ extern int pgwin32_safestat(const char *path, struct stat *buf);
+ #if !defined(FRONTEND) && !defined(_DIRMOD_C)
+ #include 
+ #define stat(a,b) pgwin32_safestat(a,b)
+ #endif
+ 
  /* Missing rand functions */
  extern long lrand48(void);
  extern void srand48(long seed);
Index: port/dirmod.c
===
RCS file: /projects/cvsroot/pgsql/src/port/dirmod.c,v
retrieving revision 1.51
diff -c -r1.51 dirmod.c
*** port/dirmod.c	1 Jan 2008 19:46:00 -	1.51
--- port/dirmod.c	10 Apr 2008 12:06:35 -
***
*** 447,449 
--- 447,483 
  	pgfnames_cleanup(filenames);
  	return false;
  }
+ 
+ 
+ #ifdef WIN32
+ /*
+  * The stat() function in win32 is not guaranteed to update the st_size
+  * field when run. So we define our own version that uses the Win32 API
+  * to update this field.
+  */
+ #undef stat
+ int 
+ pgwin32_safestat(const char *path, struct stat *buf)
+ {
+ 	int r;
+ 	WIN32_FILE_ATTRIBUTE_DATA attr;
+ 
+ 	r = stat(path, buf);
+ 	if (r < 0)
+ 		return r;
+ 
+ 	if (!GetFileAttributesEx(path, GetFileExInfoStandard, &attr))
+ 	{
+ 		_dosmaperr(GetLastError());
+ 		return -1;
+ 	}
+ 
+ 	/*
+ 	 * XXX no support for large files here, but we don't do that in
+ 	 * general on Win32 yet.
+ 	 */
+ 	buf->st_size = attr.nFileSizeLow;
+ 
+ 	return 0;
+ }
+ #endif

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Fix for win32 stat() problems

2008-04-10 Thread Alvaro Herrera
Andrew Dunstan wrote:

> How about #defining safe_stat to be pg_win32_safe_stat on Windows and  
> simply stat elsewhere? Then use safe_stat at the places you consider  
> critical.

I would couple this with a pgwin32_unsafe_stat on Windows, which changes
the size value to 0, so that if anyone gets it wrong it's immediately
obvious.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Partial match in GIN

2008-04-10 Thread Teodor Sigaev
How about forcing the use of a bitmap index scan, and modify the indexam 
API so that GIN could a return a lossy bitmap, and let the bitmap heap 
scan do the rechecking?


Partial match might be used only for one search entry from many. In sext search 
example: 'a:* & qwertyuiop' - second lexeme has only a few matched tuples. But 
GIN itself doesn't know about semantic meaning of operation and can not 
distinguish following tsqueries:

'!a:* & qwertyuiop'
'!a:* & qwertyuiop'
'a:* & !qwertyuiop'

So, your suggestion is equivalent to mark all operation with RECHEK flag and 
OR-ing all posting lists. That will be give a lot of false match and too slow.





I don't think the storage size of tsquery matters much, so whatever 
is the best solution in terms of code readability etc.
That was about tsqueryesend/recv format? not a storage on disk. We 
don't require compatibility of binary format of db's files, but I have 
some doubts about binary dump.


We generally don't make any promises about cross-version compatibility 
of binary dumps, though it would be nice not to break it if it's not too 
much effort.


Hmm. match_special_index_operator() already checks that the index's 
opfamily is pattern_ops, or text_ops with C-locale. Are you reusing 
the same operator families for wildspeed? Doesn't it then also get 
confused if you do a "WHERE textcol > 'foo'" query by hand?

No, wildspeed use the same operator ~~
match_special_index_operator() isn't called at all: in 
match_clause_to_indexcol() function is_indexable_operator() is called 
before match_special_index_operator() and returns true.


expand_indexqual_opclause() sees that operation is a OID_TEXT_LIKE_OP 
and calls prefix_quals() which fails because it wishes only several 
Btree opfamilies.


Oh, I see. So this assumption mentioned in the comment there:

/*
 * LIKE and regex operators are not members of any index opfamily,
 * so if we find one in an indexqual list we can assume that it
 * was accepted by match_special_index_operator().
 */

is no longer true with wildspeed. So we do need to check that in 
expand_indexqual_opclause() then.


NOTICE 2: it seems to me, that similar technique could be 
implemented for ordinary BTree to eliminate hack around LIKE support.
LIKE expression. I wonder what the size and performance of that would 
be like, in comparison to the proposed GIN solution?


GIN speeds up '%foo%' too - which is impossible for btree. But I don't 
like a hack around LIKE support in BTree. This support uses outflank 
ways missing regular one.


You could satisfy '%foo%' using a regular and a reverse B-tree index, 
and a bitmap AND. Which is interestingly similar to the way you proposed 
to use a TIDBitmap within GIN.




--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] EXPLAIN progress info

2008-04-10 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Gregory Stark <[EMAIL PROTECTED]> writes:
>> I think a better way to get a real "percentage done" would be to add a method
>> to each node which estimates its percentage done based on the percentage done
>> its children report and its actual and expected rows and its costs.
>
> You can spend a week inventing some complicated method, and the patch
> will be rejected because it adds too much overhead.  Anything we do here
> has to be cheap enough that no one will object to having it turned on
> all the time --- else it'll be useless exactly when they need it.

Actually Dave made a brilliant observation about this when I described it.
Most nodes can actually estimate their progress without any profiling overhead
at all. In fact they can do so more accurately than using the estimated rows.

Sequential scans, for example, can base a report on the actual block they're
on versus the previously measured end of the file. Bitmap heap scans can
report based on the number of blocks queued up to read.

Index scans are the obvious screw case. I think they would have to have a
counter that they increment on every tuple returned and reset to zero when
restarted. I can't imagine that's really a noticeable overhead though. Limit
and sort would also be a bit tricky.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches