[HACKERS] Support for JDBC setQueryTimeout, et al.

2010-10-10 Thread David Fetter
Folks,

Please find enclosed a WIP patch from one of my co-workers intended to
support JDBC's setQueryTimeout, along with the patch for JDBC that
uses it.

I think this is an especially handy capability, and goes to the number
one TODO on the JDBC compliance list.

http://jdbc.postgresql.org/todo.html

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index e4a7dd9..06c8fce 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -51,6 +51,8 @@
 /* GUC variables */
 intDeadlockTimeout = 1000;
 intStatementTimeout = 0;
+int SessionTimerTarget = 0;
+
 bool   log_lock_waits = false;
 
 /* Pointer to this process's PGPROC struct, if any */
@@ -1550,6 +1552,10 @@ CheckStatementTimeout(void)
/* Time to die */
statement_timeout_active = false;
cancel_from_timeout = true;
+
+/* reset session timer.  Never fire twice. */
+set_session_timer_ms(0);
+
 #ifdef HAVE_SETSID
/* try to signal whole process group */
kill(-MyProcPid, SIGINT);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index cba90a9..769ac2c 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2445,28 +2445,43 @@ exec_describe_portal_message(const char *portal_name)
pq_putemptymessage('n');/* NoData */
 }
 
-
 /*
  * Convenience routines for starting/committing a single command.
  */
 static void
 start_xact_command(void)
 {
+int64 timeout = 0;
+
if (!xact_started)
{
-   ereport(DEBUG3,
-   (errmsg_internal("StartTransactionCommand")));
+   ereport(DEBUG3, (errmsg_internal("StartTransactionCommand")));
StartTransactionCommand();
 
-   /* Set statement timeout running, if any */
-   /* NB: this mustn't be enabled until we are within an xact */
-   if (StatementTimeout > 0)
-   enable_sig_alarm(StatementTimeout, true);
-   else
-   cancel_from_timeout = false;
+if (timeout == 0 || 
+  (StatementTimeout > 0 && timeout > StatementTimeout)) {
+   timeout = StatementTimeout;
+}
+
+/* Set statement timeout running, if any */
+/* NB: this mustn't be enabled until we are within an xact */
+if (StatementTimeout > 0)
+   enable_sig_alarm(StatementTimeout, true);
+else
+   cancel_from_timeout = false;
 
xact_started = true;
-   }
+}
+
+timeout = get_session_timer_ms(); 
+if (timeout > 0) {
+   if (StatementTimeout == 0 || timeout < StatementTimeout) {
+  ereport(DEBUG3, (errmsg_internal("Enable an once session 
timer")));
+  enable_sig_alarm(timeout, true);
+   } else {
+  cancel_from_timeout = false;
+   }
+}
 }
 
 static void
diff --git a/src/backend/utils/adt/timestamp.c 
b/src/backend/utils/adt/timestamp.c
index c6e1d13..d998df6 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -4824,3 +4824,50 @@ generate_series_timestamptz(PG_FUNCTION_ARGS)
SRF_RETURN_DONE(funcctx);
}
 }
+
+/* Session timer target, in ms */
+static int64 sessionTimerTarget; 
+
+Datum set_session_timer(PG_FUNCTION_ARGS)
+{
+   int64 ms = PG_GETARG_INT64(0);
+   set_session_timer_ms(ms);
+   PG_RETURN_DATUM(0);
+}
+   
+#ifndef HAVE_INT64_TIMESTAMP
+#error Assumes int64 impl of timestamp
+#endif
+
+void set_session_timer_ms(int64 ms)
+{
+   if (ms == 0) {
+  sessionTimerTarget = 0;
+   } else {
+  /* GetCurrentTimestamp is in us */
+  int64 time_now = GetCurrentTimestamp() / 1000;
+  sessionTimerTarget = time_now + ms;
+   }
+}
+
+int64 get_session_timer_ms()
+{
+   int64 ret = 0;
+   if (sessionTimerTarget != 0) {
+  int64 time_now = GetCurrentTimestamp() / 1000;
+  ret = sessionTimerTarget - time_now;
+
+  if (ret <= 0) {
+ /* Timer already passed.  This maybe possible if some statement 
+  * set the timer, finished so the timer is disabled, next statement 
+  * set the timer again, but too late.  We want to go through the
+  * same routine and fire the timer.  Return 1 ms.
+  */ 
+ ret = 1;
+  }
+
+  sessionTimerTarget = 0;
+   }
+
+   return ret;
+}
diff --git a/src/include/catalog/c

Re: [HACKERS] patch: psql variables tabcomplete

2010-10-10 Thread Pavel Stehule
2010/10/11 Tom Lane :
> Pavel Stehule  writes:
>> 2010/10/4 Itagaki Takahiro :
>>> We don't have commands for display a list of such variables and \echo is
>>> not tab-completed even with the patch. "Only supported by \set" might be
>>> a bit unbalanced.
>
>> it's good idea. I looked on it - and it is job for other patch. Some
>> test are in experimental patch. But this needs more work - output is
>> little bit ugly - I thing so prefix and suffix should not be showed.
>
> I went ahead and applied this (with some cleanup).  I don't see a very
> good reason why the prefix/suffix shouldn't be shown in the completion
> data --- after all, that *is* what it's going to type for you.  In any
> case, preventing that would take some fundamental hacking of the
> readline interface; which would be way more work than it's worth,
> and probably not very portable across the different versions of
> readline either.  So I think we should just be happy with this
> behavior.

I write it before I looked to readline documentation - personally I
don't feel well from output, but the nice output isn't available with
current readline lib :( - so I agree with you. Thank you very much for
commit

Regards

Pavel

>
>                        regards, tom lane
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Which file does the SELECT?

2010-10-10 Thread David Christensen

On Oct 10, 2010, at 12:21 PM, Vaibhav Kaushal wrote:

> Thanks to both hitoshi and tom for your replies. 
> 
> I think I need to look into the Postgres code itself (I am better at code 
> than documentation). But since I have not been touch with C lately (these 
> days I am programming on PHP) I think I have forgot a few rules of game 
> (afterall PHP is so much more easy than C :P ). Moreover,  postgres is the 
> first Open Source software whose code I am interested in. I have never looked 
> into other OSS codes much except correcting a few compilation errors here and 
> there on beta / alpha releases. 
> 
> I have had the chance and success to compile my own Linux OS and it was fun 
> to do so... but I guess development is a tougher job. With an idea in mind, 
> and a thankful feeling towards postgres is what drives me to do this tougher 
> job.
> 
> When I was designing my database for a web app, I found so many problems in 
> MySQL that I could not continue (the best of all, I can't use the commands 
> written in my DB book to create a foreign key, it does not natively support 
> foreign keys, confusing storage engines and so on).. and then I got postgres 
> which I am a fan of.
> 
> I hope I will not be flamed when I will ask those questions (some of them are 
> actually very silly ones). 
> 
>  I will look inside the code now and will get back after i get some progress 
> with it.
> 
> However, I find too many references to the Data structure "datum" what is it 
> and where is it defined? Can someone tell me please? Also, what role does it 
> play?
> 
> Thanks to you all for your replies.
> 
> -Vaibhav

Depending on your text editor, you may be able to utilize TAGS files; see 
src/tools/make_(e|c)tags for creating TAGS files for your editor of choice 
(emacs/vim, although other editors may support specific formats).  This will 
allow you to navigate to the specific definition of the type/function/macro, 
and can be very enlightening and help answer some of these questions.  `git 
grep` will also come in handy if you're working directly from a git checkout.

Regards,

David
--
David Christensen
End Point Corporation
da...@endpoint.com





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] $libdir under linux

2010-10-10 Thread Tom Lane
Marios Vodas  writes:
> I want to create this function:
> CREATE OR REPLACE FUNCTION myfunction(cstring)
> RETURNS cstring AS
> '$libdir/mylib','myfunction'
> LANGUAGE 'C' IMMUTABLE STRICT;

> In windows this is working fine and $libdir is substituted by the actual
> path. In linux it is not substituted!

I rather doubt that, considering that it works fine for everybody else.

> This is the error I get:
> ERROR:  could not access file "$libdir/mylib": No such file or directory

So is mylib.so actually there in the library directory?  (Try
"pg_config --pkglibdir" to confirm which directory that is.)
If so, try ldd on it --- maybe the problem is with some library
it references, not mylib.so itself.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: psql variables tabcomplete

2010-10-10 Thread Tom Lane
Pavel Stehule  writes:
> 2010/10/4 Itagaki Takahiro :
>> We don't have commands for display a list of such variables and \echo is
>> not tab-completed even with the patch. "Only supported by \set" might be
>> a bit unbalanced.

> it's good idea. I looked on it - and it is job for other patch. Some
> test are in experimental patch. But this needs more work - output is
> little bit ugly - I thing so prefix and suffix should not be showed.

I went ahead and applied this (with some cleanup).  I don't see a very
good reason why the prefix/suffix shouldn't be shown in the completion
data --- after all, that *is* what it's going to type for you.  In any
case, preventing that would take some fundamental hacking of the
readline interface; which would be way more work than it's worth,
and probably not very portable across the different versions of
readline either.  So I think we should just be happy with this
behavior.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] $libdir under linux

2010-10-10 Thread Marios Vodas
I want to create this function:
CREATE OR REPLACE FUNCTION myfunction(cstring)
RETURNS cstring AS
'$libdir/mylib','myfunction'
LANGUAGE 'C' IMMUTABLE STRICT;

In windows this is working fine and $libdir is substituted by the actual
path. In linux it is not substituted!
This is the error I get:
ERROR:  could not access file "$libdir/mylib": No such file or directory
SQL state: 58P01

How do I fix it?


Re: [HACKERS] Debugging initdb breakage

2010-10-10 Thread Tom Lane
Dimitri Fontaine  writes:
> How to have gdb help me? What's my error, that I guess is obvious?

Might have something to do with using the same OID for the catalog
and its index ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] .gitignore files, take two

2010-10-10 Thread Robert Haas
On Oct 10, 2010, at 12:57 AM, Gurjeet Singh  wrote:
> On Tue, Sep 21, 2010 at 12:55 AM, Robert Haas  wrote:
> All the build products in a normal build.  One of the infelicities of
> git is that 'git status' shows the untracked files at the bottom.  So
> if you have lots of unignored stuff floating around, the information
> about which files you've actually changed or added to the index
> scrolls right off the screen.
> 
> I have just started to read the thread, so don't know if somebody already 
> chimed in.
> 
> You can avoid the above situation by adding the folowing to ~/.gitconfig or 
> .git/config
> 
> [pager]
> status = true
> 
> I think I used `git config` command for this, but adding by hand would also 
> do.

Oh, dude.  Awesome.

...Robert

Re: [HACKERS] Debugging initdb breakage

2010-10-10 Thread Heikki Linnakangas

On 10.10.2010 23:38, Dimitri Fontaine wrote:

So I'm beginning to work on the extension support for dump and restore,
and that begins with a new pg_extension catalog. I managed to break
initdb already, of course, but I'm fighting my way out — no luck with
gdb, it won't catch the Assert failure and show me a backtrace. I tried
"set follow-fork-mode child" in gdb, in different systems, to no avail.

Please find attached the detailed initdb.log and the very few items I
needed to obtain:

TRAP: FailedAssertion("!(reln->md_fd[forkNum] == ((void *)0))", File: "md.c", 
Line: 254)
child process was terminated by signal 6: Abort trap

How to have gdb help me? What's my error, that I guess is obvious? Where
do I read more now in order not to need too much assistance after that?


At least on my system, assertion failure creates a core dump that you 
can load in gdb. Make sure you use "ulimit -c unlimited" or similar.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wip: functions median and percentile

2010-10-10 Thread Tom Lane
Dean Rasheed  writes:
> In the meantime, the attached variation of the patch fixes the temp
> file issue and will support all 3 cases. It gives OK performance for
> (1) and (2), and poor performance for (3). That could be viewed as a
> future development task, which perhaps the window function API would
> help with. I think it would be a shame to drop support for (2) just
> because we can't do (3) efficiently yet.

I started looking at this patch, and noticed that we got so caught up
in implementation issues that we forgot the unresolved problem of data
types.  The original patch defined median(anyelement), which is only
well-defined for an odd number of inputs; for an even number of inputs
you have to take the left or right item in the central pair.  I see
that this version defines
median(float8) returns float8
median(float4) returns float8
median(numeric) returns numeric
median(int8) returns numeric
median(int4) returns numeric
median(int2) returns numeric
which strikes me as possibly being overkill.

It was pointed out upthread that while median isn't presently
in the standard, Oracle defines it in terms of percentile_cont(0.5)
which *is* in the standard.  What I read in SQL:2008 is that
percentile_cont is defined for all numeric types (returning
approximate numeric with implementation-defined precision),
and for interval (returning interval), and not for any other
input type.  So it appears to me that what we ought to support
is
median(float8) returns float8
median(interval) returns interval
and nothing else --- we can rely on implicit casting to convert
any other numeric input type to float8.

BTW, as far as the implementation issues go, telling tuplesort that it
can use gigabytes of memory no matter what seems quite unacceptable.
Put this thing into a hash aggregation and you'll blow out your memory
in no time.  I don't think it's even a good idea to use work_mem there.
I wonder whether it'd be a good idea to augment AggCheckCallContext()
so that there's a way for aggregates to find out how much memory they
ought to try to use.  In a simple aggregation situation it's probably
OK to use work_mem, but in a hash aggregation you'd better use less
--- perhaps work_mem divided by the number of groups expected.

Also, I believe that the lack-of-cleanup problem for tuplesorts spilling
to disk should be fixable by using an exprcontext shutdown callback (see
RegisterExprContextCallback).

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Which file does the SELECT?

2010-10-10 Thread Martijn van Oosterhout
On Sun, Oct 10, 2010 at 10:51:54PM +0530, Vaibhav Kaushal wrote:
> However, I find too many references to the Data structure "datum" what is it
> and where is it defined? Can someone tell me please? Also, what role does it
> play?

"Datum" is the singular form of "data". It refers to a single item of
any type. So it may be an integer, text type, geometry type, anything.
A row is a list of datums. (A datum can also be a composite type).

Normally in the planner code you don't need to worry to much about what
it explicitly refers to, but if you do, you need to know the type of a
datum before you can manipulate it. The type is not stored inside the
datum.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first. 
>   - Charles de Gaulle


signature.asc
Description: Digital signature


Re: [HACKERS] WIP: Triggers on VIEWs

2010-10-10 Thread Tom Lane
Bernd Helmle  writes:
> --On 8. September 2010 09:00:33 +0100 Dean Rasheed 
>  wrote:
>> Here's an updated version with improved formatting and a few minor
>> wording changes to the triggers chapter.

> This version doesn't apply clean anymore due to some rejects in 
> plainmain.c. Corrected version attached.

Applied with revisions.  A couple of points worth remarking:

* I took out this change in planmain.c:
  
+   /*
+* If the query target is a VIEW, it won't be in the jointree, but we
+* need a dummy RelOptInfo node for it. This need not have any stats in
+* it because it always just goes at the top of the plan tree.
+*/
+   if (parse->resultRelation &&
+   root->simple_rel_array[parse->resultRelation] == NULL)
+   build_simple_rel(root, parse->resultRelation, 
RELOPT_OTHER_MEMBER_REL);

AFAICT that's just dead code: the only reason to build such a rel would
be if there were Vars referencing it in the main part of the plan tree.
But there aren't.  Perhaps this was left over from some early iteration
of the patch before you had the Var numbering done right?  Do you know
of any cases where it's still needed?

* I also took out the changes in preprocess_targetlist() that tried to
prevent equivalent wholerow vars from getting entered in the targetlist.
This would not work as intended since the executor has specific
expectations for the names of those junk TLEs; it'd fail if it ever
actually tried to do an EvalPlanQual recheck that needed those TLEs.
Now I believe that an EPQ recheck is impossible so far as the update or
delete itself is concerned, when the target is a view.  So if you were
really concerned about the extra vars, the non-kluge route to a solution
would be to avoid generating RowMarks in the first place.  You'd have to
think a bit about the possibility of SELECT FOR UPDATE in sub-selects
though; the query as a whole might need some rowmarks even if the top
level Modify node doesn't.  On the whole I couldn't get excited about
this issue, so I just left it alone.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Which file does the SELECT?

2010-10-10 Thread Vaibhav Kaushal
Thanks to both hitoshi and tom for your replies.

I think I need to look into the Postgres code itself (I am better at code
than documentation). But since I have not been touch with C lately (these
days I am programming on PHP) I think I have forgot a few rules of game
(afterall PHP is so much more easy than C :P ). Moreover,  postgres is the
first Open Source software whose code I am interested in. I have never
looked into other OSS codes much except correcting a few compilation errors
here and there on beta / alpha releases.

I have had the chance and success to compile my own Linux OS and it was fun
to do so... but I guess development is a tougher job. With an idea in mind,
and a thankful feeling towards postgres is what drives me to do this tougher
job.

When I was designing my database for a web app, I found so many problems in
MySQL that I could not continue (the best of all, I can't use the commands
written in my DB book to create a foreign key, it does not natively support
foreign keys, confusing storage engines and so on).. and then I got postgres
which I am a fan of.

I hope I will not be flamed when I will ask those questions (some of them
are actually very silly ones).

 I will look inside the code now and will get back after i get some progress
with it.

However, I find too many references to the Data structure "datum" what is it
and where is it defined? Can someone tell me please? Also, what role does it
play?

Thanks to you all for your replies.

-Vaibhav



On Sun, Oct 10, 2010 at 9:29 PM, Tom Lane  wrote:

> Vaibhav Kaushal  writes:
> > Can someone tell me what are 'Join Pairs with no Join clause' ? I am not
> > able to figure that out!
>
> Consider
>
>select * from t1, t2, t3 where t1.a = t2.x and t1.b = t3.y;
>
> In theory this query could be done by first joining t2 and t3, then
> joining that to t1.  But the planner won't investigate the possibility
> because the t2/t3 join would have to be a cartesian product join:
> there's no WHERE clause relating them.
>
> On the other hand, if we have
>
>select * from t1, t2, t3 where t1.a = t2.x and t1.a = t3.y;
>
> then the planner is able to infer the additional join clause t2.x =
> t3.y, so it will consider that join sequence.
>
>regards, tom lane
>


Re: [HACKERS] Which file does the SELECT?

2010-10-10 Thread Tom Lane
Vaibhav Kaushal  writes:
> Can someone tell me what are 'Join Pairs with no Join clause' ? I am not
> able to figure that out!

Consider

select * from t1, t2, t3 where t1.a = t2.x and t1.b = t3.y;

In theory this query could be done by first joining t2 and t3, then
joining that to t1.  But the planner won't investigate the possibility
because the t2/t3 join would have to be a cartesian product join:
there's no WHERE clause relating them.

On the other hand, if we have

select * from t1, t2, t3 where t1.a = t2.x and t1.a = t3.y;

then the planner is able to infer the additional join clause t2.x =
t3.y, so it will consider that join sequence.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Which file does the SELECT?

2010-10-10 Thread Vaibhav Kaushal
The PostgreSQL documentation (9.0.1) has the following section in section
44.5.1:

 The planner preferentially considers joins between any two relations for
which
there exist a corresponding join clause in the WHERE qualification (i.e., for
which a restriction like
where rel1.attr1=rel2.attr2 exists). *Join pairs with no join clause are
considered only when*
there is no other choice, that is, a particular relation has no available
join clauses to any other relation.
All possible plans are generated for every join pair considered by the
planner, and the one that is
(estimated to be) the cheapest is chosen.

Can someone tell me what are 'Join Pairs with no Join clause' ? I am not
able to figure that out!

-Vaibhav (*_*)

On Sun, Oct 10, 2010 at 1:58 PM, Vaibhav Kaushal <
vaibhavkaushal...@gmail.com> wrote:

> Thanks for the reply.
>
> So if I am not wrong, I will have to understand the whole querying process
> in detail? If it is so, then where do I start from?
>
> -Vaibhav
>
>
> On Sun, Oct 10, 2010 at 1:41 PM, Peter Eisentraut  wrote:
>
>> On sön, 2010-10-10 at 13:32 +0530, Vaibhav Kaushal wrote:
>> > I have gone through the source code a bit but I wanted to know that
>> > which file contains the code that performs the final SLECTION after
>> > the optimizer has created the final plan? I mean which part of the
>> > executor is responsible for the SELCT to be run?
>>
>> That depends on what plan was chosen for the SELECT, since the executor
>> is primarily organized by plan node type, independent of which statement
>> caused the plan to be generated.
>>
>>
>


Re: [HACKERS] levenshtein_less_equal (was: multibyte charater set in levenshtein function)

2010-10-10 Thread Alexander Korotkov
Sorry, I'm pretty *unconversant in git. Now, it should be ok.*


With best regards,
Alexander Korotkov.
*** a/contrib/fuzzystrmatch/fuzzystrmatch.c
--- b/contrib/fuzzystrmatch/fuzzystrmatch.c
***
*** 61,66  PG_MODULE_MAGIC;
--- 61,68 
   */
  extern Datum levenshtein_with_costs(PG_FUNCTION_ARGS);
  extern Datum levenshtein(PG_FUNCTION_ARGS);
+ extern Datum levenshtein_less_equal_with_costs(PG_FUNCTION_ARGS);
+ extern Datum levenshtein_less_equal(PG_FUNCTION_ARGS);
  extern Datum metaphone(PG_FUNCTION_ARGS);
  extern Datum soundex(PG_FUNCTION_ARGS);
  extern Datum difference(PG_FUNCTION_ARGS);
***
*** 92,98  soundex_code(char letter)
  #define MAX_LEVENSHTEIN_STRLEN		255
  
  static int levenshtein_internal(text *s, text *t,
! 	 int ins_c, int del_c, int sub_c);
  
  
  /*
--- 94,100 
  #define MAX_LEVENSHTEIN_STRLEN		255
  
  static int levenshtein_internal(text *s, text *t,
! 	 int ins_c, int del_c, int sub_c, int max_d);
  
  
  /*
***
*** 202,211  rest_of_char_same(const char *s1, const char *s2, int len)
   *		  between supplied strings. Generally
   *		  (1, 1, 1) penalty costs suffices common
   *		  cases, but your mileage may vary.
   */
  static int
  levenshtein_internal(text *s, text *t,
! 	 int ins_c, int del_c, int sub_c)
  {
  	int			m,
  n,
--- 204,217 
   *		  between supplied strings. Generally
   *		  (1, 1, 1) penalty costs suffices common
   *		  cases, but your mileage may vary.
+  *		  Returns accurate value if max_d < 0 or
+  *		  actual distance is less or equal than
+  *		  max_d, otherwise returns value greater
+  *		  than max_d.
   */
  static int
  levenshtein_internal(text *s, text *t,
! 	 int ins_c, int del_c, int sub_c, int max_d)
  {
  	int			m,
  n,
***
*** 219,224  levenshtein_internal(text *s, text *t,
--- 225,233 
  	const char *s_data;
  	const char *t_data;
  	const char *y;
+ 	const char *prev_x = NULL;
+ 	intmin_i = 0, max_i = 0, d;
+ 	intdelta = 0, min_d = 0, theor_max_d;
  
  	/* Extract a pointer to the actual character data. */
  	s_data = VARDATA_ANY(s);
***
*** 240,245  levenshtein_internal(text *s, text *t,
--- 249,268 
  		return m * del_c;
  
  	/*
+ 	 * There is theoretical maximum distance based of string lengths. It
+ 	 * represents the case, when no characters are matching. If max_d
+ 	 * reaches this value than we can use accurate calculation of distance
+ 	 * which is faster in this case.
+ 	 */
+ 	if (max_d >= 0)
+ 	{
+ 		theor_max_d = Min(m*del_c + n*ins_c, (m > n) ?
+ 			(n * sub_c + (m - n) * del_c):(m * sub_c + (n - m) * ins_c)) - 1;
+ 		if (max_d >= theor_max_d)
+ 			max_d = -1;
+ 	}
+ 
+ 	/*
  	 * For security concerns, restrict excessive CPU+RAM usage. (This
  	 * implementation uses O(m) memory and has O(mn) complexity.)
  	 */
***
*** 251,256  levenshtein_internal(text *s, text *t,
--- 274,296 
  		MAX_LEVENSHTEIN_STRLEN)));
  
  	/*
+ 	 * We can find the minimal distance by the difference of string lengths.
+ 	 */
+ 	if (max_d >= 0)
+ 	{
+ 		delta = m - n;
+ 		if (delta > 0)
+ 			min_d = delta * del_c;
+ 		else if (delta < 0)
+ 			min_d = - delta * ins_c;
+ 		else
+ 			min_d = 0;
+ 
+ 		if (min_d > max_d)
+ 			return max_d + 1;
+ 	}
+ 
+ 	/*
  	 * In order to avoid calling pg_mblen() repeatedly on each character in s,
  	 * we cache all the lengths before starting the main loop -- but if all the
  	 * characters in both strings are single byte, then we skip this and use
***
*** 298,309  levenshtein_internal(text *s, text *t,
  	 */
  	for (i = 0; i < m; i++)
  		prev[i] = i * del_c;
  
  	/* Loop through rows of the notional array */
  	for (y = t_data, j = 1; j < n; j++)
  	{
  		int		   *temp;
- 		const char *x = s_data;
  		int			y_char_len = n != t_bytes + 1 ? pg_mblen(y) : 1;
  
  		/*
--- 338,372 
  	 */
  	for (i = 0; i < m; i++)
  		prev[i] = i * del_c;
+ 	/*
+ 	 * If we have limitation of max_d, than we'll maintain [min_i; max_i]
+ 	 * interval, which bound cells, where sum of cell value and smallest
+ 	 * possible residual cost is less or equal to max_d (we don't include
+ 	 * 0 index into this interval). Residual cost represent cost of insertions
+ 	 * or deletions for moving to diagonal, which containing bottom right cell.
+ 	 * The sum value saves important property of original matrix, that this
+ 	 * sum for cell always greater or equal than such sums for cells, which are
+ 	 * used for it's calculation. That's why this sum can be used for bound
+ 	 * interval.
+ 	 */
+ 	if (max_d >= 0)
+ 	{
+ 		min_i = 1;
+ 		max_i = 1;
+ 		while (max_i < m && prev[max_i] +
+ 			((delta - max_i > 0) ? (delta - max_i) * del_c :
+ 			(-delta + max_i) * ins_c) <= max_d)
+ 		{
+ 			max_i++;
+ 		}
+ 		max_i--;
+ 		prev_x = s_data;
+ 	}
  
  	/* Loop through rows of the notional array */
  	for (y = t_data, j = 1; j < n; j++

Re: [HACKERS] Which file does the SELECT?

2010-10-10 Thread Tom Lane
Vaibhav Kaushal  writes:
> So if I am not wrong, I will have to understand the whole querying process
> in detail? If it is so, then where do I start from?

If you haven't seen it already, this is a good place to start:
http://developer.postgresql.org/pgdocs/postgres/overview.html

There's also some stuff in src/tools/backend/ that tries to summarize
which subdirectories of the source tree do what.

Also, many subsystems have README files in the source tree.
It sounds like you might like to look at src/backend/executor/README.

But in the end there's no substitute for reading source code...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Which file does the SELECT?

2010-10-10 Thread Hitoshi Harada
2010/10/10 Vaibhav Kaushal :
> Thanks for the reply.
> So if I am not wrong, I will have to understand the whole querying process
> in detail? If it is so, then where do I start from?
> -Vaibhav
>
> On Sun, Oct 10, 2010 at 1:41 PM, Peter Eisentraut  wrote:
>>
>> On sön, 2010-10-10 at 13:32 +0530, Vaibhav Kaushal wrote:
>> > I have gone through the source code a bit but I wanted to know that
>> > which file contains the code that performs the final SLECTION after
>> > the optimizer has created the final plan? I mean which part of the
>> > executor is responsible for the SELCT to be run?
>>
>> That depends on what plan was chosen for the SELECT, since the executor
>> is primarily organized by plan node type, independent of which statement
>> caused the plan to be generated.
>>
> So if I am not wrong, I will have to understand the whole querying process
> in detail? If it is so, then where do I start from?
>

And it depends on what you are interested in. If the executor behavior
is your interest, see execMain.c, but your question looks more
interested in nodeSeqscan.c which scans rows from a relation.

Regards,



-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Fix snapshot taking inconsistencies

2010-10-10 Thread Marko Tiikkaja

On 2010-10-07 5:21 AM +0300, Steve Singer wrote:

Since no one else has proposed a better idea and the commit fest is ticking
away I think you should go ahead and do that.


Here's a new version of the patch, deprecating pg_parse_and_rewrite.

I duplicated the parse/rewrite logic in the two places where 
pg_parse_and_rewrite is currently used, per comment from Tom.



Regards,
Marko Tiikkaja
*** a/src/backend/catalog/pg_proc.c
--- b/src/backend/catalog/pg_proc.c
***
*** 755,761  fmgr_sql_validator(PG_FUNCTION_ARGS)
--- 755,763 
Oid funcoid = PG_GETARG_OID(0);
HeapTuple   tuple;
Form_pg_proc proc;
+   List   *raw_parsetree_list;
List   *querytree_list;
+   ListCell   *list_item;
boolisnull;
Datum   tmp;
char   *prosrc;
***
*** 828,836  fmgr_sql_validator(PG_FUNCTION_ARGS)
 */
if (!haspolyarg)
{
!   querytree_list = pg_parse_and_rewrite(prosrc,
!   
  proc->proargtypes.values,
!   
  proc->pronargs);
(void) check_sql_fn_retval(funcoid, proc->prorettype,
   
querytree_list,
   
NULL, NULL);
--- 830,854 
 */
if (!haspolyarg)
{
!   /*
!* Parse and rewrite the queries in the function text.
!*
!* Even though check_sql_fn_retval is only interested 
in the last
!* query, we analyze all of them here to check for any 
errors.
!*/
!   raw_parsetree_list = pg_parse_query(prosrc);
!   
!   querytree_list = NIL;
!   foreach(list_item, raw_parsetree_list)
!   {
!   Node *parsetree = (Node *) lfirst(list_item);
! 
!   querytree_list = 
pg_analyze_and_rewrite(parsetree, prosrc,
!   
proc->proargtypes.values, proc->pronargs);
!   }
! 
!   Assert(querytree_list != NIL);
! 
(void) check_sql_fn_retval(funcoid, proc->prorettype,
   
querytree_list,
   
NULL, NULL);
*** a/src/backend/executor/functions.c
--- b/src/backend/executor/functions.c
***
*** 90,107  typedef struct
ParamListInfo paramLI;  /* Param list representing current args 
*/
  
Tuplestorestate *tstore;/* where we accumulate result tuples */
  
JunkFilter *junkFilter; /* will be NULL if function returns 
VOID */
  
!   /* head of linked list of execution_state records */
!   execution_state *func_state;
  } SQLFunctionCache;
  
  typedef SQLFunctionCache *SQLFunctionCachePtr;
  
  
  /* non-export function prototypes */
! static execution_state *init_execution_state(List *queryTree_list,
 SQLFunctionCachePtr fcache,
 bool lazyEvalOK);
  static void init_sql_fcache(FmgrInfo *finfo, bool lazyEvalOK);
--- 90,107 
ParamListInfo paramLI;  /* Param list representing current args 
*/
  
Tuplestorestate *tstore;/* where we accumulate result tuples */
+   Snapshotsnapshot;
  
JunkFilter *junkFilter; /* will be NULL if function returns 
VOID */
  
!   List *func_state;
  } SQLFunctionCache;
  
  typedef SQLFunctionCache *SQLFunctionCachePtr;
  
  
  /* non-export function prototypes */
! static List *init_execution_state(List *queryTree_list,
 SQLFunctionCachePtr fcache,
 bool lazyEvalOK);
  static void init_sql_fcache(FmgrInfo *finfo, bool lazyEvalOK);
***
*** 123,183  static void sqlfunction_destroy(DestReceiver *self);
  
  
  /* Set up the list of per-query execution_state records for a SQL function */
! static execution_state *
  init_execution_state(List *queryTree_list,
 SQLFunctionCachePtr fcache,
 bool lazyEvalOK)
  {
!   execution_state *firstes = NULL;
!   execution_state *preves = NULL;
execution_state *lasttages = NULL;
!   ListCell   *qt

Re: [HACKERS] Which file does the SELECT?

2010-10-10 Thread Vaibhav Kaushal
Thanks for the reply.

So if I am not wrong, I will have to understand the whole querying process
in detail? If it is so, then where do I start from?

-Vaibhav

On Sun, Oct 10, 2010 at 1:41 PM, Peter Eisentraut  wrote:

> On sön, 2010-10-10 at 13:32 +0530, Vaibhav Kaushal wrote:
> > I have gone through the source code a bit but I wanted to know that
> > which file contains the code that performs the final SLECTION after
> > the optimizer has created the final plan? I mean which part of the
> > executor is responsible for the SELCT to be run?
>
> That depends on what plan was chosen for the SELECT, since the executor
> is primarily organized by plan node type, independent of which statement
> caused the plan to be generated.
>
>


Re: [HACKERS] Which file does the SELECT?

2010-10-10 Thread Peter Eisentraut
On sön, 2010-10-10 at 13:32 +0530, Vaibhav Kaushal wrote:
> I have gone through the source code a bit but I wanted to know that
> which file contains the code that performs the final SLECTION after
> the optimizer has created the final plan? I mean which part of the
> executor is responsible for the SELCT to be run? 

That depends on what plan was chosen for the SELECT, since the executor
is primarily organized by plan node type, independent of which statement
caused the plan to be generated.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Which file does the SELECT?

2010-10-10 Thread Vaibhav Kaushal
I have gone through the source code a bit but I wanted to know that which
file contains the code that performs the final SLECTION after the optimizer
has created the final plan? I mean which part of the executor is responsible
for the SELCT to be run?

Can someone tell me the file which governs it?

Thanks for any help in advance.