[HACKERS] dot to be considered as a word delimiter?
Currently it seems like that dot is not considered as a word delimiter by the english parser. lawdb=# select to_tsvector('english', 'Mr.J.Sai Deepak'); to_tsvector - 'deepak':2 'mr.j.sai':1 (1 row) So the word obtained is mr.j.sai rather than three words mr, j, sai It does it correctly if there is space in between, as space is definitely a word delimiter. lawdb=# select to_tsvector('english', 'Mr. J. Sai Deepak'); to_tsvector - 'j':2 'mr':1 'sai':3 'deepak':4 (1 row) I think that dot should be considered by as a word delimiter because when dot is not followed by a space, most of the time it is an error in typing. Beside they are not many valid english words that have dot in between. -Sushant. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] search_path improvements WAS: search_path vs extensions
On May 29, 2009, at 5:16 PM, Greg Stark wrote: On Fri, May 29, 2009 at 11:03 PM, David E. Wheeler da...@kineticode.com wrote: On May 29, 2009, at 2:52 PM, Josh Berkus wrote: a) the ability to push a schema onto the current search path b) the ability to pull a schema off the current search path push, pop, shift, unshift. :-) Come to think of it, I want these for arrays, too. ;-) push and unshift sure -- and you do have those for arrays, it's spelled ||. Well, no, not quite, as push, pop, shift, and unshift change arrays in place, whereas || (and array_concat() and array_append() and array_prepend()) create and return a new array. I'm not so sure about pop/shift though. How would you know the element you want is at the beginning/end unless you just put it there? Well, for arrays, I might use them in a function just as I currently use them for various things in Perl. For search_path, yeah, I'd use them for lexical scoping: unshift a path onto the search path at the beginning of the function and shift it off at the end. Or push it onto the end of the search path and pop it off at the end. I think what you really want is to use SET LOCAL or RESET to restore it to whatever it was before you started futzing with it. We might need a more flexible way to do that that isn't tied to transactions though. Well, lexical scoping for the changes would help, for sure. And I think that RESET is fine, but it would be more useful to have ways to push and unshift, or even to splice (sometimes I might want to remove a schema from the search path no matter where it appears in the path). Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
Bruce Momjian wrote: Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: It would be nice to have pg_migrator handle this, especially if we could do it in parallel. Then we just have to warn users that migrating a database with tsvector columns takes significantly longer. That is, 1) do rest of catalog swap and link/copy of objects. 2) mark all tsvector columns as 83_tsvector and add new tsvector type (these columns will be unusable for queries) 3) bring up database 4) search for all 83_tsvector columns 5) do ALTER TABLE on each of these columns, in parallel, up to a configuration setting (default 3). pg_migrator is already emitting a script that is intended to be run after conversion, to handle REINDEXing of incompatible indexes. That could easily be made to do ALTER TYPE on old tsvector columns too, no? Hmmm. Well, the problem right now with v8_3_tsvector is that it fails on index create, even when the index is empty, because I can't figure out how to simply set up the proper index catalog entries. Once that is fixed and I can bind tsvector to v8_3_tsvector on schema creation, I can easily emit ALTER TABLE to fix the issue. And, at that point the tsvector GIN indexes would be automatically created so I can skip that part. I have discovered a simpler solution using ALTER TABLE and calling a conversion function: test= CREATE TABLE tsvector_test(x tsvector); CREATE TABLE test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector test- USING conversion_func(x); ALTER TABLE No need for a fake data type and the required index infrastructure. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
On Sat, May 30, 2009 at 1:11 PM, Bruce Momjian br...@momjian.us wrote: I have discovered a simpler solution using ALTER TABLE and calling a conversion function: test= CREATE TABLE tsvector_test(x tsvector); CREATE TABLE test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector test- USING conversion_func(x); ALTER TABLE No need for a fake data type and the required index infrastructure. I assume you're putting this in the list of commands to run post-migration along with any reindex commands etc? Because it will take a while (still faster than dump/reload i think). For this case, assuming the new tsvector's output function doesn't get confused by the old ordering, I think you can just use USING x::text::tsvector as your conversion expression. For more complex cases you might need to package up the old output function. Also note that you'll want to do any other conversions in the same table at the same time rather than doing multiple conversions. Also, one gotcha to note is that tsvector data can appear inside composite data types or arrays. I don't think that's common so perhaps just a warning in the readme would suffice, but it's something to note at least. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
Hi, Le 30 mai 09 à 16:02, Greg Stark a écrit : On Sat, May 30, 2009 at 1:11 PM, Bruce Momjian br...@momjian.us wrote: I have discovered a simpler solution using ALTER TABLE and calling a conversion function: test= CREATE TABLE tsvector_test(x tsvector); CREATE TABLE test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector test- USING conversion_func(x); ALTER TABLE No need for a fake data type and the required index infrastructure. I assume you're putting this in the list of commands to run post-migration along with any reindex commands etc? Because it will take a while (still faster than dump/reload i think). Just thinking some more about the idea to get all those post- processing steps running in parallel, it's occurring to me that we have all we need already: would it be possible for pg_migrator to issue a schema only script with a catalog, in the custom archive format? Then we could use pg_restore -j whatever post_migrator.script to run the last migration step. Of course, people will want the custom script output of pg_migrator to be optional, I guess. Regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
Bruce Momjian br...@momjian.us writes: I have discovered a simpler solution using ALTER TABLE and calling a conversion function: test= CREATE TABLE tsvector_test(x tsvector); CREATE TABLE test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector test- USING conversion_func(x); ALTER TABLE No need for a fake data type and the required index infrastructure. I think this is basically a large-caliber foot gun. You're going to pretend that invalid data is valid, until the user gets around to fixing it? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Clean shutdown and warm standby
On Fri, 2009-05-29 at 21:16 -0300, Euler Taveira de Oliveira wrote: Simon Riggs escreveu: And for them, it hasn't been completely fixed. That point was not made by patch author or committer, leaving the impression it was now completely safe, which, I truly regret to say, is not correct. Simon, could you point out what the patch does not do? If we can't fix it now, at least we add it to TODO. I already did, on my first post on this thread. I won't repeat myself, so that we can avoid restarting the discussion; forgive me. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
Dimitri Fontaine wrote: Hi, Le 30 mai 09 ? 16:02, Greg Stark a ?crit : On Sat, May 30, 2009 at 1:11 PM, Bruce Momjian br...@momjian.us wrote: I have discovered a simpler solution using ALTER TABLE and calling a conversion function: test= CREATE TABLE tsvector_test(x tsvector); CREATE TABLE test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector test- USING conversion_func(x); ALTER TABLE No need for a fake data type and the required index infrastructure. I assume you're putting this in the list of commands to run post-migration along with any reindex commands etc? Because it will take a while (still faster than dump/reload i think). Yes. Just thinking some more about the idea to get all those post- processing steps running in parallel, it's occurring to me that we have all we need already: would it be possible for pg_migrator to issue a schema only script with a catalog, in the custom archive format? Then we could use pg_restore -j whatever post_migrator.script to run the last migration step. Of course, people will want the custom script output of pg_migrator to be optional, I guess. Yes, eventually that is possible. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: I have discovered a simpler solution using ALTER TABLE and calling a conversion function: test= CREATE TABLE tsvector_test(x tsvector); CREATE TABLE test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector test- USING conversion_func(x); ALTER TABLE No need for a fake data type and the required index infrastructure. I think this is basically a large-caliber foot gun. You're going to pretend that invalid data is valid, until the user gets around to fixing it? What choice do we have? While we can mark indexes as invalid (which we do), how do we mark a table's contents as invalid? Should we create rules so no one can see the data and then have the ALTER TABLE script remove the rules after it is rebuilt? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
Bruce Momjian wrote: Just thinking some more about the idea to get all those post- processing steps running in parallel, it's occurring to me that we have all we need already: would it be possible for pg_migrator to issue a schema only script with a catalog, in the custom archive format? Then we could use pg_restore -j whatever post_migrator.script to run the last migration step. Of course, people will want the custom script output of pg_migrator to be optional, I guess. Yes, eventually that is possible. I have added that as a pg_migrator TODO item. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian br...@momjian.us writes: I have discovered a simpler solution using ALTER TABLE and calling a conversion function: test= CREATE TABLE tsvector_test(x tsvector); CREATE TABLE test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector test- USING conversion_func(x); ALTER TABLE No need for a fake data type and the required index infrastructure. I think this is basically a large-caliber foot gun. You're going to Uh, isn't pg_migrator itself a large-caliber foot gun? No need to call just part of it a foot gun. ;-) -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bytea vs. pg_dump
--On Samstag, Mai 30, 2009 00:47:16 +0300 Hannu Krosing ha...@2ndquadrant.com wrote: And we can also escape the need to uncompress TOAST'ed fields - just markup the compression as another \c at the beginning of data. Hmm i thought about that, but that seems only to make sense if there is an easy way to bypass compressing the data on restore. Also, it seems to me that compression/decompression isn't a real bottleneck, but that needs to be confirmed. -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bytea vs. pg_dump
--On Freitag, Mai 29, 2009 11:06:28 +0300 Peter Eisentraut pete...@gmx.net wrote: Btw., I have started to write some code for that. Cool. Let me know if i can help out somewhere. -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Clean shutdown and warm standby
Simon Riggs escreveu: And for them, it hasn't been completely fixed. That point was not made by patch author or committer, leaving the impression it was now completely safe, which, I truly regret to say, is not correct. Simon, could you point out what the patch does not do? If we can't fix it now, at least we add it to TODO. -- Euler Taveira de Oliveira http://www.timbira.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
Bruce Momjian br...@momjian.us writes: Tom Lane wrote: I think this is basically a large-caliber foot gun. You're going to pretend that invalid data is valid, until the user gets around to fixing it? What choice do we have? Create a fake data type, just as you said before. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] explain refactoring v2
Here's an updated version of my patch from last night. http://archives.postgresql.org/message-id/603c8f070905292048y804d505wf701214e7b81f...@mail.gmail.com In addition to the changes mentioned there, this removes a completely unused argument from show_scan_qual() and two redundant ones from show_sort_keys(). I realize this may not be the most glamorous patch that any of you have read in the last year... ...Robert *** a/src/backend/commands/explain.c --- b/src/backend/commands/explain.c *** *** 63,78 static void explain_outNode(StringInfo str, static void show_plan_tlist(Plan *plan, StringInfo str, int indent, ExplainState *es); static void show_scan_qual(List *qual, const char *qlabel, ! int scanrelid, Plan *scan_plan, Plan *outer_plan, StringInfo str, int indent, ExplainState *es); static void show_upper_qual(List *qual, const char *qlabel, Plan *plan, StringInfo str, int indent, ExplainState *es); ! static void show_sort_keys(Plan *sortplan, int nkeys, AttrNumber *keycols, ! const char *qlabel, StringInfo str, int indent, ExplainState *es); static void show_sort_info(SortState *sortstate, StringInfo str, int indent, ExplainState *es); static const char *explain_get_index_name(Oid indexId); /* --- 63,78 static void show_plan_tlist(Plan *plan, StringInfo str, int indent, ExplainState *es); static void show_scan_qual(List *qual, const char *qlabel, ! Plan *scan_plan, Plan *outer_plan, StringInfo str, int indent, ExplainState *es); static void show_upper_qual(List *qual, const char *qlabel, Plan *plan, StringInfo str, int indent, ExplainState *es); ! static void show_sort_keys(Plan *sortplan, const char *qlabel, StringInfo str, int indent, ExplainState *es); static void show_sort_info(SortState *sortstate, StringInfo str, int indent, ExplainState *es); static const char *explain_get_index_name(Oid indexId); + static void ExplainScanTarget(StringInfo str, Scan *plan, ExplainState *es); /* *** *** 668,790 explain_outNode(StringInfo str, case T_SeqScan: case T_BitmapHeapScan: case T_TidScan: - if (((Scan *) plan)-scanrelid 0) - { - RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid, - es-rtable); - char *relname; - - /* Assume it's on a real relation */ - Assert(rte-rtekind == RTE_RELATION); - - /* We only show the rel name, not schema name */ - relname = get_rel_name(rte-relid); - - appendStringInfo(str, on %s, - quote_identifier(relname)); - if (strcmp(rte-eref-aliasname, relname) != 0) - appendStringInfo(str, %s, - quote_identifier(rte-eref-aliasname)); - } - break; - case T_BitmapIndexScan: - appendStringInfo(str, on %s, - explain_get_index_name(((BitmapIndexScan *) plan)-indexid)); - break; case T_SubqueryScan: - if (((Scan *) plan)-scanrelid 0) - { - RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid, - es-rtable); - - appendStringInfo(str, %s, - quote_identifier(rte-eref-aliasname)); - } - break; case T_FunctionScan: - if (((Scan *) plan)-scanrelid 0) - { - RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid, - es-rtable); - Node *funcexpr; - char *proname; - - /* Assert it's on a RangeFunction */ - Assert(rte-rtekind == RTE_FUNCTION); - - /* - * If the expression is still a function call, we can get the - * real name of the function. Otherwise, punt (this can - * happen if the optimizer simplified away the function call, - * for example). - */ - funcexpr = ((FunctionScan *) plan)-funcexpr; - if (funcexpr IsA(funcexpr, FuncExpr)) - { - Oid funcid = ((FuncExpr *) funcexpr)-funcid; - - /* We only show the func name, not schema name */ - proname = get_func_name(funcid); - } - else - proname = rte-eref-aliasname; - - appendStringInfo(str, on %s, - quote_identifier(proname)); - if (strcmp(rte-eref-aliasname, proname) != 0) - appendStringInfo(str, %s, - quote_identifier(rte-eref-aliasname)); - } - break; case T_ValuesScan: - if (((Scan *) plan)-scanrelid 0) - { - RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid, - es-rtable); - char *valsname; - - /* Assert it's on a values rte */ - Assert(rte-rtekind == RTE_VALUES); - - valsname = rte-eref-aliasname; - - appendStringInfo(str, on %s, - quote_identifier(valsname)); - } - break; case T_CteScan: - if (((Scan *) plan)-scanrelid 0) - { - RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid, - es-rtable); - - /* Assert it's on a non-self-reference CTE */ - Assert(rte-rtekind == RTE_CTE); - Assert(!rte-self_reference); - -
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
On Sat, May 30, 2009 at 6:23 PM, Bruce Momjian br...@momjian.us wrote: I think this is basically a large-caliber foot gun. You're going to pretend that invalid data is valid, until the user gets around to fixing it? What choice do we have? Well you can store the data in a new fake data type (or even just mark the column as a bytea -- since any varlena is as good as any other). Then provide this conversion function to create the new data. I suppose that means you should drop the indexes since if you leave them things could get weird. But doing the conversion would have to rebuild indexes anyways so CREATE INDEX should run in the same time as that step of the conversion would have taken. It would be nice if you could leave them around so the conversion would rebuild them automatically, but that would require creating operators and an opclass for the fake data type which would be more of a pain than just marking the column as a bytea or a data type with no operators. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] ruby connect
I'm trying to connect ruby to postgres on ubuntu and the only link I found that has the library is down. Does anyone have the postgres library for ruby? Or direct me to it? Justin Darby Carrera Database Programmer/Analyst MED-Department of Medical Social Sciences j-carr...@northwestern.edumailto:j-carr...@northwestern.edu 312-695-8994
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Tom Lane wrote: I think this is basically a large-caliber foot gun. You're going to pretend that invalid data is valid, until the user gets around to fixing it? What choice do we have? Create a fake data type, just as you said before. But I have to create fake index infastructure, and I have to then modify every reference to that data time at the end, even references in rules and expression indexes --- that seems too complex. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
Greg Stark wrote: On Sat, May 30, 2009 at 1:11 PM, Bruce Momjian br...@momjian.us wrote: I have discovered a simpler solution using ALTER TABLE and calling a conversion function: ? ? ? ?test= CREATE TABLE tsvector_test(x tsvector); ? ? ? ?CREATE TABLE ? ? ? ?test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector ? ? ? ?test- USING conversion_func(x); ? ? ? ?ALTER TABLE No need for a fake data type and the required index infrastructure. I assume you're putting this in the list of commands to run post-migration along with any reindex commands etc? Because it will take a while (still faster than dump/reload i think). Yes, current CVS now does this. I added logic so if you rebuilt the table, the index will not be rebuilt again because it is a GIN index. For this case, assuming the new tsvector's output function doesn't get confused by the old ordering, I think you can just use USING x::text::tsvector as your conversion expression. For more complex cases you might need to package up the old output function. Wow, cool idea. I will test that. Also note that you'll want to do any other conversions in the same table at the same time rather than doing multiple conversions. Wow, I didn't know we supported that too: ALTER TABLE t ALTER COLUMN x TYPE tsvector USING x, ALTER COLUMN y TYPE tsvector USING y; I will work on that also. Also, one gotcha to note is that tsvector data can appear inside composite data types or arrays. I don't think that's common so perhaps just a warning in the readme would suffice, but it's something to note at least. Wow, not sure how I would even find that. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type
Greg Stark wrote: On Sat, May 30, 2009 at 6:23 PM, Bruce Momjian br...@momjian.us wrote: I think this is basically a large-caliber foot gun. ?You're going to pretend that invalid data is valid, until the user gets around to fixing it? What choice do we have? Well you can store the data in a new fake data type (or even just mark the column as a bytea -- since any varlena is as good as any other). Then provide this conversion function to create the new data. I suppose that means you should drop the indexes since if you leave them things could get weird. But doing the conversion would have to rebuild indexes anyways so CREATE INDEX should run in the same time as that step of the conversion would have taken. It would be nice if you could leave them around so the conversion would rebuild them automatically, but that would require creating operators and an opclass for the fake data type which would be more of a pain than just marking the column as a bytea or a data type with no operators. The way the restore works, you would have to call the data type 'tsvector', and then that oid is linked around in all sort of odd places and you need to change them all, hence confusion. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers