[HACKERS] dot to be considered as a word delimiter?

2009-05-30 Thread Sushant Sinha
Currently it seems like that dot is not considered as a word delimiter
by the english parser.

lawdb=# select to_tsvector('english', 'Mr.J.Sai Deepak');
   to_tsvector   
-
 'deepak':2 'mr.j.sai':1
(1 row)

So the word obtained is mr.j.sai rather than three words mr, j,
sai

It does it correctly if there is space in between, as space is
definitely a word delimiter.

lawdb=# select to_tsvector('english', 'Mr. J. Sai Deepak');
   to_tsvector   
-
 'j':2 'mr':1 'sai':3 'deepak':4
(1 row)


I think that dot should be considered by as a word delimiter because
when dot is not followed by a space, most of the time it is an error in
typing. Beside they are not many valid english words that have dot in
between.

-Sushant.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path improvements WAS: search_path vs extensions

2009-05-30 Thread David E. Wheeler

On May 29, 2009, at 5:16 PM, Greg Stark wrote:

On Fri, May 29, 2009 at 11:03 PM, David E. Wheeler da...@kineticode.com 
 wrote:

On May 29, 2009, at 2:52 PM, Josh Berkus wrote:


a) the ability to push a schema onto the current search path
b) the ability to pull a schema off the current search path


push, pop, shift, unshift. :-)

Come to think of it, I want these for arrays, too. ;-)


push and unshift sure -- and you do have those for arrays, it's  
spelled ||.


Well, no, not quite, as push, pop, shift, and unshift change arrays in  
place, whereas || (and array_concat() and array_append() and  
array_prepend()) create and return a new array.



I'm not so sure about pop/shift though. How would you know the element
you want is at the beginning/end unless you just put it there?


Well, for arrays, I might use them in a function just as I currently  
use them for various things in Perl. For search_path, yeah, I'd use  
them for lexical scoping: unshift a path onto the search path at the  
beginning of the function and shift it off at the end. Or push it onto  
the end of the search path and pop it off at the end.



I think what you really want is to use SET LOCAL or RESET to restore
it to whatever it was before you started futzing with it. We might
need a more flexible way to do that that isn't tied to transactions
though.


Well, lexical scoping for the changes would help, for sure. And I  
think that RESET is fine, but it would be more useful to have ways to  
push and unshift, or even to splice (sometimes I might want to remove  
a schema from the search path no matter where it appears in the path).


Best,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Bruce Momjian
Bruce Momjian wrote:
 Tom Lane wrote:
  Josh Berkus j...@agliodbs.com writes:
   It would be nice to have pg_migrator handle this, especially if we could 
   do it in parallel.  Then we just have to warn users that migrating a 
   database with tsvector columns takes significantly longer.  That is,
  
   1) do rest of catalog swap and link/copy of objects.
   2) mark all tsvector columns as 83_tsvector and add new tsvector type
   (these columns will be unusable for queries)
   3) bring up database
   4) search for all 83_tsvector columns
   5) do ALTER TABLE on each of these columns, in parallel, up to a 
   configuration setting (default 3).
  
  pg_migrator is already emitting a script that is intended to be run
  after conversion, to handle REINDEXing of incompatible indexes.  That
  could easily be made to do ALTER TYPE on old tsvector columns too, no?
 
 Hmmm.  Well, the problem right now with v8_3_tsvector is that it fails
 on index create, even when the index is empty, because I can't figure
 out how to simply set up the proper index catalog entries.  Once that is
 fixed and I can bind tsvector to v8_3_tsvector on schema creation, I can
 easily emit ALTER TABLE to fix the issue.  And, at that point the
 tsvector GIN indexes would be automatically created so I can skip that
 part.

I have discovered a simpler solution using ALTER TABLE and calling a
conversion function:

test= CREATE TABLE tsvector_test(x tsvector);
CREATE TABLE
test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector 
test- USING conversion_func(x);
ALTER TABLE

No need for a fake data type and the required index infrastructure.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Greg Stark
On Sat, May 30, 2009 at 1:11 PM, Bruce Momjian br...@momjian.us wrote:
 I have discovered a simpler solution using ALTER TABLE and calling a
 conversion function:

        test= CREATE TABLE tsvector_test(x tsvector);
        CREATE TABLE
        test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector
        test- USING conversion_func(x);
        ALTER TABLE

 No need for a fake data type and the required index infrastructure.

I assume you're putting this in the list of commands to run
post-migration along with any reindex commands etc? Because it will
take a while (still faster than dump/reload i think).

For this case, assuming the new tsvector's output function doesn't get
confused by the old ordering, I think you can just use USING
x::text::tsvector as your conversion expression. For more complex
cases you might need to package up the old output function.

Also note that you'll want to do any other conversions in the same
table at the same time rather than doing multiple conversions.

Also, one gotcha to note is that tsvector data can appear inside
composite data types or arrays. I don't think that's common so perhaps
just a warning in the readme would suffice, but it's something to note
at least.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Dimitri Fontaine

Hi,

Le 30 mai 09 à 16:02, Greg Stark a écrit :
On Sat, May 30, 2009 at 1:11 PM, Bruce Momjian br...@momjian.us  
wrote:

I have discovered a simpler solution using ALTER TABLE and calling a
conversion function:

   test= CREATE TABLE tsvector_test(x tsvector);
   CREATE TABLE
   test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector
   test- USING conversion_func(x);
   ALTER TABLE

No need for a fake data type and the required index infrastructure.


I assume you're putting this in the list of commands to run
post-migration along with any reindex commands etc? Because it will
take a while (still faster than dump/reload i think).


Just thinking some more about the idea to get all those post- 
processing steps running in parallel, it's occurring to me that we  
have all we need already: would it be possible for pg_migrator to  
issue a schema only script with a catalog, in the custom archive format?
Then we could use pg_restore -j whatever post_migrator.script to run  
the last migration step.


Of course, people will want the custom script output of pg_migrator to  
be optional, I guess.


Regards,
--
dim
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 I have discovered a simpler solution using ALTER TABLE and calling a
 conversion function:

   test= CREATE TABLE tsvector_test(x tsvector);
   CREATE TABLE
   test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector 
   test- USING conversion_func(x);
   ALTER TABLE

 No need for a fake data type and the required index infrastructure.

I think this is basically a large-caliber foot gun.  You're going to
pretend that invalid data is valid, until the user gets around to fixing
it?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean shutdown and warm standby

2009-05-30 Thread Simon Riggs

On Fri, 2009-05-29 at 21:16 -0300, Euler Taveira de Oliveira wrote:
 Simon Riggs escreveu:
  And for them, it hasn't been completely fixed. That point was not made
  by patch author or committer, leaving the impression it was now
  completely safe, which, I truly regret to say, is not correct.
  
 Simon, could you point out what the patch does not do? If we can't fix it now,
 at least we add it to TODO.

I already did, on my first post on this thread. I won't repeat myself,
so that we can avoid restarting the discussion; forgive me.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Bruce Momjian
Dimitri Fontaine wrote:
 Hi,
 
 Le 30 mai 09 ? 16:02, Greg Stark a ?crit :
  On Sat, May 30, 2009 at 1:11 PM, Bruce Momjian br...@momjian.us  
  wrote:
  I have discovered a simpler solution using ALTER TABLE and calling a
  conversion function:
 
 test= CREATE TABLE tsvector_test(x tsvector);
 CREATE TABLE
 test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector
 test- USING conversion_func(x);
 ALTER TABLE
 
  No need for a fake data type and the required index infrastructure.
 
  I assume you're putting this in the list of commands to run
  post-migration along with any reindex commands etc? Because it will
  take a while (still faster than dump/reload i think).

Yes.

 Just thinking some more about the idea to get all those post- 
 processing steps running in parallel, it's occurring to me that we  
 have all we need already: would it be possible for pg_migrator to  
 issue a schema only script with a catalog, in the custom archive format?
 Then we could use pg_restore -j whatever post_migrator.script to run  
 the last migration step.
 
 Of course, people will want the custom script output of pg_migrator to  
 be optional, I guess.

Yes, eventually that is possible.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  I have discovered a simpler solution using ALTER TABLE and calling a
  conversion function:
 
  test= CREATE TABLE tsvector_test(x tsvector);
  CREATE TABLE
  test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector 
  test- USING conversion_func(x);
  ALTER TABLE
 
  No need for a fake data type and the required index infrastructure.
 
 I think this is basically a large-caliber foot gun.  You're going to
 pretend that invalid data is valid, until the user gets around to fixing
 it?

What choice do we have?  While we can mark indexes as invalid (which we
do), how do we mark a table's contents as invalid?  Should we create
rules so no one can see the data and then have the ALTER TABLE script
remove the rules after it is rebuilt?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Bruce Momjian
Bruce Momjian wrote:
  Just thinking some more about the idea to get all those post- 
  processing steps running in parallel, it's occurring to me that we  
  have all we need already: would it be possible for pg_migrator to  
  issue a schema only script with a catalog, in the custom archive format?
  Then we could use pg_restore -j whatever post_migrator.script to run  
  the last migration step.
  
  Of course, people will want the custom script output of pg_migrator to  
  be optional, I guess.
 
 Yes, eventually that is possible.

I have added that as a pg_migrator TODO item.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Bruce Momjian
Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   I have discovered a simpler solution using ALTER TABLE and calling a
   conversion function:
  
 test= CREATE TABLE tsvector_test(x tsvector);
 CREATE TABLE
 test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector 
 test- USING conversion_func(x);
 ALTER TABLE
  
   No need for a fake data type and the required index infrastructure.
  
  I think this is basically a large-caliber foot gun.  You're going to

Uh, isn't pg_migrator itself a large-caliber foot gun?  No need to call
just part of it a foot gun.  ;-)

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-30 Thread Bernd Helmle
--On Samstag, Mai 30, 2009 00:47:16 +0300 Hannu Krosing 
ha...@2ndquadrant.com wrote:



And we can also escape the need to uncompress TOAST'ed fields - just
markup the compression as another \c at the beginning of data.


Hmm i thought about that, but that seems only to make sense if there is an 
easy way to bypass compressing the data on restore. Also, it seems to me 
that compression/decompression isn't a real bottleneck, but that needs to 
be confirmed.


--
 Thanks

   Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-30 Thread Bernd Helmle
--On Freitag, Mai 29, 2009 11:06:28 +0300 Peter Eisentraut 
pete...@gmx.net wrote:




Btw., I have started to write some code for that.


Cool. Let me know if i can help out somewhere.

--
 Thanks

   Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean shutdown and warm standby

2009-05-30 Thread Euler Taveira de Oliveira
Simon Riggs escreveu:
 And for them, it hasn't been completely fixed. That point was not made
 by patch author or committer, leaving the impression it was now
 completely safe, which, I truly regret to say, is not correct.
 
Simon, could you point out what the patch does not do? If we can't fix it now,
at least we add it to TODO.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Tom Lane wrote:
 I think this is basically a large-caliber foot gun.  You're going to
 pretend that invalid data is valid, until the user gets around to fixing
 it?

 What choice do we have?

Create a fake data type, just as you said before.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] explain refactoring v2

2009-05-30 Thread Robert Haas
Here's an updated version of my patch from last night.

http://archives.postgresql.org/message-id/603c8f070905292048y804d505wf701214e7b81f...@mail.gmail.com

In addition to the changes mentioned there, this removes a completely
unused argument from show_scan_qual() and two redundant ones from
show_sort_keys().

I realize this may not be the most glamorous patch that any of you
have read in the last year...

...Robert
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***
*** 63,78  static void explain_outNode(StringInfo str,
  static void show_plan_tlist(Plan *plan,
  			StringInfo str, int indent, ExplainState *es);
  static void show_scan_qual(List *qual, const char *qlabel,
! 			   int scanrelid, Plan *scan_plan, Plan *outer_plan,
  			   StringInfo str, int indent, ExplainState *es);
  static void show_upper_qual(List *qual, const char *qlabel, Plan *plan,
  StringInfo str, int indent, ExplainState *es);
! static void show_sort_keys(Plan *sortplan, int nkeys, AttrNumber *keycols,
! 			   const char *qlabel,
  			   StringInfo str, int indent, ExplainState *es);
  static void show_sort_info(SortState *sortstate,
  			   StringInfo str, int indent, ExplainState *es);
  static const char *explain_get_index_name(Oid indexId);
  
  
  /*
--- 63,78 
  static void show_plan_tlist(Plan *plan,
  			StringInfo str, int indent, ExplainState *es);
  static void show_scan_qual(List *qual, const char *qlabel,
! 			   Plan *scan_plan, Plan *outer_plan,
  			   StringInfo str, int indent, ExplainState *es);
  static void show_upper_qual(List *qual, const char *qlabel, Plan *plan,
  StringInfo str, int indent, ExplainState *es);
! static void show_sort_keys(Plan *sortplan, const char *qlabel,
  			   StringInfo str, int indent, ExplainState *es);
  static void show_sort_info(SortState *sortstate,
  			   StringInfo str, int indent, ExplainState *es);
  static const char *explain_get_index_name(Oid indexId);
+ static void ExplainScanTarget(StringInfo str, Scan *plan, ExplainState *es);
  
  
  /*
***
*** 668,790  explain_outNode(StringInfo str,
  		case T_SeqScan:
  		case T_BitmapHeapScan:
  		case T_TidScan:
- 			if (((Scan *) plan)-scanrelid  0)
- 			{
- RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid,
- 			  es-rtable);
- char	   *relname;
- 
- /* Assume it's on a real relation */
- Assert(rte-rtekind == RTE_RELATION);
- 
- /* We only show the rel name, not schema name */
- relname = get_rel_name(rte-relid);
- 
- appendStringInfo(str,  on %s,
-  quote_identifier(relname));
- if (strcmp(rte-eref-aliasname, relname) != 0)
- 	appendStringInfo(str,  %s,
- 	 quote_identifier(rte-eref-aliasname));
- 			}
- 			break;
- 		case T_BitmapIndexScan:
- 			appendStringInfo(str,  on %s,
- explain_get_index_name(((BitmapIndexScan *) plan)-indexid));
- 			break;
  		case T_SubqueryScan:
- 			if (((Scan *) plan)-scanrelid  0)
- 			{
- RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid,
- 			  es-rtable);
- 
- appendStringInfo(str,  %s,
-  quote_identifier(rte-eref-aliasname));
- 			}
- 			break;
  		case T_FunctionScan:
- 			if (((Scan *) plan)-scanrelid  0)
- 			{
- RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid,
- 			  es-rtable);
- Node	   *funcexpr;
- char	   *proname;
- 
- /* Assert it's on a RangeFunction */
- Assert(rte-rtekind == RTE_FUNCTION);
- 
- /*
-  * If the expression is still a function call, we can get the
-  * real name of the function.  Otherwise, punt (this can
-  * happen if the optimizer simplified away the function call,
-  * for example).
-  */
- funcexpr = ((FunctionScan *) plan)-funcexpr;
- if (funcexpr  IsA(funcexpr, FuncExpr))
- {
- 	Oid			funcid = ((FuncExpr *) funcexpr)-funcid;
- 
- 	/* We only show the func name, not schema name */
- 	proname = get_func_name(funcid);
- }
- else
- 	proname = rte-eref-aliasname;
- 
- appendStringInfo(str,  on %s,
-  quote_identifier(proname));
- if (strcmp(rte-eref-aliasname, proname) != 0)
- 	appendStringInfo(str,  %s,
- 	 quote_identifier(rte-eref-aliasname));
- 			}
- 			break;
  		case T_ValuesScan:
- 			if (((Scan *) plan)-scanrelid  0)
- 			{
- RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid,
- 			  es-rtable);
- char	   *valsname;
- 
- /* Assert it's on a values rte */
- Assert(rte-rtekind == RTE_VALUES);
- 
- valsname = rte-eref-aliasname;
- 
- appendStringInfo(str,  on %s,
-  quote_identifier(valsname));
- 			}
- 			break;
  		case T_CteScan:
- 			if (((Scan *) plan)-scanrelid  0)
- 			{
- RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid,
- 			  es-rtable);
- 
- /* Assert it's on a non-self-reference CTE */
- Assert(rte-rtekind == RTE_CTE);
- Assert(!rte-self_reference);
- 
- 

Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Greg Stark
On Sat, May 30, 2009 at 6:23 PM, Bruce Momjian br...@momjian.us wrote:
 I think this is basically a large-caliber foot gun.  You're going to
 pretend that invalid data is valid, until the user gets around to fixing
 it?

 What choice do we have?

Well you can store the data in a new fake data type (or even just mark
the column as a bytea -- since any varlena is as good as any other).
Then provide this conversion function to create the new data.

I suppose that means you should drop the indexes since if you leave
them things could get weird. But doing the conversion would have to
rebuild indexes anyways so CREATE INDEX should run in the same time as
that step of the conversion would have taken. It would be nice if you
could leave them around so the conversion would rebuild them
automatically, but that would require creating operators and an
opclass for the fake data type which would be more of a pain than just
marking the column as a bytea or a data type with no operators.



-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] ruby connect

2009-05-30 Thread Justin Carrera
I'm trying to connect ruby to postgres on ubuntu and the only link I found that 
has the library is down.  Does anyone have the postgres library for ruby?  Or 
direct me to it?

Justin Darby Carrera
Database Programmer/Analyst
MED-Department of Medical Social Sciences
j-carr...@northwestern.edumailto:j-carr...@northwestern.edu
312-695-8994



Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Tom Lane wrote:
  I think this is basically a large-caliber foot gun.  You're going to
  pretend that invalid data is valid, until the user gets around to fixing
  it?
 
  What choice do we have?
 
 Create a fake data type, just as you said before.

But I have to create fake index infastructure, and I have to then modify
every reference to that data time at the end, even references in rules
and expression indexes --- that seems too complex.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Bruce Momjian
Greg Stark wrote:
 On Sat, May 30, 2009 at 1:11 PM, Bruce Momjian br...@momjian.us wrote:
  I have discovered a simpler solution using ALTER TABLE and calling a
  conversion function:
 
  ? ? ? ?test= CREATE TABLE tsvector_test(x tsvector);
  ? ? ? ?CREATE TABLE
  ? ? ? ?test= ALTER TABLE tsvector_test ALTER COLUMN x TYPE tsvector
  ? ? ? ?test- USING conversion_func(x);
  ? ? ? ?ALTER TABLE
 
  No need for a fake data type and the required index infrastructure.
 
 I assume you're putting this in the list of commands to run
 post-migration along with any reindex commands etc? Because it will
 take a while (still faster than dump/reload i think).

Yes, current CVS now does this.  I added logic so if you rebuilt the
table, the index will not be rebuilt again because it is a GIN index.

 For this case, assuming the new tsvector's output function doesn't get
 confused by the old ordering, I think you can just use USING
 x::text::tsvector as your conversion expression. For more complex
 cases you might need to package up the old output function.

Wow, cool idea.  I will test that.

 Also note that you'll want to do any other conversions in the same
 table at the same time rather than doing multiple conversions.

Wow, I didn't know we supported that too:

 ALTER TABLE t ALTER COLUMN x TYPE tsvector USING x,  ALTER
COLUMN y TYPE tsvector USING y;

I will work on that also.

 Also, one gotcha to note is that tsvector data can appear inside
 composite data types or arrays. I don't think that's common so perhaps
 just a warning in the readme would suffice, but it's something to note
 at least.

Wow, not sure how I would even find that.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-30 Thread Bruce Momjian
Greg Stark wrote:
 On Sat, May 30, 2009 at 6:23 PM, Bruce Momjian br...@momjian.us wrote:
  I think this is basically a large-caliber foot gun. ?You're going to
  pretend that invalid data is valid, until the user gets around to fixing
  it?
 
  What choice do we have?
 
 Well you can store the data in a new fake data type (or even just mark
 the column as a bytea -- since any varlena is as good as any other).
 Then provide this conversion function to create the new data.
 
 I suppose that means you should drop the indexes since if you leave
 them things could get weird. But doing the conversion would have to
 rebuild indexes anyways so CREATE INDEX should run in the same time as
 that step of the conversion would have taken. It would be nice if you
 could leave them around so the conversion would rebuild them
 automatically, but that would require creating operators and an
 opclass for the fake data type which would be more of a pain than just
 marking the column as a bytea or a data type with no operators.

The way the restore works, you would have to call the data type
'tsvector', and then that oid is linked around in all sort of odd places
and you need to change them all, hence confusion.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers