Fwd: [GENERAL] Getting several columns from subselect with LIMIT 1
Thanks for the suggestion. Unfortunately I get an INVALID COLUMN REFERENCE (SQL state: 42P10) to the effect that the subselect in the FROM clause cannot reference other tables at the same request level. 2008/9/20 Marcus Engene <[EMAIL PROTECTED]>: >> >> SELECT >>persons.*, >>( >>SELECT child.name >>FROM progeny JOIN persons child ON child.id = progeny.child >>WHERE progeny.parent = persons.id >>ORDER BY child.birthdate ASC >>LIMIT 1 >>) AS firstborn_name >> FROM persons; >> >> Now, this is probably not the most elegant piece of code, but the real >> problem is that >> I cannot see how to extend it to the case where I want not only the >> firstborn's name but also the firstborn's ID >> (short of repeating the entire subselect a second time). At the >> moment, with this current syntax, my subSELECT statement would not be >> allowed to return more than a single column. > > Would this work? > > select > p.* > ,pp.* > from > persons p > ,( > SELECT child.name, child.id > FROM progeny JOIN persons child ON child.id = progeny.child > WHERE progeny.parent = p.id > ORDER BY child.birthdate ASC > LIMIT 1 > ) as kid(kid_name,kid_id) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Getting several columns from subselect with LIMIT 1
I have a PERSONS table. I also have a PROGENY table, which is a many-to-many association table with two foreign keys to the PERSONS table to itself. (In this day and age, not only can an individual have any number of children, but also a person can have any number of parents! At least, let's assume that's true for the sake of this setup.) Suppose I wish to construct a view of the persons, along with the name of their first-born (if they have one; NULL otherwise). The following SELECT does just that: SELECT persons.*, ( SELECT child.name FROM progeny JOIN persons child ON child.id = progeny.child WHERE progeny.parent = persons.id ORDER BY child.birthdate ASC LIMIT 1 ) AS firstborn_name FROM persons; Now, this is probably not the most elegant piece of code, but the real problem is that I cannot see how to extend it to the case where I want not only the firstborn's name but also the firstborn's ID (short of repeating the entire subselect a second time). At the moment, with this current syntax, my subSELECT statement would not be allowed to return more than a single column. Any suggestion? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] tsearch strategy for incremental search
2008/6/30 Oleg Bartunov <[EMAIL PROTECTED]>: > you, probably, can use custom configuration, which uses pg_3chars > dictionary. Thanks Oleg. That sounds like a cleverer (and more natural) idea than mine. I am intrigued: is "pg_3chars" something that's part of the current distribution of postgresql (I did a quick search and found nothing), or are you implying that it would be worth writing a tsearch configuration along the lines of my earlier ideas? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] tsearch strategy for incremental search
2008/6/30 Oleg Bartunov <[EMAIL PROTECTED]>: > tsearch will have prefix search support in 8.4. Thanks Oleg! That's fantastic news! In the meantime, carrying on with my earlier idea, here's the little function I came up with for extracting the 3-char-lexeme tsvector, in case anyone's interested: CREATE OR REPLACE FUNCTION public.to_three_char_tsvector(str text) RETURNS tsvector AS $BODY$declare somerow record; shortened_text text := ''; BEGIN FOR somerow IN SELECT regexp_split_to_table(trim(both '\'' from strip(to_tsvector('simple',str))::text),'\'') AS item LOOP shortened_text := shortened_text || ' ' || COALESCE(substring(somerow.item for 3), ''); END LOOP; RETURN strip(to_tsvector(shortened_text)); END;$BODY$ LANGUAGE 'plpgsql' IMMUTABLE; -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] tsearch strategy for incremental search
I am trying to implement an incremental search engine. The service should start searching when the user has typed at least 3 characters. I am thinking of using the following strategy: a) Create a function string_to_three_char_tsvector(str text) that would generate the tsvector composed of the three-letter lexemes that begin all the admissible words within the parameter str. b) Using this function, create an indexed tsvector column: three_char_index. c) Given the query string query_string (assume query_string containing at least 3 characters): SELECT * FROM mytable, plainto_tsquery((string_to_three_char_tsvector(query_string))::text) AS query WHERE three_char_index @@ query AND text_field LIKE '%' || str || '%'; Once I've narrowed the field of possibilities down to the correct 3-letter lexemes, there are fewer than 100 lines to search through with LIKE. I could even repeat the exercise with 4-letter lexemes if these numbers were to grow or if I needed the extra boost in performance. So, two questions to postgres/tsearch experts: 1) Does that seem like a decent overall strategy? 2) About the function string_to_three_char_tsvector(text), I cannot think of an elegant way of writing this. Is it possible to do better than the following: str => cast to tsvector => cast to text => for each lexeme-string, take first-three-char substring => concat back together => cast to tsvector Is there a nice way of performing the middle operation? Like splitting the string to an array... -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] 8.3: where's the replacement tsearch2 module?
> > My database uses tsearch2. I was about to follow the conversions > > instructions found at Appendix F31 (on the new tsearch module). > > http://www.postgresql.org/docs/8.3/static/tsearch2.html > The docs will need to be updated because tsearch2 is now in the core and > should already be available, it's not an external module any more. Appendix F31 (which I quoted previously) indeed acknowledges that fact (that tsearch2 is in the core). The instructions I focused on explain how to "convert" an 8.2 database already using tsearch2, without going through the trouble of renaming the tsearch2 functions. Therefore, I think the docs are fine. The problem is the missing tsearch2.sql script... >You'd have to edit it to replace MODULE_PATHNAME with the right thing. >A bigger problem is that the tsearch2 dll is probably omitted too :-( Indeed, I cannot find tsearch2.dll anywhere... ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] 8.3: where's the replacement tsearch2 module?
I just downloaded the 8.3 Windows installation (binary with installer). My database uses tsearch2. I was about to follow the conversions instructions found at Appendix F31 (on the new tsearch module). http://www.postgresql.org/docs/8.3/static/tsearch2.html However, I hit a problem when I get to step 2: "In the new installation, create empty database(s) and install the replacement tsearch2 module into each database that will use text search." I am assuming that the "replacement tsearch2 module" is some file "tsearch2.sql" found in the folder share/contrib. However, no such file, or anything that looks remotely like it, in that folder or in any folder around. (If I download the sources, I can find a 16 Kb file entitled "tsearch2.sql.in" inside contrib/tsearch2. Is that the file I want?) For reference, here is the full content of the share/contrib folder, after the Windows installation: _int.sql adminpack.sql autoinc.sql btree_gist.sql chkpass.sql cube.sql dblink.sql earthdistance.sql fuzzystrmatch.sql hstore.sql insert_username.sql int_aggregate.sql isn.sql lo.sql ltree.sql moddatetime.sql pageinspect.sql pg_buffercache.sql pg_freespacemap.sql pg_trgm.sql pgcrypto.sql pgrowlocks.sql pgstattuple.sql pgxml.sql pldbgapi.sql refint.sql seg.sql sslinfo.sql tablefunc.sql timetravel.sql uninstall__int.sql uninstall_adminpack.sql uninstall_btree_gist.sql uninstall_chkpass.sql uninstall_cube.sql uninstall_dblink.sql uninstall_earthdistance.sql uninstall_fuzzystrmatch.sql uninstall_hstore.sql uninstall_int_aggregate.sql uninstall_isn.sql uninstall_lo.sql uninstall_ltree.sql uninstall_pageinspect.sql uninstall_pg_buffercache.sql uninstall_pg_freespacemap.sql uninstall_pg_trgm.sql uninstall_pgrowlocks.sql uninstall_pgstattuple.sql uninstall_pgxml.sql uninstall_seg.sql uninstall_sslinfo.sql uninstall_tablefunc.sql ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[GENERAL] varchar(n) VS text
Having read http://www.postgresql.org/docs/8.2/interactive/datatype-character.html I am puzzling over this issue: 1) Is there ever ANY reason to prefer "varchar(n)" to "text" as a column type? 2) For instance, if I know that a character-type column will never contain more than 300 characters, would I be better to define it as "varchar(300)" or as "text"? 3) What if, in practice that same column usually contains strings of no more than 10 characters (although I can't guarantee that it wouldn't explode up to 300 in the one-in-a-million case)? 4) On the other hand, what if a column systematically contains strings of 5 or fewer characters. Is it better to define it as "varchar(5)" or as "text"? From my reading of the dataype documentation, the ONLY reason I can think of for using "varchar(n)" would be in order to add an extra data-type constraint to the column. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] "simple" dict with stop words in tsearch2
Brilliant! Thank you! > In tsearch2, I would like to use the "simple" dictionary along with my > own list of stopwords. > [...] sure, just specify dict_initoption. For example, test=# update pg_ts_dict set dict_initoption='contrib/english.stop' where dict_name='simple'; UPDATE 1 test=# select lexize('simple', 'the'); lexize {} (1 row) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] "simple" dict with stop words in tsearch2
In tsearch2, I would like to use the "simple" dictionary along with my own list of stopwords. In other words, once the text is parsed into tokens, no stemming whatsoever, but stopwords are removed. Is there an easy way to produce that result, using the standard "simple" dictionary? ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Fwd: [GENERAL] convert(USING utf8_to_iso_8859_15) on Windows
Thanks, Tom, for the comment. (Sorry for emailing directly to you: pressed "send" too quickly!) Although that raises further questions: * Is there a text that documents all that is known about the encoding issues between PostgreSQL and Windows? Surely, this is likely to be a "fairly" widespread issue! So far, everything I've read had to do with mysterious bad omens with never specific statements about what's what, and what can (or cannot) be done to solve the situation satisfactorily... * Windows XP does support UTF8, yet it is not possible (as far as I know) to define one's locale to have anything to do with UTF8 (presumably in the sense that UTF8 isn't an aspect of a specific locale): there is no en_US.UTF8 or fr_CA.UTF8 locales, for instance. But why should this matter? Say I am entering the data through a piece of software that works with UTF8, via the ODBC driver. Say again that I output the data with another software that expects UTF8, via the JDBC driver. Why does it matter that my system should be localized in another encoding? 2007/1/29, Tom Lane <[EMAIL PROTECTED]>: "Pierre Thibaudeau" <[EMAIL PROTECTED]> writes: > My setup is as follows: > PostgreSQL 8.2.1 on WindowsXP > The database has UTF8 encoding. > SHOW lc_ctype; gives: "French_Canada.1252" I'm not sure about any Windows-specific issues, but in general it's a really bad idea to be using lc_collate or lc_ctype that is incompatible with the database encoding. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
[GENERAL] convert(USING utf8_to_iso_8859_15) on Windows
Is this a documented phenomenon with the "convert" function? The first result is what's expected: SELECT convert('Gregoire' USING utf8_to_iso_8859_15); "Gregoire" But I don't understand the next result, when I put an acute accent over the first "e": SELECT convert('Grégoire' USING utf8_to_iso_8859_15); "" (The output is an empty string.) Likewise, whenever I enter a string containing non-ASCII characters, the convert function outputs an empty string. Same results when I change the conversion type from UTF8 to any other encoding which accepts those non-ASCII characters... (When I try a conversion to an encoding that doesn't accept the characters, I get an error message, and that's normal.) My setup is as follows: PostgreSQL 8.2.1 on WindowsXP The database has UTF8 encoding. SHOW lc_ctype; gives: "French_Canada.1252" Is my problem related to Windows' lack of UTF8 support? I thought those problems were solved with version 8.2...
Re: [GENERAL] tsearch2, locale, UTF-8 and Windows
Just to pinpoint the meaning of my dismay, let me add one comment to my previous post. In the What'sNew document for tsearch2 with 8.2 http://www.sai.msu.su/~megera/wiki/Tsearch2WhatsNew we read: Don't forget to initdb cluster with correct utf8-locale ! initdb -D /usr/local/pgsql-dev/data.el_utf8 --locale=el_GR.utf8 I have never seen any detail of what was meant by the "correct" locale. Clearly if I was using a database with French content on a French system, I would intuitively choose fr_FR.utf8 as the locale, but if my database contains texts in several languages (see quoted post below), I don't know on what factor to base my choice of initdb locale. Suggestions? On Jan 28, 3:28 am, [EMAIL PROTECTED] ("Pierre Thibaudeau") wrote: [...] I cannot figure out what the latest status is concerning the "default locale" on a Windows UTF-8 database under PostgreSQL 8.2. [...] I have a UTF-8 database containing information in five different European languages (English, French, Spanish, German and Italian). I am coding on a Windows system with locale French_Canada.1252. The server that will soon run the database will likely have locale en_US...
[GENERAL] tsearch2, locale, UTF-8 and Windows
Over the past few days, I have been reading everything I could about tsearch2, but I cannot figure out what the latest status is concerning the "default locale" on a Windows UTF-8 database under PostgreSQL 8.2. More specifically, I have a UTF-8 database containing information in five different European languages (English, French, Spanish, German and Italian). I am coding on a Windows system with locale French_Canada.1252. The server that will soon run the database will likely have locale en_US... I am at a loss concerning the locale with which I should initdb on my system!!! What am I getting myself into? A lot of what I read on this matter was pretty bleak; has version 8.2 cleared all that? What is the latest word on this?