[HACKERS] tsearch in core patch

2007-06-21 Thread Teodor Sigaev
nd CREATE/ALTER/DROP DICTIONARY TEMPLATE Which way do we choose? or I miss some variant? I would like to go by 3) way... Comments? -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev

Re: [HACKERS] How does the tsearch configuration get selected?

2007-06-15 Thread Teodor Sigaev
To support this sanely though wouldn't you need to know which language rule a tsvector was generated with? Like, have a byte in the tsvector tagging it with the language rule forever more? No. As corner case, dictionary might return just a number or a hash value. What I'm wondering about is

Re: [HACKERS] Rethinking user-defined-typmod before it's too late

2007-06-15 Thread Teodor Sigaev
rth agree it. I'm inclined to make the code in parse_type.c take either integer And modify ArrayGetTypmods() to ArrayGetIntegerTypmods() Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http:

Re: [HACKERS] Rethinking user-defined-typmod before it's too late

2007-06-15 Thread Teodor Sigaev
Is it worth providing an ArrayGetStringTypmods in core, when it won't be used by any existing core datatypes? I don't think so - cstring[] is a set of strings itself. I don't believe that we could suggest something commonly useful without some real-world examples. -

Re: [HACKERS] How does the tsearch configuration get selected?

2007-06-15 Thread Teodor Sigaev
So, added to my plan (http://archives.postgresql.org/pgsql-hackers/2007-06/msg00618.php) n) single encoded files. That will touch snowball, ispell, synonym, thesaurus and simple dictionaries n+1) use encoding names instead of locale's names in configuration Tom Lane wrote: Teodor S

Re: [HACKERS] How does the tsearch configuration get selected?

2007-06-15 Thread Teodor Sigaev
think this might be wrong? No. I believe that pgsql doesn't support encoding that can not be recoded from UTF8, at least for non-hieroglyph languages. -- Teodor Sigaev E-mail: [EMAIL PROTECTED]

Re: [HACKERS] How does the tsearch configuration get selected?

2007-06-15 Thread Teodor Sigaev
r and stop-file should be used. For the Russian language with UTF8 encoding it should use the English stemmer for lword, but for the Italian language, the Italian stemmer. ASCII characters can't be present in a Russian word, but an Italian word might contain only ASCII. -- Teodor Sigaev

Re: [HACKERS] How does the tsearch configuration get selected?

2007-06-15 Thread Teodor Sigaev
self, the same goes for compatibility between two dictionaries or lists of dictionaries. In practice, we didn't see any disasters after changes in configuration - until reindexing, search just becomes less precise. -- Teodor Sigaev

Re: [HACKERS] How does the tsearch configuration get selected?

2007-06-15 Thread Teodor Sigaev
st WHERE to_tsvector(mytextcol) @@ query. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ ---(end of broadcast)--- TIP 1: if posting/reading

Re: [HACKERS] How does the tsearch configuration get selected?

2007-06-15 Thread Teodor Sigaev
. But it's an incompatible change and causes some difficulties for the DBA. If the server locale is ISO (or KOI8 or any other) and the file is in UTF8, then text editors/tools might be confused. -- Teodor Sigaev E-mail: [EMAIL

Re: [HACKERS] How does the tsearch configuration get selected?

2007-06-15 Thread Teodor Sigaev
I'd suggest allowing either full names ("swedish") or the standard two-letter abbreviations ("sv"). But let's stay away from locale names. We can use database's encoding name (the same names used in initdb -E) -- Teodor Sigaev

Re: [HACKERS] How does the tsearch configuration get selected?

2007-06-15 Thread Teodor Sigaev
. Where will index store index creation GUC? So we create a pg_catalog full text configuration named UTF8.en-US, and some others like ru_RU.UTF-8. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW:

Re: [HACKERS] Tsearch vs Snowball, or what's a source file?

2007-06-15 Thread Teodor Sigaev
DICTIONARY TEMPLATE Which way do we choose? or I miss some variant? I would like to go by 3) way... Comments? -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] Tsearch vs Snowball, or what's a source file?

2007-06-15 Thread Teodor Sigaev
danish, dutch, finnish, french, german, hungarian, italian, norwegian, portuguese, spanish, swedish, russian and english Albe Laurenz wrote: Tom Lane wrote: Teodor Sigaev <[EMAIL PROTECTED]> writes: So, it's needed to change dictinitoption format of snowball dictionaries to poin

Re: [HACKERS] How does the tsearch configuration get selected?

2007-06-15 Thread Teodor Sigaev
) simplifies life a lot in most cases. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] Tsearch vs Snowball, or what's a source file?

2007-06-15 Thread Teodor Sigaev
h completing the dictionary support functions to go with this infrastructure. How will we synchronize our changes in patch? -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev
It should be. Instances of ispell (and synonym, thesaurus) dictionaries are different only in dict_initoption part, so it will be only one entry in pg_ts_dict_template and several ones in pg_ts_dict. No, I was thinking of still having just one pg_ts_dict catalog (no template) but removing its d

Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev
french, german, hungarian, italian, norwegian, portuguese, spanish, swedish, russian and english languages. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev
But, there are reasons to allow users register new templates and in fact we know people/projects with application-dependent dictionaries. How they could dump/reload their dictionaries ? The same way as pg_am does. -- Teodor Sigaev E-mail: [EMAIL PROTECTED

Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev
Tom Lane wrote: Teodor Sigaev <[EMAIL PROTECTED]> writes: The reason to save SQLish interface to dictionaries is a simplicity of configuration. Snowball's stemmers are useful as is, but ispell dictionary requires some configuration action before using. Yeah. I had been wond

Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev
dable or cause backend crash. But usage of such tsvector column could be limited - not all words will be searchable. This sounds sort of analogous to the issues collation bring up. -- Teodor Sigaev E-mail: [

Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev
can we hard-code into the backend, and just update for every major release like we do for encodings? Sorry, none of them :(. We know projects which introduce a new parser or a new dictionary. Config and map are changed very often. -- Teodor Sigaev E-mail: [EMAIL

Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev
complex task: experienced users could use several configurations simultaneously. For example: indexing uses a configuration which doesn't reject stop-words, but default searching uses a configuration which rejects stop-words. BTW, the same effect may be produced by dictionary

Re: [HACKERS] Tsearch vs Snowball, or what's a source file?

2007-06-09 Thread Teodor Sigaev
urce tree (in src/snowball) only three directories for snowball_code.tgz: - /compiler - compiler from *.sbl to *.c - /runtime - common code for all stemmers - /algorithms - *.sbl files and use pgsql's makefile infrastructure to compile the stemmers. Comments, objections? -- Teodor Sigaev

Re: [HACKERS] GiST intarray rd-tree indexes using intbig

2007-06-09 Thread Teodor Sigaev
I access the original intarray that is being referenced by this signature? Index doesn't store original int[] at all. From a GiST support function there is no way to get access to the table's value :(. -- Teodor Sigaev

Re: [HACKERS] GIN, XLogInsert and MarkBufferDirty

2007-06-05 Thread Teodor Sigaev
with XLogInsert. That's not safe, MarkBufferDirty needs to be called before XLogInsert to avoid a race condition in checkpoint, see comments in SyncOneBuffer in bufmgr.c for an explanation. Ugh, thank you, fixed. It's a trace of a misunderstanding of WriteBuffer(). -- Teo

Re: [HACKERS] Tsearch vs Snowball, or what's a source file?

2007-06-04 Thread Teodor Sigaev
p://snowball.tartarus.org/dist/snowball_code.tgz tarball. 2 Snowball's compiling infrastructure doesn't support Windows target. I agree with simplifying the support process but, IMHO, it's much simpler to do it with C sources with pgsql's building infrastructure And where should it be

Re: [HACKERS] [pgsql-advocacy] Upcoming events

2007-06-04 Thread Teodor Sigaev
/2824.html -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] Patch queue triage

2007-05-02 Thread Teodor Sigaev
http://www.sigaev.ru/misc/tsearch_core-0.46.gz Patch is synced with current CVS HEAD and synced with bugfixes in contrib/tsearch2 -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru

[HACKERS] tsearch_core patch for inclusion

2007-03-29 Thread Teodor Sigaev
nd words ( German, Norwegian ). So, please, test it - we don't know those languages at all. 2 added recent fixes of contrib/tsearch2 3 fix usage of fopen/fclose -- Teodor Sigaev E-mail: [EMAIL PROTECTED]

Re: [HACKERS] Warning on contrib/tsearch2

2007-03-29 Thread Teodor Sigaev
on in tsearch_core patch I find the direct use of malloc/realloc/strdup to be poor style as well --- backend code that is not using palloc needs to have *very* good reason to do so, and I see none here. Already in tsearch_core patch. -- Teodor Sigaev E-mail: [

Re: [HACKERS] tsearch2 regression test failures

2007-03-26 Thread Teodor Sigaev
r more than two bytes... It doesn't matter significantly - the file is read once per backend lifetime. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] tsearch2 regression test failures

2007-03-26 Thread Teodor Sigaev
FWIW, it looks like it failed to reject stopwords. Is it possible you Right. I suppose the problem is with '\r\n'... Try attached patch. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www

[HACKERS] tsearch_core for inclusion

2007-03-23 Thread Teodor Sigaev
y if it's worth to make user-created fts configuration to be visible prior to system configurations in pg_catalog, if pg_catalog was not *explicitly* specified in the search_path ? -- Teodor Sigaev E-mail: [EMAIL PROTECTED]

[HACKERS] alter operator class

2007-03-19 Thread Teodor Sigaev
t I'm afraid that will be often for fulltext configurations. How can we avoid such situations? Forbid changes on built-in objects? 'alter operator class .. owner to ...' doesn't dump too. -- Teodor Sigaev E-mail: [EMAIL PROTECT

Re: [HACKERS] Indexam interface proposal

2007-03-19 Thread Teodor Sigaev
. (In other words, it's an optimisation rather than a big change). I like your suggestion - it's useful for GiST/GIN fulltext indexing. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www

Re: [HACKERS] Indexam interface proposal

2007-03-19 Thread Teodor Sigaev
mn on which index was created, so gistgettuple can not return tuple in original form - but sometimes gistgettuple may be sure that recheck isn't needed. That would completely replace the current RECHECK-option we have, right? Yeah, this is possible. -- Teo

Re: [HACKERS] Indexam interface proposal

2007-03-19 Thread Teodor Sigaev
stored in index as is, but large value is compressed with lossy techniques. So, GiST might return a tuple which is allowed to not recheck. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www

Re: [HACKERS] tsearch_core for inclusion

2007-03-16 Thread Teodor Sigaev
at? New opclass layout, new opfamily table - users don't notice those changes at all. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] tsearch_core for inclusion

2007-03-16 Thread Teodor Sigaev
l API. Putting tsearch in core removes a lot of such problems. For example, who notices changes in the pg_am table from release to release? Really it was developers/hackers, not ordinary users -- Teodor Sigaev E-mail: [EMAIL PROT

Re: [HACKERS] tsearch_core for inclusion

2007-03-16 Thread Teodor Sigaev
SING FULLTEXT (textcolumn) -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

[HACKERS] tsearch_core for inclusion

2007-03-15 Thread Teodor Sigaev
OP FULLTEXT MAPPING ON pg FOR email, url, sfloat, uri, float; end; Patch is http://www.sigaev.ru/misc/tsearch_core-0.38.1.gz A comparison of that syntax with current tsearch2 is placed at http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-syntax-compare.html So, which is

Re: [HACKERS] need help in understanding gist function

2007-03-14 Thread Teodor Sigaev
left and right buffers? Real writes are performed by the bgwriter process; in the backend we should just mark the buffer as dirty with the help of a MarkBufferDirty call. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http:

Re: [HACKERS] My honours project - databases using dynamically attached entity-properties

2007-03-13 Thread Teodor Sigaev
e you searched our list archives? http://archives.postgresql.org -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] GIST and TOAST

2007-03-07 Thread Teodor Sigaev
I've already started, don't worry about that. Cube has been broken since TOAST was implemented :) Gregory Stark wrote: "Teodor Sigaev" <[EMAIL PROTECTED]> writes: input value. As I remember, only R-Tree emulation over boxes, contrib/seg and contrib/cube have simple compress

Re: [HACKERS] GIST and TOAST

2007-03-06 Thread Teodor Sigaev
r has been toasted. Should I commit it or you'll include in your patch? -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ *** ./contrib/intarray.orig/./_int_gist.c Tue Mar

Re: [HACKERS] GIST and TOAST

2007-03-06 Thread Teodor Sigaev
is three similar functions/macros: gistentryinit, gistcentryinit and gistdentryinit :) Those names were chosen by the authors who initially developed GiST in pgsql. Well if you're doing everything in short-lived memory contexts then we don't even need this. Sure -- Teodor Sigaev

Re: [HACKERS] GIST and TOAST

2007-03-06 Thread Teodor Sigaev
tum came from originally and how it ended up stored in packed format. Can you provide your patch (in current state) and test suite? Or backtrace at least. -- Teodor Sigaev E-mail: [EMAIL PROTECTED]

Re: [HACKERS] GIST and TOAST

2007-03-06 Thread Teodor Sigaev
sing DatumGetPointer and PG_GETARG_POINTER and having to manually cast everywhere, no? It seems like there's a lot of extra pain to maintain the code in the present style with all the manual casts. Of course, I agree. Just PG_FREE_IF_COPY is extra call in support m

Re: [HACKERS] user-defined tree methods in GIST

2007-03-06 Thread Teodor Sigaev
level function. Try to play with SP-GiST (http://www.cs.purdue.edu/spgist/). SP-GiST is a modification of GiST for Space Partitioning Trees. But their patch will not work with 8.2 and up because of lack of concurrency. 8.2 doesn't support indexes without concurrency. -- Teo

Re: [HACKERS] GIST and TOAST

2007-03-06 Thread Teodor Sigaev
d PG_FREE_IF_COPY() GiST code works in separate memory context to prevent memory leaks. See gistinsert/gistbuildCallback/gistfindnext. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] GIST and TOAST

2007-03-05 Thread Teodor Sigaev
index should be decompressed by GiST decompress support method. Other places: - compress might get original value in case of inserting new one, in all other cases it gets value produced by decompress method. - query in consistent method -- Teodor Sigaev E

Re: [HACKERS] What is CheckPoint.undo needed for?

2007-02-22 Thread Teodor Sigaev
Oops, sorry, I missed checkpoint keyword Teodor Sigaev wrote: What am I missing? Seems, it's about that http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php -- Teodor Sigaev E-mail: [EMAIL PROT

Re: [HACKERS] What is CheckPoint.undo needed for?

2007-02-22 Thread Teodor Sigaev
What am I missing? Seems, it's about that http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.siga

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Teodor Sigaev
_fulltext_mapping(cfgname, '{lexemetypename[, ...]}'::text[], '{dictname1[, ...]}'::text[]); Seems rather ugly for me... And function interface does not provide autocompletion and online help in psql. \df says only types of arguments, not a meaning. -- Teodor Sigaev

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Teodor Sigaev
IT init_function ] [ OPT opt_text ]; CREATE FULLTEXT DICTIONARY dictname [ { LEXIZE lexize_function | INIT init_function | OPT opt_text } [...] ] LIKE template_dictname; -- Teodor Sigaev E-mail: [EMAIL PROTECTED]

Re: [HACKERS] tsearch2: enable non ascii stop words with C locale

2007-02-13 Thread Teodor Sigaev
Precise definition for "latin" in C locale please. Are you saying that single byte encoding with range 0-7f? is "latin"? If so, it seems they are exactly same as ASCII. p_islatin returns true for ASCII alpha characters. -- Teodor Sigaev

Re: [HACKERS] tsearch2: enable non ascii stop words with C locale

2007-02-13 Thread Teodor Sigaev
te, leave unchanged en_stem for any latin word. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] GiST Comparing IndexTuples/Datums

2007-02-12 Thread Teodor Sigaev
GiST implementation supports only unordered trees (btree_gist is some kind of emulation) and it cannot guarantee that equal keys will be close in index. That's related to picksplit and gistpenalty method problem/optimization and data set. -- Teodor Sigaev

Re: [HACKERS] tsearch2: enable non ascii stop words with C locale

2007-02-12 Thread Teodor Sigaev
with UTF8 take two bytes. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-09 Thread Teodor Sigaev
Implementing the "replace these TIDs" operation atomically would be simple, except for the new bitmap index am. It should be possible there That isn't simple (maybe not even possible) for GIN. -- Teodor Sigaev E-mail: [

[HACKERS] Proposal for partial resove issue of GIN fullscan.

2007-01-30 Thread Teodor Sigaev
earch2) always means empty result and fast working of GIN, so, tsearch2's users will not face an error 'GIN index does not support search with void query' Comments, objections, suggestions? -- Teodor Sigaev E-mail: [EMAIL PROTECTED]

Re: [HACKERS] Proposal: allow installation of any contrib module

2007-01-25 Thread Teodor Sigaev
This might be a good idea, but it's hardly transparent; it can be counted on to break the applications of just about everyone using those modules today. Hmm, can we make a separate schema for all contrib modules and include it in the default search_path? It will not affect most users. -- T

Re: [HACKERS] Proposal: allow installation of any contrib module

2007-01-25 Thread Teodor Sigaev
sible. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-25 Thread Teodor Sigaev
found texts. GIST performance may be decreased too - GIST indexing of tsvector is lossy. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-25 Thread Teodor Sigaev
ands. Next, functions have no autocomplete feature or built-in quick help - if you don't remember exactly the kind/type of a function's argument(s), then you have to read the docs. -- Teodor Sigaev E-mail: [EMAIL PROTECTED]

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-25 Thread Teodor Sigaev
L's MATCH() AGAINST (txt IN BOOLEAN MODE) 2 it requires the keywords MATCH & AGAINST, which cannot be a function's name without quoting. The internal API changes sometimes (not every release), but I don't see a problem here: all other internal API's in postgres are often ch

Re: [HACKERS] unused_oids?

2007-01-25 Thread Teodor Sigaev
8000 ) and before committing change them to the lowest possible. When HEAD is under heavy development, oids change quickly, so you will need to rearrange your oids for each snapshot. -- Teodor Sigaev E-mail: [EMAIL PROTECTED]

[HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Teodor Sigaev
similar way as character conversion library does. If there aren't objections then we plan to commit the patch tomorrow or the day after. Before committing, I'll change oids from 5000+ to lower values to prevent holes in oids. And after that, I'll remove tsearch2 cont

Re: [HACKERS] regular expressions stranges

2007-01-23 Thread Teodor Sigaev
nction may be defined as: static int pg_wc_isalpha(pg_wchar c) { if ( (c >= 0 && c <= UCHAR_MAX) ) return isalpha((unsigned char) c); #ifdef HAVE_WCSTOMBS else if ( GetDatabaseEncoding() == PG_UTF8 ) return iswalpha((wint_t) c); #endif return

Re: [HACKERS] 10 weeks to feature freeze (Pending Work)

2007-01-23 Thread Teodor Sigaev
I would like to suggest patches for OR-clause optimization and using index for searching NULLs. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

[HACKERS] regular expressions stranges

2007-01-23 Thread Teodor Sigaev
o tsearch2 locale part. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ set client_encoding='KOI8'; SELECT 'Ä' ~* '[[:alpha:]]' as "true"; SEL

Re: [HACKERS] Design notes for EquivalenceClasses

2007-01-18 Thread Teodor Sigaev
using Append node - without reintroducing multi-pass indexscan. Moreover, it allows to sort OR clauses to match sort order. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.siga

Re: [HACKERS] Design notes for EquivalenceClasses

2007-01-18 Thread Teodor Sigaev
Note that a bitmap scan or multi-pass indexscan (OR clause scan) has NIL pathkeys since we can say nothing about the overall order of its result. It seems to me that multi-pass indexscan was removed after introducing bitmapscan. -- Teodor Sigaev E

Re: [HACKERS] Request for review: tsearch2 patch

2007-01-12 Thread Teodor Sigaev
u to test under Windows? Thank you. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ diff -c -r -N ../tsearch2.orig/ts_locale.c ./ts_locale.c *** ../tsearch2.orig/ts_locale.c Fri Jan 1

Re: [HACKERS] Request for review: tsearch2 patch

2007-01-10 Thread Teodor Sigaev
TParser *prs) { ... ! if (lc_ctype_is_c()) ! { ! if (c > 0x7f) ! return 1; I have some doubts that any character greater than 0x7f is an alpha symbol. Is it a simple assumption or a workaround? -- Teodor Sigaev

Re: [HACKERS] [PATCHES] Bundle of patches

2007-01-10 Thread Teodor Sigaev
Nice, thanks a lot. Tom Lane wrote: Teodor Sigaev <[EMAIL PROTECTED]> writes: Just a freshing for clean applying.. http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz Applied with some revisions, and pg_dump support and regression tests added. regards, to

Re: [HACKERS] Request for review: tsearch2 patch

2007-01-10 Thread Teodor Sigaev
Sorry for the delay, I was on holidays :) Did you test the patch on the Windows platform? Tatsuo Ishii wrote: I have tested with locale-enabled environment and found a bug. Included is the new version of patches. Teodor, Oleg, what do you think about these patches? If ok, shall I commit to CVS head

Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-29 Thread Teodor Sigaev
This is not responding to my concern. What you presented was an > Sorry, I see your point now. Is that test enough? Or I should make more? -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev

Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-29 Thread Teodor Sigaev
Just a refresh for clean applying.. http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz Are there any objections to committing? -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru

Re: [HACKERS] tsearch in core patch, for review

2006-12-21 Thread Teodor Sigaev
patch: http://www.sigaev.ru/misc/tsearch_core-0.27.gz http://www.sigaev.ru/misc/tsearch_core-0.28.gz new version, because of XML commit - old patch doesn't apply cleanly. -- Teodor Sigaev E-mail: [EMAIL PROT

Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-21 Thread Teodor Sigaev
0.9 doesn't apply cleanly after Peter's changes, so, new version http://www.sigaev.ru/misc/user_defined_typmod-0.10.gz Teodor Sigaev wrote: >> Perhaps an array of int4 would be better? How much Done http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz The patch needs mor

[HACKERS] tsearch in core patch, for review

2006-12-20 Thread Teodor Sigaev
TTERN] list fulltext dictionaries (add "+" for more detail) \dFp [PATTERN] list fulltext parsers (add "+" for more detail) -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-20 Thread Teodor Sigaev
>> Perhaps an array of int4 would be better? How much Done http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz The patch needs more cleanup before applying, too, eg make comments match code, get rid of unused keywords added to gram.y. Cleaned. -- Teodor

Re: [HACKERS] pg_am.amstrategies should be 0 when not meaningful?

2006-12-18 Thread Teodor Sigaev
gards, tom lane -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] Operator class group proposal

2006-12-14 Thread Teodor Sigaev
different storage types. > I don't have any idea whether opclass groups would be useful for GiST or > GIN indexes, but maybe Oleg and Teodor can think of applications. I'm afraid it isn't useful for GiST/GIN - strategies are not defined at all and planner can't known

Re: [HACKERS] Proposal: syntax of operation with tsearch's configuration

2006-11-17 Thread Teodor Sigaev
Hmm, IMHO, it's needed for consistent interface: nobody adds new column to table by editing pg_class & pg_attribute, nobody looks for description of table by selection values from system table. Tom Lane wrote: Teodor Sigaev <[EMAIL PROTECTED]> writes: Now we (Oleg and me)

[HACKERS] Proposal: syntax of operation with tsearch's configuration

2006-11-17 Thread Teodor Sigaev
dF PATTERN - describe configuration with used parser and lexeme's mapping \dFd - list of dictionaries \dFd PATTERN - describe dictionary \dFp - parser's list \dFp PATTERN - describe parser -- Teodor Sigaev E-ma

Re: [HACKERS] string_to_array eats too much memory?

2006-11-08 Thread Teodor Sigaev
Limitations Sorry for noise - it's mentioned in README.tsearch2 -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] string_to_array eats too much memory?

2006-11-08 Thread Teodor Sigaev
tor; tsvector -- 'wow:' (1 row) ':' is separator of lexeme and its position information -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] string_to_array eats too much memory?

2006-11-08 Thread Teodor Sigaev
is limit will be replaced by 16383. 13.4.2 Only 256 positional info per lexeme. Some useful articles http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/HOWTO-parser-tsearch2.html http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/custom-dic

Re: [HACKERS] string_to_array eats too much memory?

2006-11-08 Thread Teodor Sigaev
k words in a document? I don't see any problem. tsvector size should not be greater than 1Mb however. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] string_to_array eats too much memory?

2006-11-08 Thread Teodor Sigaev
for speedup searches, linguistic part is still in tsearch2. It's possible to use tsearch2 without any indexes at all. GiST and GIN are a way to speed up searches. Of course, you can develop another framework for full text search and framework may use GIN as it wish

Re: [HACKERS] Tsearch Index Size and GiST vs. GIN

2006-11-08 Thread Teodor Sigaev
nd of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Teodor Sigaev E-mail: [

Re: [HACKERS] string_to_array eats too much memory?

2006-11-08 Thread Teodor Sigaev
g_to_array seems to eat several Gig bytes of memory. ~70k array elements means there are same number of words in a document which is not too big in a large text IMO. Do you mean 70k unique lexemes? Ugh. Why don't you use the tsearch framework? -- Teodor Sigaev E-ma

Re: [HACKERS] Index ignored with "is not distinct from", 8.2 beta2

2006-11-08 Thread Teodor Sigaev
There's been work on it. Theodor cleaned it up for HEAD and looked at adding GiST support. I believe he's waiting for 8.2 to release. Yep, I have bundle of patches and I'm waiting for 8.2 branch split out of HEAD. -- Teodor Sigaev E-mail: [

Re: [HACKERS] [GENERAL] Index greater than 8k

2006-11-01 Thread Teodor Sigaev
es: Search time in GIN depends only weakly on the number of words (as opposed to tsearch2/GiST), but insertion of a row may be rather slow -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] [GENERAL] Index greater than 8k

2006-10-31 Thread Teodor Sigaev
The problem as I remember it is pg_trgm not tsearch2 directly, I've sent a self contained test case directly to Teodor which shows the error. 'ERROR: index row requires 8792 bytes, maximum size is 8191' Uh, I see. But I'm really surprised why do you use pg_trgm on b

Re: [HACKERS] [GENERAL] Index greater than 8k

2006-10-30 Thread Teodor Sigaev
om build? Can you send exact error message? -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/

Re: [HACKERS] Compiling with GIST

2006-10-26 Thread Teodor Sigaev
'psql DB < btree_gist.sql'. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/
