from:"Janek Sendrowski"

Re: [GENERAL] How to delete completely duplicate rows

2014-01-02 Thread Janek Sendrowski

Thanks!
It's working
 
Janek


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] How to delete completely duplicate rows

2014-01-01 Thread Janek Sendrowski

Hi,
 
I want to delete duplicates in my table. I've dropped the unique constraint to 
insert my data.
My id value is a hash calculated witch the values of the two other columns.
So I want to delete all columns, which are indentical, but keeping one.
 
DELETE FROM table t1 USING table t2 WHERE t1.id = t2.id AND t1.ctid > t2.ctid

But the oids aren't unique enough.
What else could I do?

Janek


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] cast hex to int in plpgsql

2013-12-31 Thread Janek Sendrowski

Thanks!
That's what I'm searching for.
 
Janek
 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] cast hex to int in plpgsql

2013-12-31 Thread Janek Sendrowski

Hi,

How can I realize the line?
v_res := cast(x'v_tmp' as bigint);

v_tmp is a text variable with hex digits.

this works:
v_res := cast(x'6de14a8b478ac' as bigint);

I think its about quoting

Janek


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] bulk insert unique contraint

2013-12-30 Thread Janek Sendrowski

Hi Salah Jubeh,
 
I'm not in hurry ;)
Thanks for your answer. Thats what I've been searching for.
 
Janek Sendrwoski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] bulk insert unique contraint

2013-12-30 Thread Janek Sendrowski

Hi Salah Jubeh,
 
My files don't have this format. Could it make sense to create such a file with 
php and import the data from it?
Or ist also possible the get certain rows from the file with regexp?
 
Janek


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] bulk insert unique contraint

2013-12-30 Thread Janek Sendrowski

Hi,
 
I want to insert data in my table and I want to insert the rows, which don't 
violates the unique contraint of my id. I'm using a 64bit hash for my it.
If I do one insert statement, which inserts many rows it doesn't do anything if 
one row violates the unique contraint.
Is there a faster way than using multiple insert statements?
 
Janek


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [PERFORM] Similarity search with the tsearch2 extension

2013-12-06 Thread Janek Sendrowski

Sorry, I used AND-statements instead of OR-statement in the example.
I notices that gin is much faster than gist, but I don't know why.

The query gets slow, because there are many non-stop words which appear very 
often in my sentences, like in 3% of all the sentences.
Do you think it could be worth it to filter the words, which appears that often 
and declare them as stop-words.
How would you split a sentence with let's say 10 non stop words to provide a 
performed similarity search?
 
There's still the problem with very short sentences. An partiel index on them 
with the trigram search might be the solution.
The pg_trgm module is far to slow for bigger setences, like you showed.
 
I thought I'll build a few partiel indexes on the string length, to enhance the 
performance.
Do you know some more improvements?
 
Janek Sendrowki


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Similarity search for sentences

2013-12-06 Thread Janek Sendrowski

Hi,
thanks for your Answers.
 
@Rémi Cura
You suggest a kind of Full Text Search.  I already had a try with the tsearch2 
extension.
The issue is to realize the similarity search. I have to use many OR statements 
with a low set of arguments.
That significantly slows the FTS down.
 
@Kevin Grittner
I used my own trigger to store the tsvector of the sentences and I created a 
usual gist Index on them.
What kind of functional Index would you suggest. Like i already told Rémi, I 
have to to use many OR statements with a low set of arguments, which heavy 
damages the perfance.
Do you have a better idea?
I usually used a query like this:
 
The tiger is the largest cat species[http://en.wikipedia.org/wiki/Felidae], 
reaching a total body length of up to 3.3 m  and weighing up to 306 kg.
--
totsvector:
'3.3':16 '306':22 'bodi':11 'cat':6 'kg':23 'largest':5 'length':12 'm':17 
'reach':8 'speci':7 'tiger':2 'total':10 'weigh':19
(1 row)
 
SELECT * FROM tablename WHERE vector @@ to_tsquery('speci & tiger & total & 
weigh') AND vector @@ to_tsquery('largest & length & m & reach')  AND vector @@ 
to_tsquery('3.3 & 306 & bodi & cat & kg');

And thats very slow
 
I didn't know that the pg_trgm Module provides KNN search.
 
Janek Sendrowski
 
 
 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Similarity search for sentences

2013-12-05 Thread Janek Sendrowski

Hi,
 
I have tables with millions of sentences. Each row contains a sentence. It is 
natural language and every language is possible, but the sentences of one table 
have the same language.
I have to do a similarity search on them. It has to be very fast, because I 
have to search for a few hundert sentences many times.
The search shouldn't be context-based. It should just get sentences with 
similar words(maybe stemmed).
 
I already had a try with gist/gin-index-based trigramm search (pg_trgm 
extension), fulltextsearch (tsearch2 extension) and a pivot-based indexing 
(Fixed Query Array), but it's all to slow or not suitable.
Soundex and Metaphone aren't suitable, as well.
 
I'm already working on this project since a long time, but without any success.
Do any of you have an idea?
 
I would be very thankful for help.
 
Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Searching for suiting index

2013-12-01 Thread Janek Sendrowski

Hi,
 
I store around 300 bits the row in my table. Every bit is a boolean values on 
its own.
I want to do a similarity search by comparing each bit and get every row which 
has a certain number of equal bits.
What kind of index could be useful for scenario?
Could it make sense to calculate a kind of hashvalue for each row to filter the 
rows dirty at first.
 
Thank you for your help
Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Documentation of C functions

2013-11-27 Thread Janek Sendrowski

Hi,
Is there a documentation of postgresql's C functions like SET_VARSIZE for 
exmaple?
 
Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Debugging of C functions

2013-11-25 Thread Janek Sendrowski

Hi,
 
How is it possible to debug user defined functions written in C.
I can't just include all the files and run it.
 
Does anyone have an idea?
 
Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] /usr/bin/ld: cannot find [...] when compiling

2013-11-24 Thread Janek Sendrowski

Now it's working.
My Makefile:

OBJS = test.o
MODULES = test
DATA = test.c

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

Is it possible to execute the code just like this?
I like to do that for Debugging.
I can't execute the .so files, of course and If I just do "gcc filename", I 
have the issue with the includes.

Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] /usr/bin/ld: cannot find [...] when compiling

2013-11-24 Thread Janek Sendrowski

The file test.c just inlcudes: #include "postgres.h".

root@ubuntu:/usr/include/postgresql/9.3/server# gcc test.c
In file included from postgres.h:48:0,
 from test.c:1:
utils/elog.h:69:28: fatal error: utils/errcodes.h: No such file or directory
compilation terminated.

If elog.h file only includes "errcodes.h" without /utils, it's working.
It's the same With all files I'm including.

Makefile: 
PROGRAM = test
DATA = test.c

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

root@ubuntu:/usr/include/postgresql/9.3/server# make
gcc -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat 
-Wformat-security -Werror=format-security -fPIC -pie -I/usr/include/mit-krb5 
-DLINUX_OOM_ADJ=0 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g   
-L/usr/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now -Wl,--as-needed 
-L/usr/lib/mit-krb5 -L/usr/lib/i386-linux-gnu/mit-krb5 
-L/usr/lib/i386-linux-gnu  -Wl,--as-needed  -lpgport -lpgcommon -lxslt -lxml2 
-lpam -lssl -lcrypto -lkrb5 -lcom_err -lgssapi_krb5 -lz -ledit -lcrypt -ldl -lm 
 -o test
/usr/bin/ld: cannot find -lxslt
/usr/bin/ld: cannot find -lxml2
/usr/bin/ld: cannot find -lpam
/usr/bin/ld: cannot find -ledit
collect2: ld returned 1 exit status
make: *** [test] Error 1

I don't know where to get these devel-packages.
Could I just reinstall the postgresql-server-dev-9.3 package?

Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] /usr/bin/ld: cannot find [...] when compiling

2013-11-24 Thread Janek Sendrowski

Hi,

Thanks for your answer.
Now the link editor is working, but I still have one problem.
The files in /utils for example include there files with #include 
"utils/filename" and it doesn't work, because the file which includes them is 
already in the directory /utils.
It's the same with all directories...
Sorry, if this question is to silly, but what do I wrong?

Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] /usr/bin/ld: cannot find [...] when compiling

2013-11-24 Thread Janek Sendrowski



Hi,

 

My Makefile looks like this:


PROGRAM = test
DATA = "">
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

 


The file test.c only includes the postgres.h (#include "postgres.h"), but I get this error, when compiling:

/usr/bin/ld: cannot find -lxslt
/usr/bin/ld: cannot find -lxml2
/usr/bin/ld: cannot find -lpam
/usr/bin/ld: cannot find -ledit
collect2: ld returned 1 exit status
make: *** [test] Error 1

 

Does anyone have an idea?

 

Janek Sendrowski

Re: [GENERAL] include all the postgres libraries (C)

2013-11-22 Thread Janek Sendrowski


I think PGXS is, what I've been looking for.

 

My Makefile looks like this:


 PROGRAM = test
DATA = "">
 PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

 


The file test.c only includes the postgres.h, but I get this error, when compiling:

/usr/bin/ld: cannot find -lxslt
/usr/bin/ld: cannot find -lxml2
/usr/bin/ld: cannot find -lpam
/usr/bin/ld: cannot find -ledit
collect2: ld returned 1 exit status
make: *** [test] Error 1

 

Does anyone have an idea?

 

Janek Sendrowski

[GENERAL] include all the postgres libraries (C)

2013-11-22 Thread Janek Sendrowski

Hi,
 
I like to try some things with C and I need certain postgres libraries for it.
This time, I'm including postgres.h for example, but postgres.h doesn't include 
it's files.
How do I include the whole tree, or is there a certain directory I can use.
I just want to compile and run the files for first.
 
Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Regex files are missing

2013-11-19 Thread Janek Sendrowski

Thanks for your help!
I updated to version 9.3 and now it's working.

Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Regex files are missing

2013-11-18 Thread Janek Sendrowski


Hi,

 

My current version is 9.2. I could just update it.

I got the pg_trgm from here:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=tree;f=contrib/pg_trgm;hb=refs/heads/master

 

And the regex files from here:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=tree;f=src/include/regex;h=f65341c80ea8516a1e02a8b57d9c27d3034eaba4;hb=refs/heads/master

 

Where could I get the matching version of pg_trgm?

 

Janek Sendrowksi

[GENERAL] Regex files are missing

2013-11-18 Thread Janek Sendrowski

Hi,
 
Have some issues to built the pg_trgm module from source.
For first the regexport.h file was missing in /usr/include, so I got it.
Now I still need the the regexport.c file and probably also the other one
You can see the files in this link:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=tree;f=src/backend/regex;h=a30f70549052ee6828616d933d72d0ea84922ba7;hb=refs/heads/master

But I don't know where to put them. I don't have a usr/backend directory in 
ubuntu.
Anyways it's very weird. Why don't I have these files?
Do I have to install something for example.
 
Hope there is someone who can help me
 
Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Install pg_trgm from source

2013-11-15 Thread Janek Sendrowski

Thanks a lot for your Answers.
 
That's what I was looking for.
I couln't find it.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Install pg_trgm from source

2013-11-15 Thread Janek Sendrowski

Hi,

 

I like to change the source code of the pg_trgm extension a little bit.

Where can I get the source code and how do I compile it?

 

I can only find the SQL-files of pg_trgm in my directories.

 

Janek Sendowksi

Re: [GENERAL] pg_trgm module: no convertion into Trigrams on one side when comparing

2013-11-15 Thread Janek Sendrowski

Hi Amit,
Do you think it would by difficult to edit the source and wirte a new function, 
wich does it?
 
Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pg_trgm module: no convertion into Trigrams on one side when comparing

2013-11-14 Thread Janek Sendrowski

Hi Amit,
 
Thanks for your answer.

My issue is that I still need a ranking like the similarity when comparing 
trigrams.
I'm working on a similarity search, which determindes similiar sentences from 
my table.

The usual similarity search with Trigrams and a gist bzw. gin index is to slow 
with my set of rows.

So I've built an Fixed Query Array. I took some sentences as pivots and 
calculated the distance between them and all my sentences.
Now I can search for similiar sentences by calculating their distance to my 
pivots and comparing them with the distances I already got.
I noticed that it's a good way to use trigrams as metric distance function.
 
So the issue is that I have to take certain trigrams as pivots and of cource I 
need to store an exact ranking when comparing.
Your method detect the rows which includes all of the trigrams you are 
searching for, but I still need a ranking...

something like SELECT similarity('How are your', "{" ho","how","are","re "," 
hi","wha","hat","at ","  a"}"::trgm);
 
I hope you have an idea
 
Janek Sendrowski
 
 
 
 
 
 
 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] pg_trgm module: no convertion into Trigrams on one side when comparing

2013-11-13 Thread Janek Sendrowski

Hi,
I'm using the pg_trgm module,
Is there a possibility not to convert one side of the string when comparing?
I need a kind of reference string containing only certain trigrams.
It's not possible to build every kind of string using the function.

example:
Code: Alles auswählen
SELECT show_trgm('abc');
show_trgm
-
{" a"," ab",abc,"bc "}
(1 row)

But I like to have only {'abc'} for example.

I would use this function: "similarity(text, text)"
Could it somehow work with explicit data-type definitions or maybe with putting 
the string in brackets or quoting?
Or do I have to change the source code?

Thanks for your help

Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Suitable Index for my Table

2013-11-04 Thread Janek Sendrowski

Thanks for your Answers!
 
@Martijn van Oosterhout
 
I am always searching for single colums. The  values are different every time, 
because these are distances from my new sentence to my pivots.
Does the Index support a query with this WHERE statement: WHERE value BETWEEN 
(distance1 - radius) AND (distance1 + radius)?
 
@Kevin Grittner
okay, I will use arrays instead of multiple columns.
A working query looks like this:
SELECT id FROM distance WHERE
value BETWEEN (distance1 - radius) AND (distance1 + radius) AND
value BETWEEN (distance2 - radius) AND (distance2 + radius) AND
value BETWEEN (distance3 - radius) AND (distance3 + radius) AND
value BETWEEN (distance4 - radius) AND (distance4 + radius) AND
value BETWEEN (distance5 - radius) AND (distance5 + radius) AND
value BETWEEN (distance6 - radius) AND (distance6 + radius) AND
value BETWEEN (distance7 - radius) AND (distance7 + radius) AND
value BETWEEN (distance8 - radius) AND (distance8 + radius) AND
value BETWEEN (distance9 - radius) AND (distance9 + radius) AND
value BETWEEN (distance10 - radius) AND (distance10 + radius) AND
value BETWEEN (distance11 - radius) AND (distance11 + radius) AND
value BETWEEN (distance12 - radius) AND (distance12 + radius) AND
value BETWEEN (distance13 - radius) AND (distance13 + radius) AND
value BETWEEN (distance14 - radius) AND (distance14 + radius) AND
value BETWEEN (distance15 - radius) AND (distance15 + radius) AND
value BETWEEN (distance16 - radius) AND (distance16 + radius) AND
value BETWEEN (distance17 - radius) AND (distance17 + radius) AND
value BETWEEN (distance18 - radius) AND (distance18 + radius) AND
value BETWEEN (distance19 - radius) AND (distance19 + radius) AND
value BETWEEN (distance20 - radius) AND (distance20 + radius) AND
value BETWEEN (distance21 - radius) AND (distance22 + radius) AND
value BETWEEN (distance22 - radius) AND (distance23 + radius) AND
value BETWEEN (distance23 - radius) AND (distance24 + radius);
 
I'm not sure about the number of pivots yet. It can get bigger,
but this is a single query to determine the fitting sentences.
The query just returns the ID of the colum which is a foreign key to a sentence 
in another table.
The table distance contains the metric distance of every pivot to all of my 
sentences in the other table.
I haven't found yet the right pivots, but the distance should be between 0 and 
1 in most cases, but it's also likely that it's 0.
Until now It just does a Seq Scan, when I'm searching through the table 
'distances'. I can show your the Query Plan, if you want.
The number of rows which are resulting have a range of 0 until something like 
100 for the begining.
 
Hope these were enough informations
 
Thank you for your help
Janek Sendrowki


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Suitable Index for my Table

2013-11-04 Thread Janek Sendrowski

Hi,

I've got a table with many Values of the Type REAL.
These are my metric distances or my pivots to my sentences.
The table looks like this:

ID INTEGER, distance1 REAL, distance2 REAL, distance3 REAL, distance4 REAL, 
..., distance24 REAL

The range of the Value is in between 0 and 1. So it looks like this 0.196 or 
0.891

That my query

WHERE value BETWEEN (distance1 - radius) AND (distance1 + radius)
WHERE value BETWEEN (distance2 - radius) AND (distance2 + radius)
WHERE value BETWEEN (distance3 - radius) AND (distance3 + radius)
WHERE value BETWEEN (distance4 - radius) AND (distance4 + radius)
...

Now I'm searching for a suitable index.

Does any of you have an idea?

TX very much for any support!

Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Searching for appropriate pivots for my Fixed Query Array

2013-11-01 Thread Janek Sendrowski

Hi,

I am trying to build my database with a metric index. I am using a Fixed Query 
Array.
I have got a table with many sentences of natural languages. I'm using the Dice 
Coefficient for calculating the metric space.
I am going to take a certain number of sentences and use them as pivots by 
calculating their distance to all of my sentences.

The table of distances looks like this:

id INTEGER, pivot TEXT, distance1 REAL, distance2 REAL, distance3 REAL, ...  
Eine Query sieht so aus:


A query looks like this:

SELECT id FROM distance 
WHERE dice('sentence', 'pivot1') BETWEEN (distance1 - radius) AND (distance1 + 
radius) 
  AND dice('sentence', 'pivot2') BETWEEN (distance2 - radius) AND (distance2 + 
radius) 
  AND dice('sentence', 'pivot3') BETWEEN (distance3 - radius) AND (distance3 + 
radius) 
  ...


My issue is to find appropriate pivots now.
I cannot take all existing words, that would be too much.

Does any of you have an idea?

TX very much for any support!

Janek


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] PSQL: argument exceeds the maximum length of 1024 bytes

2013-10-12 Thread Janek Sendrowski

I would say it's just a query with to many rows or something like that.
My query was: SELECT jaro('word', col) FROM table.
The tables has ca. 75000 rows. If I'm gonna use the WHERE condition and only 
select the first 1 rows, it's working.
So it could be something with the memory, but 1024 bytes?!
 
Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Levenshtein Distance with more than 255 characters

2013-09-06 Thread Janek Sendrowski

Do you know the destination. I cant find it.

[GENERAL] Levenshtein Distance with more than 255 characters

2013-09-05 Thread Janek Sendrowski


Hi,

 

I'm searching for an optimized Levenshtein Distance like Postgresql's. My problem is that I want to compare strings with a length over 255 characters.

Does anyone know a solution?

 

Janek Sendrowski

Re: [GENERAL] store multiple rows with the SELECT INTO statement

2013-09-03 Thread Janek Sendrowski

The links don't work.

I don't know why.

how just don't know how to insert the data of a record in a table

Re: [GENERAL] store multiple rows with the SELECT INTO statement

2013-09-03 Thread Janek Sendrowski

Thanks for the answers.

I just can't understabd why it's not possible to store multiple columns returning from a dynamic Select statement which is executet with EXECUTE into a temporary table.

If I'm gonna use the LOOP through the SELECT statement, how can insert the data from the record into the temp table?

 

Janek Sendrowski

[GENERAL] store multiple rows with the SELECT INTO statement

2013-09-01 Thread Janek Sendrowski


Hi,

 

Why is it only possible to store one row by a query which returns multiple rows using the SELECT INTO statement.

and

How can I do a Query on a record varialbe, somehow like this:

SELECT * FROM v_rec

 

Janek Sendrowski

Re: [GENERAL] how to use aggregate functions in this case

2013-08-26 Thread Janek Sendrowski


Hi,

 

thanks for all your answers.

I'll have a try with the contains operator and the intrange, but before I'd like to know if this would work:

 

CASE WHEN a >= 0 AND a < 25

CASE WHEN a >= 25 AND a < 50

 

There wouldn't be a double endpoint. I just have to decide which range the endpoint includes.

 

Janek Sendrowski

Re: [GENERAL] how to use aggregate functions in this case

2013-08-25 Thread Janek Sendrowski


Sorry, I formulated it wrong.

My problem is, that I want to count the ranges for every user, but if I use count(range), it counts the ranges of all users.

 

Janek Sendrowski

[GENERAL] how to use aggregate functions in this case

2013-08-25 Thread Janek Sendrowski

Hi,

Thats my code snipped:
 
SELECT  v_rec1.user,
        sum(CASE WHEN v_rec_fts.lev BETWEEN 0 AND 25 THEN 1 ELSE 0 END) as "0 
to 25",
        sum(CASE WHEN v_rec_fts.lev BETWEEN 25 AND 50 THEN 1 ELSE 0 END) as "25 
to 50",
sum(CASE WHEN v_rec_fts.lev BETWEEN 50 AND 100 THEN 1 ELSE 0 END) as 
"50 to 100"
        INTO v_rec2
        GROUP BY user;

Now I want to summuarize the "0 to 25" values and the others in the same query.
Somehow like this: count("0 to 25")
But I want to do it with every single user and I don't know how to do that

A result should look like this:

user  percentagecount
smith "0 to 25" 5
smith "25 to 50"7
smith "50 to 75"2
jones "0 to 25"     11
jones "25 to 50"1
jones "50 to 75"3

Hope someone who can help me

Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] performant import of an array

2013-08-23 Thread Janek Sendrowski

Hi,

I have an array and I want to import the data rowwise.

Would it make sense to use a function which calls a prepared statement in a 
loop to import the data performant?

Or can I somehow use the COPY FROM command?
 
Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Query on a record variable

2013-08-18 Thread Janek Sendrowski

Hi,

How can I do a query on a record variable in a function.
I want to do a dirty fulltextsearch on a table and then choose the string which 
have a low levenshtein-distance.
I wanted to it like this, but it doesn't work:

v_query := 'SELECT col FROM table WHERE LENGTH(dede) BETWEEN x AND y AND 
plainto_tsquery(string') @@ vectors';
EXECUTE v_query INTO v_rec;
v_query :=   'SELECT levenshtein('string', v_rec.col) AS lev WHERE 
levenshtein('string', v_rec.col) < 10';
EXECUTE v_query INTO v_rec2;

Or is there an easier way to do that. Maybe just to do one query at once?

Hope someone can help me.

Janek Sendrowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] devide and summarize sql result (all)

2013-08-16 Thread Janek Sendrowski


Thanks for your Answers,

 

my problem is, that there is no column with the name 'percentage'.

It's just a result of my query. So how can I use it anyway?

Should I just store the result in a record variable and do another query?

 

Janek

 

Gesendet: Freitag, 16. August 2013 um 00:24 Uhr
Von: bricklen 
An: "Janek Sendrowski" 
Cc: "pgsql-general@postgresql.org" 
Betreff: Re: [GENERAL] devide and summarize sql result (all)




On Thu, Aug 15, 2013 at 1:51 PM, Janek Sendrowski <jane...@web.de> wrote:




Hi,
 
My sql query results sth. like this:
 
user percentage
franz 78%
smith 98%
franz 81%
jason 79%

smith 89%

smith 85%

smith 99%

 

Now I'd like to summarize the percentages oder every user like this.

smith

2 matches 95-100%

2 matches 85-95%

0 mathes 75-85%

 

franz

0 mathes 95-100%

...






A CASE statement should work, if you are willing to hard-code the list of expressions.

SELECT  username,
    sum(case when avg between 76 and 85 then 1 else 0 end) as "76 to 85",
    sum(case when avg between 86 and 95 then 1 else 0 end) as "86 to 95",
    sum(case when avg > 95 then 1 else 0 end) as ">95"
FROM yourtable
GROUP BY username

[GENERAL] devide and summarize sql result (all)

2013-08-15 Thread Janek Sendrowski

Hi,
 
My sql query results sth. like this:
 
user percentage
franz 78%
smith 98%
franz 81%
jason 79%

smith 89%

smith 85%

smith 99%

 

Now I'd like to summarize the percentages oder every user like this.

smith

2 matches 95-100%

2 matches 85-95%

0 mathes 75-85%

 

franz

0 mathes 95-100%

...

 

Hope there is someone who can help me

 

Janek Sendrowksi

[GENERAL] devide and summarize sql result

2013-08-15 Thread Janek Sendrowski

Hi,
 
My sql query results sth. like this:
 
user  percentage
franz 78%
smith 98%
franz 81%
jason


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Fastest Index/Algorithm to find similar sentences

2013-07-27 Thread Janek Sendrowski

Hi Sergey Konoplev,
 
If I'm searching for a sentence like "The tiger is the largest cat species" for 
example.
 
I can only find the sentences, which include the words "tiger, largest, cat, 
species", but I also like to have the sentences with only three or even two of 
these words.
 
Janek


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Fastest Index/Algorithm to find similar sentences

2013-07-26 Thread Janek Sendrowski

Thanks for your answers.

@Amit Langote: I had a look and found out that pg_bigm doesn't support similar 
matches 

@Dann Corbit: The idea with the sequences makes sence. I had a look and I'm not 
sure, if they support similar sequences

Janek


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Fastest Index/Algorithm to find similar sentences

2013-07-25 Thread Janek Sendrowski

Hi,

I'm searching for an algorithm/Index to find similar sentences in a database.

The Fulltextsearch is not really suitable because it doesn't have a tolerance.

The Levenshtein-distance ist to slow.

I also tried pg_trgm module, which works with tri-grams, but it's also very 
slow with 100.000+ rows.

I hope someone can help, I can't really find sth. which is fast enough.

Best regards,
Janek
 
 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Index for Levenshtein distance (better format)

2013-07-21 Thread Janek Sendrowski

Hi,

Im searching for a suitable Index for my query witch compares Strings with the 
Levenshtein distance.

I read that a prefix index would fit, but I dont know how to build it. I only 
know that its supported by Gist.

I couldn't find an instructions so I hope someone can help me.
 
Best regards,
Janek
 
 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Index for Levenshtein distance

2013-07-20 Thread Janek Sendrowski

Hi,

I'm searching for a suitable Index for my query witch compares Strings with the Levenshtein distance.

I read that a prefix index would fit, but I don't know how to build it. I only know that its supported by Gist.

I couldn't find an instructions so I hope someone can help me.

 

Best regards,
Janek

49 matches

Mail list logo