[GENERAL] C++, Postgres , libpqxx huge query

2014-05-04 Thread alexandros_e
Hello experts. I have posted this question on stack overflow, but I did not
get any detailed answer, so I thought I should cross post here. My
apologies.

I have to execute an SQL query to Postgres by the following code. The query
returns a huge number of rows (40M or more) and has 4 integer fields: When I
use a workstation with 32Gb everything works but on a 16Gb workstation the
query is very slow (due to swapping I guess). Is there any way to tell the
C++ to load rows at batches, without waiting the entire dataset? With Java I
never had these issues before, due to the probably better JDBC driver.

try {
work W(*Conn);
result r = W.exec(sql[sqlLoad]);
W.commit();

for (int rownum = 0; rownum  r.size(); ++rownum) {
const result::tuple row = r[rownum];
vid1 = row[0].asint();
vid2 = row[1].asint();
vid3 = row[2].asint();
. 

} catch (const std::exception e) {
std::cerr  e.what()  std::endl;
}

I am using PostgreSQL 9.3 and there I see this
http://www.postgresql.org/docs/9.3/static/libpq-single-row-mode.html, but I
do not how to use it on my C++ code. Your help will be appreciated.

EDIT: This query runs only once, for creating the necessary main memory data
structures. As such, tt cannot be optimized. Also, pgAdminIII could easily
fetch those rows, in under one minute on the same (or with smaller RAM) PCs.
Also, Java could easily handle twice the number of rows (with
Statent.setFetchSize()
http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#setFetchSize%28int%29)
So, it is really an issue for the libpqxx library and not an application
issue. Is there a way to enforce this functionality in C++, without
explicitly setting limits / offsets manually? Also, is there another driver
I should use with C++ that offers this kind of functionality?



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/C-Postgres-libpqxx-huge-query-tp5802330.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] C++, Postgres , libpqxx huge query

2014-05-04 Thread alexandros_e
To answer my own question, I adapted How to use pqxx::stateless_cursor class
from libpqxx?

try {
work W(*Conn);
pqxx::stateless_cursorpqxx::cursor_base::read_only,
pqxx::cursor_base::owned
cursor(W, sql[sqlLoad], mycursor, false);
/* Assume you know total number of records returned */
for (size_t idx = 0; idx  countRecords; idx += 10) {
/* Fetch 100,000 records at a time */
result r = cursor.retrieve(idx, idx + 10);
for (int rownum = 0; rownum  r.size(); ++rownum) {
const result::tuple row = r[rownum];
vid1 = row[0].asint();
vid2 = row[1].asint();
vid3 = row[2].asint();
.
}
}
} catch (const std::exception e) {
std::cerr  e.what()  std::endl;
}





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/C-Postgres-libpqxx-huge-query-tp5802330p5802392.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] High Level Committers Wanted

2014-03-12 Thread alexandros_e
It seems like spam to me. Where is the guy's name or credentials? If I would
request something, I would sign with my name, government email and
telephone.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/High-Level-Committers-Wanted-tp5795695p5795703.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] pg_trgm extension and theory

2014-02-22 Thread alexandros_e
Hello to all experts,

I am considering of using pg_trgm extension in a research publication, since
initial results seem promising. The index seems to works pretty fast for
finding similar text and significantly accelerate query time. The problem is
that I do not know the theory behind it or the exact method it uses. 
My questions: 
a) It probably uses the q-grams method (basically 3 grams only). Does it
also create 2 grams and 1 grams to determine similarity?
b) About the index (either gist on gin). Is it based on RD-tree? If not what
is the exact indexing method it uses?
c) Will it work for any UTF8 characters / strings because the documentation
says for ASCII. 
d) I also found the http://pgsimilarity.projects.pgfoundry.org/ project who
does similarity functions for string. Does pg_trgm extension have anything
to do with that? Since pgsimilarity seems abandoned is there another project
that a) uses some kind of indexing for similarity b) provides most functions
for string similarity  like pgsimilarity?

Thanks



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/pg-trgm-extension-and-theory-tp5793180.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Pg_trgm and invalid invalid byte sequence for encoding UTF8

2014-02-12 Thread alexandros_e
Hello experts,

I want to compare integer arrays basically with methods based on string
similarity (i.e., levenshtein, trigrams etc).. In order to do that I hacked
a custom function that converts those integer array to strings, where each
integer is converted to a character by the function CHR(my_array1[i]+64) (so
that 1-A, 2 -B etc). This hack of course for large integers (I have
integers up to 300,000) probably creates invalid UTF-8 characters.
Levenshtein (from fuzzystrmatch module) does not seem to have a problem with
that and works perfectly, since it is based on just comparing UTF8 codes. On
the other hand when I try similarity function
array1-array1  for some cases it works (I think it works for all integers
up to 4096) but for some larger indexes I get invalid byte sequence for
encoding UTF8 errors:

Example integer sequence

8527,63586,8526,63585,63584,63583,63582,8525,8760,63820,63821,63822,860,57610,861,57611,862,57612,57613,863,57614,57615,57616,39850,39851,39852,39853,39854,39855,95275,39856,39857,95276,95277,39858,95278,95279,39859,95280,39860,95281,95282,39861,39862,39863,95283,95284,27095,27096,82406,82407,27097,27098,27099,27100,82408,27101,27102,27103,25702,80837,25703,25704,80838,25705,25706,25707,25708,30011,85343,30012,85344,30013,30014,51019,48260,48261,56809,56810,56811,56812,113829,31762,87568,31763,45925,41778,41779,41780,31778,31779,87571};

Error message:

invalid byte sequence for encoding UTF8: 0xed 0xb8 0xa9

Is there a way to suppress these errors similar to levenshtein which does
not care about validity of UTF characters?



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Pg-trgm-and-invalid-invalid-byte-sequence-for-encoding-UTF8-tp5791681.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] client that supports editing views

2014-02-08 Thread alexandros_e
Why not use Access directly? You can connect to PostgreSQL though ODBC
(inside Access) and do the forms there. I am sure 95% that I had done that
in the past, although I cannot be sure it works with the most recent
versions. 

Something like that:
http://www.depts.ttu.edu/ithelpcentral/solutions/odbcsetup.php



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/client-that-supports-editing-views-tp5782418p5791070.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Dynamic insert into ARRAY? plpgsql

2014-02-08 Thread alexandros_e
You must a) join the 2 tables on the orderID ... where orderID=15 and then
GROUP BY the result by the order ID and concat the orderlines by a custom
aggregate function like:
http://www.postgresql.org/message-id/db7789b.0309131210.625da...@posting.google.com
 



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Dynamic-insert-into-ARRAY-plpgsql-tp5791090p5791093.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Temporary table already exists

2014-02-06 Thread alexandros_e
@mephysto I think you are trying to solve the wrong type of problem. Creation
of tables (temporary or not) are not supposed to run concurrently. So, this
is not an issue of PostgreSQL but design. There are two ways to solve the
problem.

a) You could use the sessionID (provided The Glassfish server) to create
unique names for the temporary table, if the temporary table is unique per
session. If not, since you are talking about multiplayer game the temporary
table name could include the userID, so it is unique per user. In that
sense, it could be persistent per user, or temporary depending on your
needs. In that case you will need EXECUTE command to CREATE the table in
your pgsql code, since the table name must be a variable in your functions.  

b) I highly suspect that this temporary table is used for either caching or
local sorting / ordering. In this case Postgres is again the wrong tool. You
can use an embedded DB like SQLite, HSQLDB for the local database which may
be used for this type of operations, which 'syncs' to the global PostgreSQL
DB when connecting or disconnecting. Every client will have a separate copy
of this DB, so no overhead to the server.

Either way you will have solved your issue. 



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Temporary-table-already-exists-tp5789852p5790806.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Hard upgrade (everything)

2014-02-06 Thread alexandros_e
I would use normal pg_dump and pg_restore for the DBs and not
utils/postgis_restore.pl. Also, AFTER I backup all databases and everything
else, you could try to upgrade Postgis without upgrading PostgreSQL by
buliding from source e.g.
http://trac.osgeo.org/postgis/wiki/UsersWikiPostGIS21Ubuntu1304src. There it
says that Postgis 2.1 could be used with PostgreSQL 9.1. This will give you
access to Postgis 2.1 features without reinstalling everything.

Of course normally I would not upgrade if this is an 1-2 years project,
unless I 100% need Postgis 2.1 features not present in 1.5. 





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hard-upgrade-everything-tp5790801p5790811.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Postgresql GROUP BY SIMILAR but not equal values

2014-02-06 Thread alexandros_e
I wanted to ask you the following question to all experts here.

Let's say I have this table foo

ID|G1|T1|
1|2|ABC|
1|2|ABCD|
1|2|DEF|
1|2|DEFG|

SELECT * FROM foo
GROUP BY ID,G1,T1

RETURNS exactly the same table.

Is there a way in SQL or PostgreSQL in general to group by values than are
not exactly the same but are quite similar (like 'ABC' and 'ABCD') based on
some distance function (levenshtein for example) if the distance is within
some threshold (i.e., 1)

My intuition is that SQL cannot support such queries but I was wondering if
there was some hack around it. The problem as I see it that distance
functions require 2 values but GROUP BY only checks equality. 

Another subproblem that might help is can we overload an operator of a
custom type, so that equals operator is more relaxed and is calculated by
a function?  OR can we use GROUP BY with a custom comparator for a data
type?

I hope my question makes sense.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Postgresql-GROUP-BY-SIMILAR-but-not-equal-values-tp5790860.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql GROUP BY SIMILAR but not equal values

2014-02-06 Thread alexandros_e
Conceptually, Tom (as always) is right. But Alban's hack help.

DROP TABLE foo;
CREATE TABLE IF NOT EXISTS foo(ID INTEGER,G1 INTEGER, T1 TEXT, ID2 SERIAL
PRIMARY KEY);

INSERT INTO foo(ID,G1,T1) VALUES(1,2,'ABC');
INSERT INTO foo(ID,G1,T1) VALUES(1,2,'ABCD');
INSERT INTO foo(ID,G1,T1) VALUES(1,2,'ABDC');
INSERT INTO foo(ID,G1,T1) VALUES(1,2,'DEF');
INSERT INTO foo(ID,G1,T1) VALUES(1,2,'DEFH');

/* A little editing to remove duplicates a to b and b to a */
SELECT foo1.ID, foo1.G1, foo1.T1, levenshtein(foo1.T1, foo2.T1),foo2.T1
FROM foo foo1
INNER JOIN foo foo2 ON (foo2.ID = foo1.ID AND foo2.G1 = foo1.G1)
WHERE foo1.ID2foo2.ID2
GROUP BY foo1.ID, foo1.G1, foo1.T1, levenshtein(foo1.T1, foo2.T1),foo2.T1
HAVING levenshtein(foo1.T1, foo2.T1) 2;

RETURNS

ID|G1|foo1.T1|foo2.T1
1;2;ABC;1;ABCD
1;2;ABC;1;ABDC
1;2;ABCD;2;ABDC
1;2;DEF;1;DEFH

Then it requires a second grouping but as Tom suggested it would be hard to
somehow group all similar cases together because then it becomes a
clustering problem. With a second grouping we will have 3 records instead of
4, so it is better than the initial case by 25%.   




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Postgresql-GROUP-BY-SIMILAR-but-not-equal-values-tp5790860p5790876.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Postgresql multidimensional arrays cast fail

2014-02-01 Thread alexandros_e
I do:

SELECT '{{1,2},{3,4}}'::INTEGER[][]

But I get:

{{1,2},{3,4}} INTEGER[]. Somehow the PostgreSQL server does not understand
that is a multidimensional array. So, later if I want to get {1,2} or {3,4},
the field[1] or field[2]. Evem when I try:

field [1:1] I get {{1,2}} and not plain one dimensional integer array {1,2}
which I want. How do I achieve that?



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Postgresql-multidimensional-arrays-cast-fail-tp5790077.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql multidimensional arrays cast fail

2014-02-01 Thread alexandros_e
This is probably the only way to do it. Still, it seems to me an overkill if
basically you need to run a function at each multidimensional array to get
access to each i-array element. Thanks for your answer. 



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Postgresql-multidimensional-arrays-cast-fail-tp5790077p5790095.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Reindexing and tablespaces

2014-01-31 Thread alexandros_e
You were right. I just reindexed the DB and I saw no real changes in drive
storage (those tablespaces are on separate hard disk volumes). Thanks



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Reindexing-and-tablespaces-tp5789827p5789853.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Reindexing and tablespaces

2014-01-30 Thread alexandros_e
Hello to all,

I have done ALTER DATABASE [database_name] SET default_tablespace =
[new_tablespace]; I am wondering, if I  reindex this entire DB would the
indexes automatically moved into the [new_tablespace] or will they remain in
the tablespace they were originally created on;



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Reindexing-and-tablespaces-tp5789827.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] levenshtein_less_equal does not work

2014-01-24 Thread alexandros_e
Hello to all,

I always use PostgreSQL from enterpriseDB. I have both 9.2 and 9.1 latest
verions installed on two different Ubuntu 64bit machines. For some reason
levenshtein_less_equal does not work, in contrast to levenshtein which works
as expected. I even downloaded the src code for PostgreSQL code for
fuzzystrmatch and compiled myself, added the so file to the correct place
and still it does not work. The process was correct, because I have
increased the levenshtein limit from 255 to 2048.

Can you tell me how to make levenshtein_less_equal. Are those mysterious
IFDEF levenshtein_less_equal that are messing compilation?

Thanks,

Alexandros 



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/levenshtein-less-equal-does-not-work-tp5788710.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] levenshtein_less_equal does not work

2014-01-24 Thread alexandros_e
Query:

SELECT levenshtein_less_equal('extensive', 'exhaustive',2);

ERROR:  function levenshtein_less_equal(unknown, unknown, integer) does not
exist
LINE 1: SELECT levenshtein_less_equal('extensive', 'exhaustive',2);
   ^
HINT:  No function matches the given name and argument types. You might need
to add explicit type casts.

Even this: 

SELECT levenshtein_less_equal('extensive'::TEXT,
'exhaustive'::TEXT,2::INTEGER);
function levenshtein_less_equal(text, text, integer) does not exist

But this works perfectly:
SELECT levenshtein('extensive', 'exhaustive');
4





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/levenshtein-less-equal-does-not-work-tp5788710p5788713.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] levenshtein_less_equal does not work

2014-01-24 Thread alexandros_e
Just this:

create extension fuzzystrmatch; 

If the extension was not installed, then the simple levenstein distance
would not work either.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/levenshtein-less-equal-does-not-work-tp5788710p5788715.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] levenshtein_less_equal does not work

2014-01-24 Thread alexandros_e
SELECT version(); 

and 

\df  levenshtein_less_equal 



PostgreSQL 9.1.11 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2
20080704 (Red Hat 4.1.2-52), 64-bit

routing_nw2=# \df  levenshtein_less_equal
   List of functions
 Schema | Name | Result data type | Argument data types | Type 
+--+--+-+--
(0 rows)

routing_nw2=# 



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/levenshtein-less-equal-does-not-work-tp5788710p5788717.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] levenshtein_less_equal does not work

2014-01-24 Thread alexandros_e
=# \dx
 List of installed extensions
  Name   | Version |   Schema   | Description  
-+-++--
 plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(1 row)

-# \df  levenshtein*
   List of functions
 Schema |Name | Result data type | Argument data types |  Type  
+-+--+-+
 public | levenshtein | integer  | text, text  | normal
(1 row)


The server is in localhost. The binaries are from enterpriseDB. I also
downloaded src code for Postgres 9.1.11. Recompiled fuzzystrmatch, which
includes levenstein less_equal. Replaced .so file in Postgres installation
dir. Change in levenshtein limit of character successful. 

It is not a compilation error, because I do not get  levenstein less_equal
in the Postgres 9.2 server which is exactly as  downloaded from
EnterpriseDB.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/levenshtein-less-equal-does-not-work-tp5788710p5788723.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] levenshtein_less_equal does not work

2014-01-24 Thread alexandros_e
With your help I found the solution. My DB was initially based on a 9.0
server and I had restored it on the 9.1 server. For some reason 9.0
fuzzystrmatch functions (levenstein...) were already present in this DB. So,
CREATE EXTENSION fuzzystrmatch did not work. I had to manually remove
function 9.0 fuzzystrmatch functions and then CREATE EXTENSION  worked
perfectly and levenstein_less_equal as well. So, it was error from the DB
restored from an older version of postgresql.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/levenshtein-less-equal-does-not-work-tp5788710p5788729.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general