[GENERAL] I don't understand something...

2011-10-02 Thread Alexander Pyhalov

Hello.
I was asked a simple question. We have table employees:
 \d employees
Table "public.employees"
 Column |Type | 
Modifiers

+-+-
 employee_id| integer | not null default 
nextval('employees_employee_id_seq'::regclass)

 first_name | character varying(20)   |
 last_name  | character varying(25)   | not null
 email  | character varying(25)   | not null
 phone_number   | character varying(20)   |
 hire_date  | timestamp without time zone | not null
 job_id | character varying(10)   | not null
 salary | numeric(8,2)|
 commission_pct | numeric(2,2)|
 manager_id | integer |
 department_id  | integer |
Indexes:
"employees_pkey" PRIMARY KEY, btree (employee_id)
"emp_email_uk" UNIQUE, btree (email)
"emp_department_ix" btree (department_id)
"emp_job_ix" btree (job_id)
"emp_manager_ix" btree (manager_id)
"emp_name_ix" btree (last_name, first_name)
Check constraints:
"emp_salary_min" CHECK (salary > 0::numeric)
Foreign-key constraints:
"employees_department_id_fkey" FOREIGN KEY (department_id) 
REFERENCES departments(department_id)

"employees_job_id_fkey" FOREIGN KEY (job_id) REFERENCES jobs(job_id)
"employees_manager_id_fkey" FOREIGN KEY (manager_id) REFERENCES 
employees(employee_id)

Referenced by:
TABLE "departments" CONSTRAINT "dept_mgr_fk" FOREIGN KEY 
(manager_id) REFERENCES employees(employee_id)
TABLE "employees" CONSTRAINT "employees_manager_id_fkey" FOREIGN 
KEY (manager_id) REFERENCES employees(employee_id)
TABLE "job_history" CONSTRAINT "job_history_employee_id_fkey" 
FOREIGN KEY (employee_id) REFERENCES employees(employee_id)


Now we want to select count of all employees who doesn't have any 
subordinates (query 1):
 SELECT count(employee_id) from employees o where not exists  (select 1 
from employees  where manager_id=o.employee_id);

 count
---
89
(1 row)

We can select count of all managers (query 2):
SELECT count(employee_id) from employees where employee_id  in (select 
manager_id from employees);

 count
---
18
(1 row)

But if we reformulate the first query in the same way, answer is 
different (query 3):
SELECT count(employee_id) from employees where employee_id not in 
(select manager_id from employees) (query 3);

 count
---
 0
(1 row)

I don't understand why queries 1 and 3 give different results. They 
seems to be the same... Could someone explain the difference?


--
Best regards,
Alexander Pyhalov,
system administrator of Computer Center of Southern Federal University

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_upgrade 8.4 -> 9.1 failures

2011-10-02 Thread Peter Eisentraut
On sön, 2011-10-02 at 15:45 -0400, Joseph S wrote:
> Mismatch of relation names: database "dbname", old rel
> pg_toast.pg_toast_1280475, new rel pg_toast.pg_toast_1202320
> Failure, exiting

This will be fixed in 9.1.2 (or get the code already from git).


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for "bare" letters

2011-10-02 Thread Eduardo Morras

At 01:25 02/10/2011, Reuven M. Lerner wrote:

Hi, everyone.  I'm working on a project on 
PostgreSQL 9.0 (soon to be upgraded to 9.1, 
given that we haven't yet launched).  The 
project will involve numerous text fields 
containing English, Spanish, and 
Portuguese.  Some of those text fields will be 
searchable by the user.  That's easy enough to 
do; for our purposes, I was planning to use some 
combination of LIKE searches; the database is 
small enough that this doesn't take very much 
time, and we don't expect the number of 
searchable records (or columns within those records) to be all that large.


The thing is, the people running the site want 
searches to work on what I'm calling (for lack 
of a better term) "bare" letters.  That is, if 
the user searches for "n", then the search 
should also match Spanish words containing 
"ñ".  I'm told by Spanish-speaking members of 
the team that this is how they would expect 
searches to work.  However, when I just did a 
quick test using a UTF-8 encoded 9.0 database, I 
found that PostgreSQL didn't  see the two 
characters as identical.  (I must say, this is 
the behavior that I would have expected, had the 
Spanish-speaking team member not said anything on the subject.)


So my question is whether I can somehow wrangle 
PostgreSQL into thinking that "n" and "ñ" are 
the same character for search purposes, or if I 
need to do something else -- use regexps, keep a 
"naked," searchable version of each column 
alongside the native one, or something else entirely -- to get this to work.


Any ideas?


You can use perceptual hashing for that. There 
are multiple algorithms, some of them can be tuned for specific languages.


See this documentation:

http://en.wikipedia.org/wiki/Phonetic_algorithm for a general description,

http://en.wikipedia.org/wiki/Soundex is the first one developed, very old,

http://en.wikipedia.org/wiki/Metaphone is a 
family of several modern algorithms.


Remember that they are hashing algorithms, some 
words can collide because they have the same pronunciation but write different.


I remember that datapark search engine uses them 
with dictionaries. You can check it too.


http://www.dataparksearch.org/


Thanks,

Reuven


HTH 




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] pg_upgrade 8.4 -> 9.1 failures

2011-10-02 Thread Joseph S
I've tried the upgrade a few times, but I always get one type of failure
or another.

Performing Consistency Checks
-
Checking current, bin, and data directories ok
Checking cluster versions   ok
Checking database user is a superuser   ok
Checking for prepared transactions  ok
Checking for reg* system oid user data typesok
Checking for contrib/isn with bigint-passing mismatch   ok
Checking for large objects  warning

| Your installation contains large objects.
| The new database has an additional large object
| permission table.  After upgrading, you will be
| given a command to populate the pg_largeobject
| permission table with default permissions.

Creating catalog dump   ok

There were problems executing "/usr/local/pgsql_84/bin/pg_ctl" -w -l
"/dev/null" -D "data84"  stop >> "/dev/null" 2>&1
Failure, exiting

=

Performing Consistency Checks
-
Checking current, bin, and data directories ok
Checking cluster versions   ok
Checking database user is a superuser   ok
Checking for prepared transactions  ok
Checking for reg* system oid user data typesok
Checking for contrib/isn with bigint-passing mismatch   ok
Checking for large objects  warning

| Your installation contains large objects.
| The new database has an additional large object
| permission table.  After upgrading, you will be
| given a command to populate the pg_largeobject
| permission table with default permissions.

Creating catalog dump   ok
Checking for prepared transactions  ok
Checking for presence of required libraries ok

| If pg_upgrade fails after this point, you must
| re-initdb the new cluster before continuing.
| You will also need to remove the ".old" suffix
| from /home/postgres/data84/global/pg_control.old.

Performing Upgrade
--
Adding ".old" suffix to old global/pg_control   ok
Analyzing all rows in the new cluster   ok
Freezing all rows on the new clusterok
Deleting new commit clogs   ok
Copying old commit clogs to new server  ok
Setting next transaction id for new cluster ok
Resetting WAL archives  ok
Setting frozenxid counters in new cluster   ok
Creating databases in the new cluster   ok
Adding support functions to new cluster ok
Restoring database schema to new clusterok
Removing support functions from new cluster ok
Restoring user relation files

Mismatch of relation names: database "dbname", old rel
pg_toast.pg_toast_1280475, new rel pg_toast.pg_toast_1202320
Failure, exiting

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [Solved] Generic logging system for pre-hstore using plperl triggers

2011-10-02 Thread Diego Augusto Molina
2011/9/28, Merlin Moncure :
>
> I disagree.  unnest() and array_agg() (or, even better, array()
> constructor syntax) are an absolute joy to work with and thinking in a
> more functional way, which is usually the key to making things run
> quickly.  Also both functions are trivial to emulate in userland for
> compatibility.  Arrays of composites IIRC only go back to 8.3 so  that
> would be a true stopper for any solution in that vein.

Ok, tastes are tastes: I hate to make two or three more levels of
subqueries. Regarding arrays of composites, that would be perfectly
solved if we use no composite at all! Instead of a field with an array
of a composite of three instrinsics, use three fields, each of an
intrinsic type. See your proposal:

>> create type audit_field_t as (field text, old_value text, new_value text);

Instad, in the audit table you may use:

..., field smallint[], before text[], after text[],...

Note the smallint in field, that means I really want to keep the
reference to the "field" table. That is for the two reasons I had
mentioned earlier (to reduce space: 2 bytes of type "smallint" against
variable size of type "text"; and to keep track of names been used
too). You can also set up something like this if you like dimensions:

..., field smallint[], values text[][],...

Implying that the first dimension is the "before" value and the second
one is for the "after" value. Any of these prevents us from using
composites and makes the box a little wider and simpler. Even further,
I would like to keep the logging "on demand":

..., field smallint[], is_pk boolean[], { before text[], after text[]
| values text[][] },...

You know what are the braces and pipe for...
So, at the end, we have the entire "audet" table inside the "audit"
table, as a series of arrays. We got a real compact table with only
enough data to fully log the changes which triggered the event. No
less, no more.
At this point we know querying this table will be much more slow and
rotation will have to be done more frequently. If we dump>restore the
table somewhere else we will still be able to split the table in the
original two ones, and make indexes, cluster them, and query as
desired. But this can get so complicated that maybe I should implement
a function doing all this. In an event, we are getting less
responsiveness because of this. But a couple of mins more may not be a
problem for most cases. I'm just trying to summarize.

As a rule of thumb, you may need to run a cron job every night or so
to check if 'select count(*) from audit' is bigger than X then rotate
the table (or maybe each X days/weeks/etc.). The smaller the X, the
bigger responsiveness _in_ some cases: if we know an interval in time
we will just have to dump>restore those logs. In other cases this
would not be of much help: if you need to track a tupple to the very
beggining of the times, you'll have a lot of work to do
dumping>restoring (and so forth... remember to split the table,
indexing...). Still, rotation seems to be a good practice, and you can
include in the cron job the dump/restore part into another server and
then delete the old table. That would save a lot of space in your
production environment.

> As for the rest of it, I'd be looking to try and come up with an all
> sql implementation.  Also you should give an honest comparison between
> what you've come up with vs. this:
> http://pgfoundry.org/projects/tablelog/.
>
> merlin
>

"All SQL implementation"? Didn't we agree that's not possible in
pg<=8.4? then what do you mean by that?

About project "tablelog", I didn't really try it, but read it's
documentation and seems not appropiate at all for my case. First of
all, it's propose seems to be to log everything in a table to be able
to restore it later as of any time in the past. My propose is to log
to run analysis. Also, it needs to create one table per logged table,
consisting of the same structure of the logged table (without
constraints) plus three, four or five columns for control (depending
on usage, four or five recommended). I have a lot of tables to log
(hundreds!) with small changes to each of them; that means to
duplicate the amount of tables for a few changes. Speaking of
compactness... It also logs everything, not only changed values.
It is written in C, so I assume it runs much, much faster (specially
needed for highly transactional DBs). But it's not proven to be binary
safe (which I don't remember what that is). Bugs: nothing known.

So, if you need to be able to restore your table as of any time, use
tablelog. If you need to run analysis on who did what, use my option.


Finally attaching the code!

Cheers.

-- 
Diego Augusto Molina
diegoaugustomol...@gmail.com

ES: Por favor, evite adjuntar documentos de Microsoft Office. Serán
desestimados.
EN: Please, avoid attaching Microsoft Office documents. They shall be discarded.
LINK: http://www.gnu.org/philosophy/no-word-attachments.html

		/* Created by Diego Augusto Molina in 2011 for 

Re: [GENERAL] Searching for "bare" letters

2011-10-02 Thread Mike Christensen
>> I don't see the problem - you can have a dictionary, which does all work
>> on recognizing bare letters and output several versions. Have you seen
>> unaccent
>> dictionary ?
>
> This seems to be the direction that everyone is suggesting, and I'm quite
> grateful for that.  (I really hadn't ever needed to deal with such issues in
> the past, having worked mostly with English and Hebrew, which don't have
> such accent marks.)
>
> As for the unaccent dictionary, I hadn't heard of it before, but just saw it
> now in contrib, and it looks like it might fit perfectly.  I'll take a look;
> thanks for the suggestion.

I wrote this code for something similar I was doing, feel free to rip
it off or copy the regular expressions:


input = Regex.Replace(input, @"[\xC0-\xC5\xE0-\xE5]", "a");  //Replace with "a"
input = Regex.Replace(input, @"[\xC8-\xCB\xE8-\xEB]", "e");  //Replace with "e"
input = Regex.Replace(input, @"[\xCC-\xCF\xEC-\xEF]", "i");  //Replace with "i"
input = Regex.Replace(input, @"[\xD1\xF1]", "n");//Replace with "n"
input = Regex.Replace(input, @"[\xD2-\xD6\xF2-\xF6]", "o");  //Replace with "o"
input = Regex.Replace(input, @"[\xD9-\xDC\xF9-\xFC]", "u");  //Replace with "u"
input = Regex.Replace(input, @"[\xDD\xDF\xFF]", "y");//Replace with "y"

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why PGSQL has no developments in the .NET area?

2011-10-02 Thread Mike Christensen
>> PgSQL has just one old NPGSQL driver for .NET, which is itself sluggish.
>> The ODBC driver works better as compared to NPGSQL, but I suspect the ODBC
>> driver is not the right choice for ORM framework of .NET.
>>
>> I want to know whether there is any efficient .NET provider and is PGSQL
>> compatible with the .NET entity framework.
>>
>> Regards,
>> Rohit.
>
> Won't bite on the "why" of it all, but is this
>  what you're after?

We use the DevArt drivers (their Oracle drivers anyway) at work and
they're pretty good.  They have solid tech support (helpful and quick
to respond) and they seem to write quality stuff.  I'd assume their PG
drivers are just as good.  However, this stuff isn't cheap.  You pay
per dev license and it's like a few hundred bucks per dev.

I use Npgsql for my own website (www.kitchenpc.com) and I've had zero
problems with these drivers.  They're 100% managed code, and they seem
to be quick even though I haven't done any in-depth speed comparisons.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Updating 9.0.4 --> 9.1.1: How best to ???

2011-10-02 Thread Devrim GÜNDÜZ

Hi,

On Sun, 2011-10-02 at 12:41 +0200, r d wrote:
> 
> I decided to install the PGDG repo (pgdg-fedora91-9.1-5.noarch) to
> make it update. It does not.

Fedora/Red Hat distro RPMs a only binary compatible since last year. I
broke package compatibility by enabling multiple version installation.

> The FC15 packages are called "postgresql_*" and not "postgresql90_*",
> the PGDG packages are called "postgresql91_*".

See above.

> Since both repositories apparently get into each others way, my
> question is, what is the best (recommended) way to update or install
> postgres:
> 
> - Dump my FC15 installation and replace it with one coming from the
> PGDG repo?

This might be one of the options. However, as compared to Fedora RPMs,
we don't offer upgrade path. Fedora RPMs have a 

service (or systemctl... whatever) postgresql upgrade

option. So, an upgrade path could be:

* Replace current packages with PGDG packages. Please note that data
directory of 9.0 will be under /var/lib/pgsql/9.0, not /var/lib/pgsql.

* Make sure that db is working.

* Now, install 9.1 in parallel.

* Perform upgrade using pg_upgrade or pg_dump.

* Start 9.1

> - Wait for the 9.1 packages to appear in the Fedora Core repo? 

Fedora will release 9.1 in F-16. Fedora 15 won't have 9.1.

Regards,

-- 
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz


signature.asc
Description: This is a digitally signed message part


[GENERAL] Help message of PostgreSQL-win-x64 is UTF-8 encoded which is malformed

2011-10-02 Thread LiuYan
I downloaded both postgresql-9.1.1-1-windows-x64-binaries.zip and
postgresql-9.1.1-1-windows-binaries.zip, running on Simplified Chinese
Windows XP Professional x64.

The output of `psql --help`(x64) is UTF-8 encoded, they will be displayed as
malformed characters in command prompt window.
psql --help
psql鏄疨ostgreSQL 鐨勪氦浜掑紡瀹㈡埛绔伐鍏枫€?浣跨敤鏂规硶:

The output of `psql --help`(win32) is GBK encoded, which will be displayed
normally.
psql --help
psql是PostgreSQL 的交互式客户端工具。

Is it a bug of PostgreSQL win-x64?

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Help-message-of-PostgreSQL-win-x64-is-UTF-8-encoded-which-is-malformed-tp4861320p4861320.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to find freak UTF-8 character?

2011-10-02 Thread Leif Biberg Kristensen
On Sunday 2. October 2011 17.54.52 Raymond O'Donnell wrote:
> I may have missed it upthread, but if you haven't already would you
> consider writing up your solution for the benefit of the archives?

I did, in my own first reply to the original message:

SELECT * FROM foo WHERE bar LIKE E'%\xe2\x80\x8e%';

The trick is obviously to escape each byte in the sequence.

Maybe I'll write a blog post about it. It appears to be weakly documented, or 
at least very hard to find. Or maybe it's just me being dense.

regards, Leif.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to find freak UTF-8 character?

2011-10-02 Thread Raymond O'Donnell
On 02/10/2011 15:55, Leif Biberg Kristensen wrote:
> On Sunday 2. October 2011 16.34.27 Cédric Villemain wrote:
>> you may have miss this one :
>> http://tapoueh.org/blog/2010/02/23-getting-out-of-sql_ascii-part-2.html
> 
> That's an, uh, interesting article, but as far as I can see, it doesn't tell 
> anything about how to find a perfectly legal three-byte UTF-8 character that 
> doesn't have a counterpart in LATIN1, given that all I know about it is its 
> hexadecimal value.
> 
> I know how to do it now, and I consider the problem solved. Hopefully, this 
> thread may help others who stumbles upon the same issue.

I may have missed it upthread, but if you haven't already would you
consider writing up your solution for the benefit of the archives?

Ray.


-- 
Raymond O'Donnell :: Galway :: Ireland
r...@iol.ie

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for "bare" letters

2011-10-02 Thread Reuven M. Lerner

Hi, Oleg.  You wrote:


I don't see the problem - you can have a dictionary, which does all work
on recognizing bare letters and output several versions. Have you seen
unaccent
dictionary ?


This seems to be the direction that everyone is suggesting, and I'm 
quite grateful for that.  (I really hadn't ever needed to deal with such 
issues in the past, having worked mostly with English and Hebrew, which 
don't have such accent marks.)


As for the unaccent dictionary, I hadn't heard of it before, but just 
saw it now in contrib, and it looks like it might fit perfectly.  I'll 
take a look; thanks for the suggestion.


Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to find freak UTF-8 character?

2011-10-02 Thread Leif Biberg Kristensen
On Sunday 2. October 2011 16.34.27 Cédric Villemain wrote:
> you may have miss this one :
> http://tapoueh.org/blog/2010/02/23-getting-out-of-sql_ascii-part-2.html

That's an, uh, interesting article, but as far as I can see, it doesn't tell 
anything about how to find a perfectly legal three-byte UTF-8 character that 
doesn't have a counterpart in LATIN1, given that all I know about it is its 
hexadecimal value.

I know how to do it now, and I consider the problem solved. Hopefully, this 
thread may help others who stumbles upon the same issue.

regards, Leif

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PL/Python

2011-10-02 Thread Joe Abbate
Hi Sim,

On 10/02/2011 08:02 AM, Sim Zacks wrote:
> If I understand plpython correctly, it uses the python installed on the
> machine. In other words, plpythonu doesn't support the new style
> classes, it depends on what version of python is installed.

Well, Python has had new style classes since 2.2 (December 2001).  PG
8.2 release notes says it supports Python 2.5 and the 9.0 notes show
support was added for Python 3.  Unless someone is running Python 2.1 or
earlier, it seems new style classes are available.

> In the same way, if you have libraries installed on your machine,
> plpythonu has access to them as well. So if someone installed pyrseas,
> he would be able to use all the functions from within his function.

Yes, that seems to be the case, assuming the path to the library is
visible in (or added to) PYTHONPATH.

> IMO, there is no need for an extension here, all you need to do is have
> an sql file containing your 2 plpythonu functions that can be run into
> the database.

Maybe I'm misunderstanding something, but isn't such a sql file an
extension or is 95% of the way there?  Pyrseas is already distributed
via PGXN, but like some other PGXN "extensions" (dbi-link?), it doesn't
actually create functions in the database.  Its two utilities run
entirely as external programs.  If the Pyrseas functions were added via
an .sql file to a database, EXTENSION or not, they would be available
for use by non-Pyrseas programs, e.g., pgAdmin could call diff_map() to
compare database objects, Perl scripts or even a plain psql script could
call to_yaml().  And these would not depend on psycopg2, which currently
*is* a Pyrseas dependency (it would still be necessary for access to the
command line utilities).

Joe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to find freak UTF-8 character?

2011-10-02 Thread Cédric Villemain
2011/10/2 Leif Biberg Kristensen :
> On Sunday 2. October 2011 15.53.50 pasman pasmański wrote:
>> Its simple to remove strange chars  with regex_replace.
>
> True, but first you have to know how to represent a «strange char» in
> Postgresql :P
>
> It isn't all that obvious, and it's difficult to search for the solution. I
> tried a lot of different search terms in Google, and none of them turned up
> anything near what I needed.

you may have miss this one :
http://tapoueh.org/blog/2010/02/23-getting-out-of-sql_ascii-part-2.html


>
> regards, Leif
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to find freak UTF-8 character?

2011-10-02 Thread Leif Biberg Kristensen
On Sunday 2. October 2011 15.53.50 pasman pasmański wrote:
> Its simple to remove strange chars  with regex_replace.

True, but first you have to know how to represent a «strange char» in 
Postgresql :P

It isn't all that obvious, and it's difficult to search for the solution. I 
tried a lot of different search terms in Google, and none of them turned up 
anything near what I needed.

regards, Leif

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to find freak UTF-8 character?

2011-10-02 Thread pasman pasmański
Its simple to remove strange chars  with regex_replace.

2011/10/1, Leif Biberg Kristensen :
> On Saturday 1. October 2011 21.29.45 Andrew Sullivan wrote:
>> I see you found it, but note that it's _not_ a spurious UTF-8
>> character: it's a right-to-left mark, ans is a perfectly ok UTF-8 code
>> point.
>
> Andrew,
> thank you for your reply. Yes I know that this is a perfectly legal UTF-8
> character. It crept into my database as a result of a copy-and-paste job
> from
> a web site. The point is that it doesn't have a counterpart in ISO-8859-1 to
> which I regularly have to export the data.
>
> The offending character came from this URL:
> 
>
> and the text that I copied and pasted from the page looks like this in the
> source code:
>
> Aslaug Steinarsdotter Fjågesund  ‎(I2914)‎
>
> I'm going to write to the webmaster of the site and ask why that character,
> represented in the HTML as the ‎ entity, has to appear in a Norwegian
> web
> site which never should have to display text in anything but left-to-right
> order.
>
>> If you need a subset of the UTF-8 character set, you want to make sure
>> you have some sort of constraint in your application or your database
>> that prevents insertion of anything at all in UTF-8.  This is a need
>> people often forget when working in an internationalized setting,
>> because there's a lot of crap that comes from the client side in a
>> UTF-8 setting that might not come in other settings (like LATIN1).
>
> I don't want any constraint of that sort. I'm perfectly happy with UTF-8.
> And
> now that I've found out how to spot problematic characters that will crash
> my
> export script, it's really not an issue anymore. The character didn't print
> neither in psql nor in my PHP frontend, so I just removed the problematic
> text
> and re-entered it by hand. Problem solved.
>
> But thank you for the idea, I think that I will strip out at least any ‎
> entities from text entered into the database.
>
> By the way, is there a setting in psql that will output unprintable
> characters
> as question marks or something?
>
> regards, Leif.
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


-- 

pasman

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PL/Python

2011-10-02 Thread Sim Zacks


  
  
On 09/30/2011 05:10 AM, Joe Abbate wrote:

  Although there are no discussions or examples in the documentation, I've
determined that PL/Python supports Python new style classes like class
Test(object), and import of standard modules.

Now, in order for to_yaml/to_map to do its work, it needs to import 15
modules, e.g.,

from pyrseas.dbobject.language import LanguageDict

I'd be interested to know if anyone has tried something similar (in
terms of scope) and if so, how they tackled it.

The other function, diff_yaml() or diff_map(), is a bit trickier because
it requires reading in a YAML spec file, like the one above, and then
comparing it to the internal version of to_yaml/to_map, in order to
output SQL DDL statements.  The command line tool could read the spec
file locally and send it off as one big text argument.  Again, I'm
interested in similar experiences if any.


If I understand plpython correctly, it uses the python installed on
the machine. In other words, plpythonu doesn't support the new style
classes, it depends on what version of python is installed.
In the same way, if you have libraries installed on your machine,
plpythonu has access to them as well. So if someone installed
pyrseas, he would be able to use all the functions from within his
function. 

IMO, there is no need for an extension here, all you need to do is
have an sql file containing your 2 plpythonu functions that can be
run into the database.

Sim 
  



Re: [GENERAL] Why PGSQL has no developments in the .NET area?

2011-10-02 Thread Andy Colson

On 10/01/2011 10:32 PM, Rohit Coder wrote:

PgSQL has just one old NPGSQL driver for .NET, which is itself sluggish. The 
ODBC driver works better as compared to NPGSQL, but I suspect the ODBC driver 
is not the right choice for ORM framework of .NET.

I want to know whether there is any efficient .NET provider and is PGSQL 
compatible with the .NET entity framework.

Regards,
Rohit.


why: probably because no .net user has stepped up and written one.  Not enough 
interest maybe?

But really, why would you need to?  I assume .net is not so crippled that it 
can't link to a .dll, correct?  Why not just use the native libpq.dll?  The 
interface is well documented and simple to use.  And you wont get any faster.

-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Updating 9.0.4 --> 9.1.1: How best to ???

2011-10-02 Thread r d
Hi,

I would like to know which if any is a recommended install and upgrade
method.

My situation is summarized as:

I currently have Postgresql 9.0.4 installed via the Fedora Core FC15 yum
repo, and 9.1.x is not there (yet).

I decided to install the PGDG repo (pgdg-fedora91-9.1-5.noarch) to make it
update. It does not.

The FC15 packages are called "postgresql_*" and not "postgresql90_*", the
PGDG packages are called "postgresql91_*".


Since both repositories apparently get into each others way, my question is,
what is the best (recommended) way to update or install postgres:

- Dump my FC15 installation and replace it with one coming from the PGDG
repo?
- Wait for the 9.1 packages to appear in the Fedora Core repo?

I prefer install from repo as there is currently no reason for me to install
from source.

Thanks in advance,

RD


Re: [GENERAL] Searching for "bare" letters

2011-10-02 Thread Oleg Bartunov
I don't see the problem - you can have a dictionary, which does all work on 
recognizing bare letters and output several versions. Have you seen unaccent

dictionary ?

Oleg
On Sun, 2 Oct 2011, Uwe Schroeder wrote:


Hi, everyone.  Uwe wrote:

What kind of "client" are the users using?  I assume you will have some
kind of user interface. For me this is a typical job for a user
interface. The number of letters with "equivalents" in different
languages are extremely limited, so a simple matching routine in the
user interface should give you a way to issue the proper query.


The user interface will be via a Web application.  But we need to store
the data with the European characters, such as ?, so that we can display
them appropriately.  So much as I like your suggestion, we need to do
the opposite of what you're saying -- namely, take a bare letter, and
then search for letters with accents and such on them.

I am beginning to think that storing two versions of each name, one bare
and the other not, might be the easiest way to go.   But hey, I'm open
to more suggestions.

Reuven



That still doesn't hinder you from using a matching algorithm. Here a simple
example (to my understanding of the problem)
You have texts stored in the db both containing a n and a ?. Now a client
enters "n" on the website. What you want to do is look for both variations, so
"n" translates into "n" or "?".
There you have it. In the routine that receives the request you have a
matching method that matches on "n" (or any of the few other characters with
equivalents) and the routine will issue a query with a "xx like "%n%" or xx
like "%?%" (personally I would use ilike, since that eliminates the case
problem).

Since you're referring to a "name", I sure don't know the specifics of the
problem or data layout, but by what I know I think you can tackle this with a
rather primitive "match -> translate to" kind of algorithm.

One thing I'd not do: store duplicate versions. There's always a way to deal
with data the way it is. In my opinion storing different versions of the same
data just bloats a database in favor of a smarter way to deal with the initial
data.

Uwe






Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for "bare" letters

2011-10-02 Thread Uwe Schroeder
> Hi, everyone.  Uwe wrote:
> > What kind of "client" are the users using?  I assume you will have some
> > kind of user interface. For me this is a typical job for a user
> > interface. The number of letters with "equivalents" in different
> > languages are extremely limited, so a simple matching routine in the
> > user interface should give you a way to issue the proper query.
> 
> The user interface will be via a Web application.  But we need to store
> the data with the European characters, such as ñ, so that we can display
> them appropriately.  So much as I like your suggestion, we need to do
> the opposite of what you're saying -- namely, take a bare letter, and
> then search for letters with accents and such on them.
> 
> I am beginning to think that storing two versions of each name, one bare
> and the other not, might be the easiest way to go.   But hey, I'm open
> to more suggestions.
> 
> Reuven


That still doesn't hinder you from using a matching algorithm. Here a simple 
example (to my understanding of the problem)
You have texts stored in the db both containing a n and a ñ. Now a client 
enters "n" on the website. What you want to do is look for both variations, so 
"n" translates into "n" or "ñ".
There you have it. In the routine that receives the request you have a 
matching method that matches on "n" (or any of the few other characters with 
equivalents) and the routine will issue a query with a "xx like "%n%" or xx 
like "%ñ%" (personally I would use ilike, since that eliminates the case 
problem).

Since you're referring to a "name", I sure don't know the specifics of the 
problem or data layout, but by what I know I think you can tackle this with a 
rather primitive "match -> translate to" kind of algorithm.

One thing I'd not do: store duplicate versions. There's always a way to deal 
with data the way it is. In my opinion storing different versions of the same 
data just bloats a database in favor of a smarter way to deal with the initial 
data.

Uwe



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SQL Help - Finding Next Lowest Value of Current Row Value

2011-10-02 Thread Thomas Kellerer

Jeff Adams wrote on 01.10.2011 23:30:

Greetings,

I have a large table (~19 million records). Records contains a field
identifying a vessel and a field containing an time (epoch). Using the
current rows vessel and time values, I need to be able to find the next
lowest time value for the vessel and use it to compute how much time has
elapsed between the records. I have tried a scalar subquery in the SELECT,
which works, but it runs quite slowly. Is there an alternative approach that
might perform better for this type of query. Any information would be
greatly appreciated. Thanks...

Jeff


Something like:


SELECT vessel,
   time_column,
   time_column - lag(time_column) over (partition by vessel order by 
time_column) as diff
FROM your_table

No sure how good that performs though.




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for "bare" letters

2011-10-02 Thread hamann . w
Reuven M. Lerner wrote:

>> Hi, everyone.  I'm working on a project on PostgreSQL 9.0 (soon
>>   to be upgraded to 9.1, given that we haven't yet launched).  The
>>   project will involve numerous text fields containing English,
>>   Spanish, and Portuguese.  Some of those text fields will be
>>   searchable by the user.  That's easy enough to do; for our
>>   purposes, I was planning to use some combination of LIKE searches;
>>   the database is small enough that this doesn't take very much
>>   time, and we don't expect the number of searchable records (or
>>   columns within those records) to be all that large.
>> The thing is, the people running the site want searches to work
>>   on what I'm calling (for lack of a better term) "bare" letters. 
>>   That is, if the user searches for "n", then the search should also
>>   match Spanish words containing "ñ".  I'm told by Spanish-speaking
>>   members of the team that this is how they would expect searches to
>>   work.  However, when I just did a quick test using a UTF-8 encoded
>>   9.0 database, I found that PostgreSQL didn't  see the two
>>   characters as identical.  (I must say, this is the behavior that I
>>   would have expected, had the Spanish-speaking team member not said
>>   anything on the subject.)
>> So my question is whether I can somehow wrangle PostgreSQL into
>>   thinking that "n" and "ñ" are the same character for search
>>   purposes, or if I need to do something else -- use regexps, keep a
>>   "naked," searchable version of each column alongside the native
>>   one, or something else entirely -- to get this to work.
>> Any ideas?
>> Thanks,
>> Reuven

I had the same problem with german (there is ä ö ü)
I ended up with a normalized version of the database (for many purposes, this 
could
be just an extra column) plus preprocessing the input.
There is one difficulty with german searches: these letters are commonly 
transliterated into
ue etc, like in "Muenchen". So depending on culture, some people would expect a 
"u" search
term to match, and others the "ue". So preprocessing query means replacing bare 
u
(not followed by e) with a ue? regex

BTW: if your search form does not explicitly tell the browser to use utf8 to 
encode the search field,
you might expect a small proportion of iso-latin1 requests

Regards
Wolfgang




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for "bare" letters

2011-10-02 Thread Reuven M. Lerner

Hi, everyone.  Uwe wrote:


What kind of "client" are the users using?  I assume you will have some kind
of user interface. For me this is a typical job for a user interface. The
number of letters with "equivalents" in different languages are extremely
limited, so a simple matching routine in the user interface should give you a
way to issue the proper query.


The user interface will be via a Web application.  But we need to store 
the data with the European characters, such as ñ, so that we can display 
them appropriately.  So much as I like your suggestion, we need to do 
the opposite of what you're saying -- namely, take a bare letter, and 
then search for letters with accents and such on them.


I am beginning to think that storing two versions of each name, one bare 
and the other not, might be the easiest way to go.   But hey, I'm open 
to more suggestions.


Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general