Re: [SQL] Odd sort behaviour

2009-09-02 Thread Lew

Rob Sargent wrote:
I'm sure this a life-time's worth of discussion on the merits of 
treating "."

as nothing when sorting


Well, every sorted reference work in society at large seems to have a 
different idea of how to sort - just compare the phone book to the dictionary. 
 That's the point of locales, to formalize such rules so that you can coerce 
your system to follow one or another set as needed.  That way you don't have 
to agree or disagree with any rule, such as ignoring punctuation in the sort, 
simply be aware of whether it applies to any given situation.


--
Lew

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Hibernate, web application and only one sequence for all primary keys

2009-09-19 Thread Lew

rawi wrote:

Grails/Hibernate wishes per default one sequence for all tables-PKs and all
PKs as BigInt.


How is that a Hibernate default?

Hibernate lets you define a multitude of types as a primary key, and the 
sequence each uses is a matter of XML or annotation configuration, at least in 
the Java version of Hibernate which is the only form of it that I've used. 
I've used both "old-fashioned" Hibernate with *.hbm.xml mapping descriptors, 
and the new-fangled JPA (Java Persistence API) version.


I've used Hibernate with String and (long) integer key types, sequenced and 
not.  For my learning, I use a system on Linux with Java and PostgreSQL.  It 
works just fine.


I'm not familiar with Grails.

--
Lew

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Common table expression - parsing questions

2009-10-05 Thread Lew

(top-posting corrected)

Thomas Kellerer wrote:

The standard *requires* the keyword.


the6campbells wrote:

the db2 [sic] family does not


Ergo DB2 is not standard-compliant in that regard.

--
Lew

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] is there a distinct function for comma lists ?

2010-09-07 Thread Lew

On 09/07/2010 07:52 AM, Andreas wrote:

Hi,
is there a distinct function for comma separated lists ?

I sometimes need to update tables where I got a set of IDs, like:

update mytable
set someattribute = 42
where mytable.id in
( 1, 2, 3, 5, 7, 11, 3, 6, 13, 13, 3, 11 ... )

So there are double entries in the list but in this case its just
overhead but no problem.

But for calculated values this would not allways be desirable.

update mytable
set someattribute = someattribute + 1
where mytable.id in
( 1, 2, 3, 5, 7, 11, 3, 6, 13, 13, 3, 11 ... )

How could I get a distinct list? Those lists can have 2000-3000 IDs
sometimes.

One solution was as follows but perhaps there is something more elegant?

update mytable
set someattribute = someattribute + 1
where mytable.id in
( select distinct id from mytable where id in ( 1, 2, 3, 5, 7, 11, 3, 6,
13, 13, 3, 11 ... ) )


I am not clear on what you're asking here.  From what you say, there's nothing 
to do.  The two forms of the SQL you show have the same result.


The fact that 11 or 13 or whatever appear in the IN list more than once 
doesn't affect the result of the query; 13 is in the IN list no matter how 
many times (> 0) that 13 appears in the IN list.  So a row from mytable with 
id=13 is selected regardless.  It's not like the row will be selected more 
than once.


From the manual:
'The result of IN is "true" if any equal subquery row is found.'
<http://www.postgresql.org/docs/8.4/interactive/functions-subquery.html>

It's still true of more than one equal subquery row is found.  It's not true 
multiple times, it's just true.


If mytable.id is not unique, then every row with that value will be selected, 
but adding DISTINCT to the IN list won't change that either.


--
Lew

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Question regarding indices

2010-09-12 Thread Lew

On 09/11/2010 08:29 AM, Steve wrote:

I have a small question about the order of values in a query.
Assume I have a table with the following fields:
   uid INT,
   data BIGINT,
   hits INT
And an unique index on (uid, data). I use libpq C API to query
data from the table. The query is something like this:
SELECT uid,data,hits FROM mytable WHERE uid=2
AND data IN (2033,2499,590,19,201,659)

Would the speed of the query be influenced if I would sort the data?


What do you mean by "sort the data"?  Which data?


I can imagine that just querying a bunch of bigint would not make a
big difference but what about several thousand of values? Would sorting
them and sending the SQL query with ordered data influence the speed of the 
query?


Send the query from where to where?

Are you referring to a sort of the items in the IN subselect?  My guess is 
that sorting that won't matter but it's only a WAG.


--
Lew

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] group by hour + distinct

2010-11-26 Thread Lew

Michele Petrazzo - Unipex wrote:

P.s. Have you some references about the "subquery" keyword? I found only
the word subquery as "use" (for example: select a from b where id in
(select id from table)), but not as sql [sic] command.


"subquery" is not an SQL keyword.
<http://www.postgresql.org/docs/9.0/interactive/sql-keywords-appendix.html>

Nor is it a command all by itself.

Reading the documentation might help you:
<http://www.postgresql.org/docs/9.0/interactive/queries-table-expressions.html#QUERIES-SUBQUERIES>
<http://www.postgresql.org/docs/9.0/interactive/sql-expressions.html#SQL-SYNTAX-SCALAR-SUBQUERIES>
<http://www.postgresql.org/docs/9.0/interactive/functions-subquery.html>

Or try:
<http://lmgtfy.com/?q=SQL+subquery>

--
Lew

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] check constraint bug?

2011-05-08 Thread Lew

Scott Marlowe wrote:

Tarlika Elisabeth Schmitz wrote:

I specified:

ALTER TABLE h ADD CONSTRAINT val_h_stats
CHECK (NOT (sex = 'f') AND (stats IS NOT NULL));

which was translated to:

ALTER TABLE h ADD CONSTRAINT val_h_stats
CHECK (NOT sex = 'f'::bpchar AND stats IS NOT NULL);



You need another level of parens:

CHECK (NOT ((sex = 'f') AND (stats IS NOT NULL)));


Because NOT has higher precedence than AND.

<http://www.postgresql.org/docs/9.0/interactive/sql-syntax-lexical.html#SQL-PRECEDENCE>

Note that equals (=), IS and NOTNULL have higher precedence than NOT.

So the CHECK expression Scott indicated is equivalent to the parenthesis-minimal

 CHECK ( NOT ( sex = 'f' AND stats IS NOT NULL ) )

or

 CHECK ( sex != 'f' OR stats IS NULL )

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Dates and NULL's`

2011-05-10 Thread Lew

On 05/10/2011 12:48 PM, John Fabiani wrote:

Hi,
Maybe this is a dumb question but if I have a date field that contains a NULL
will it show up when I ask for a where date range for the same date field.

Where mydate>= "2011/04/01"::date and mydate<= "2011/04/30"::date

With the above where will the NULL's be selected


How many question marks does it take to indicate an interrogative?


I ask because I was always told that a NULL matches everything and nothing!


That's not a useful viewpoint.  The useful point is that NULL matches nothing.

It's a simple three-valued logic with NULL standing in for UNKNOWN:
<http://www.postgresql.org/docs/9.0/interactive/functions-logical.html>

The WHERE clause only selects rows for which the clause evaluates to TRUE.

So in the WHERE clause, both
 NULL > "2011/04/01"::date
and
 NULL <= "2011/04/30"::date
will fail, and so would
 NULL <= "2011/04/01"::date
, since none of these evaluates to TRUE.

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] replace " with nothing

2011-05-12 Thread Lew

Tony Capobianco wrote:

We are converting from Oracle to Postgres.  An Oracle script contains
this line:

  select replace(firstname,'"'), memberid, emailaddress from members;

in an effort to replace the " with nothing.  How can I achieve the same
result with Postgres?

Here's the Postgres error I get:

select replace(firstname,'"'), memberid, emailaddress from members;
ERROR:  function replace(character varying, unknown) does not exist
LINE 1: select replace(firstname,'"'), memberid, emailaddress from m...


<http://www.postgresql.org/docs/9.0/interactive/functions-string.html>
<http://www.postgresql.org/docs/9.0/interactive/functions-string.html#FUNCTIONS-STRING-OTHER>

replace(string text, from text, to text)

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] extracting location info from string

2011-05-23 Thread Lew

On 05/22/2011 09:42 PM, Craig Ringer wrote:

On 23/05/2011 9:11 AM, Andrej wrote:

On 23 May 2011 10:00, Tarlika Elisabeth Schmitz
 wrote:

On Sun, 22 May 2011 21:05:26 +0100
Tarlika Elisabeth Schmitz wrote:


A column contains location information, which may contain any of the
following:

1) null
2) country name (e.g. "France")
3) city name, region name (e.g. "Bonn, Nordrhein-Westfalen")
4) city name, Rg. region name (e.g. "Frankfurt, Rg. Hessen")
5) city name, Rg region name (e.g. "Frankfurt, Rg Hessen")



I also need to cope with variations of COUNTRY.NAME and REGION.NAME.


That isn't a table structure, that's a freeform text structure.  You didn't 
state your question, Tarlika, but your database structure is terrible.  For 
example, "region" and "country" should be different columns.  Really!


How you get your raw data into those columns can be interesting.


This is a hard problem. You're dealing with free-form data that might be
easily understood by humans, but relies on various contextual information and
knowledge that makes it really hard for computers to understand.

If you want to do a good job of this, your best bet is to plug in 3rd party
address analysis software that is dedicated to this task. Most (all?) such


These aren't really addresses, as the OP presents them.


packages are commercial, proprietary affairs. They exist because it's really,
really hard to do this right.


Another thing of great import is whether the city can occur in the
data column all by itself; if yes, it's next to impossible to distinguish
it from a country.


Not least because some places are both, eg:

Luxembourg
The Vatican
Singapore

(The Grand Duchy of Luxembourg has other cities, but still serves as an 
example).


And,of course, you have to distinguish the City of London from London.  New 
York City comprises five boroughs (counties), each of which is itself a city. 
 (Brooklyn is one of the largest cities in the world all by itself.) 
"Region" has different meanings in different areas - it can mean part of a 
county, or state / province, or nation, or continent.  "The Baltic region", 
"the Northeast", "upstate", "the North Country", "Europe" are all regions.


The OP should share more about the semantics of their problem domain and 
whether they really intend those table structures to be table structures.  Really?


--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] extracting location info from string

2011-05-25 Thread Lew

Tarlika Elisabeth Schmitz wrote:

Lew wrote:

That isn't a table structure, that's a freeform text structure.  You
didn't state your question, Tarlika, but your database structure is
terrible.  For example, "region" and "country" should be different
columns.



I presume you are referring to my original post:
CREATE TABLE person
(
   id integer NOT NULL,
   "name" character varying(256) NOT NULL,
   "location" character varying(256),
   CONSTRAINT person_pkey PRIMARY KEY (id)
);

Sorry, this was just a TEMPORARY table I created for quick analysis of
my CSV data (now renamed to temp_person).



The target table is:
CREATE TABLE person
(
   id integer NOT NULL,
   "name" character varying(100) NOT NULL,
   country character varying(3),
   county character varying(3),
   town character varying(50),
   CONSTRAINT trainer_pkey PRIMARY KEY (id),
   CONSTRAINT country_person_fk FOREIGN KEY (country)
   REFERENCES country (id) MATCH SIMPLE
   ON UPDATE CASCADE ON DELETE RESTRICT,
   CONSTRAINT county_person_fk FOREIGN KEY (country, county)
   REFERENCES county (country, code) MATCH SIMPLE
   ON UPDATE NO ACTION ON DELETE NO ACTION,
);


Ah, yes, that makes much more sense.  Temporary tables such as you describe 
can be very convenient and effective.  Thanks for the clarification.


I think this problem is very widespread, namely how to get structured 
information out of freeform data.  I've encountered it many times over the 
years, as have so many I know.  I believe that human intervention will always 
be needed for this type of work, e.g., distinguishing place names that seem 
the same or correlating ones that seem distinct.  I also don't know of any 
perfect approach.  Perhaps the best one can find is a probabilistic promise 
that error will be less than some epsilon.


That said, if you have a robust process to correct errors as the user 
population discovers them, then you can approach perfection asymptotically. 
Sometimes the best solution to a technical problem is a good human process. 
From an engineering standpoint, user feedback is a vital element of 
homeostatic control.


Edward W. Rouse's suggestion of a reference table to resolve different forms 
of address or region identification would fit well with such a process.


--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Query question

2012-01-28 Thread Lew

On 01/26/2012 04:00 AM, John Tuliao wrote:

I seem to have a problem with a specific query:

The inside query seems to work on it's own:

select prefix
from john_prefix
where strpos(jpt_test.number,john_prefix.prefix) = '1'
order by char_length(john_prefix.prefix) desc limit 1

but when I execute it with this:

UPDATE
jpt_test
set
number = substring(number from length(john_prefix.prefix)+1)
from
john_prefix
where
prefix in (
select prefix
from john_prefix
where strpos(jpt_test.number,john_prefix.prefix) = '1'
order by char_length(john_prefix.prefix) desc limit 1
) ;

table contents are as follows

john_prefix table:

prefix
-
123
234

jpt_test table:

number
---
123799
023499 <<< supposed to have no match
234999

Am I missing something here? Any help will be appreciated.


I'm going to guess that it's because you didn't use a separate alias for the 
FROM in the correlated subquery.


Doesn't STRPOS() return INTEGER, not TEXT?

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Floating point type to store numbers

2007-04-22 Thread Lew

Andrew Sullivan wrote:

On Tue, Apr 17, 2007 at 04:33:33PM -0400, Radhika Sambamurti wrote:

Andrew,
This has been quite helpful. My main concern is CPU cost.  Thanks for the
input.


You're welcome.  Are you sure your main concern should be CPU cost?
It's true that numeric is more costly that float in a lot of cases,
but I know at least one auditor who will refuse to certify results
from programs that anywhere use floating-point storage or calculation
on accounting data.  The problem is really that you can get compound
errors -- very small rounding errors several times can turn out to be
a big problem.  (One quick primer that can help you understand this
is at <http://www2.hursley.ibm.com/decimal/decifaq1.html>.)


Would you rather have wrong answers really, really fast, or right answers fast 
enough?


--
Lew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [SQL] hi

2007-05-19 Thread Lew

John Summerfield wrote:

Penchalaiah P. wrote:

Information transmitted by this e-mail is proprietary to Infinite 
Computer Solutions 


It may be proprietary, but it shore ain't confidential!


Placing "confidential" on every document without regard for its content, 
especially when some of it's publicly disseminated, can remove the protection 
of confidentiality at law from all such marked documents in many 
jurisdictions, including the U.S.  There must be discrimination applied in the 
marking of information as "confidential".


Quite aside from the foolishness you pointed out of marking something 
"confidential" then placing it into the public eye in an archived forum where 
it will be visible by everybody forever.


Now we can publish everything ever written at or by Infinite Computer 
Solutions without fear of liability.


--
Lew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [SQL] Race condition in resetting a sequence

2007-08-09 Thread Lew

Steve Midgley writes:
The code I provided to reset a primary key sequence is actually part of 
Ruby on Rails core library - actually they use something very similar 
to what I originally sent:

...
SELECT setval('#{sequence}', (SELECT COALESCE(MAX(#{pk})+(SELECT 
increment_by FROM #{sequence}), (SELECT min_value FROM #{sequence})) 
FROM #{table}), false)


Tom Lane wrote:

Ugh.  That's completely unsafe/broken, unless they also use locking that
you didn't show.

...

It doesn't have a race condition "all by itself": it will do what it's
told.  The problem with commands such as the above is that there's a
time window between calculating the max() and executing the setval(),
and that window is more than large enough to allow someone else to
insert a row that invalidates your max() computation.  (Because of MVCC
snapshotting, the risk window is in fact as long as the entire
calculation of the max --- it's not just a few instructions as some
might naively think.)

Now it is possible to make this brute-force approach safe: you can lock
the table against all other modifications until you've applied your own
changes.  But you pay a high price in loss of concurrency if you do
that.


All this trouble over semantically-significant ID columns seems to support the 
camp that excoriates use of artificial ID columns and autoincrementation 
altogether.


The usual argument in their favor is that they speed up performance, but this 
epicyclic dance to accomodate FK references to autoincremented keys makes the 
case that there is also a performance penalty, and in the more critical 
performance area of code development and correctness than in the less critical 
search speed area.


--
Lew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq