Re: [GENERAL] Extensions and privileges in public schema

2016-12-06 Thread Lee Hachadoorian
On Sun, Dec 4, 2016 at 4:24 PM, Paul Ramsey 
wrote:

> When you create the student user, remove their create privs in public.
> Then create a scratch schema and grant them privs there.
> Finally, alter the student user so that the scratch schema appears FIRST
> in their search path. This will cause unqualified CREATE statements to
> create in the scratch schema.
> For full separation, give each student their own login and set the search
> path to
>
> "$user", public
>
> That way each student gets their own private scratch area, and it is used
> by default for their creates.
>
> P
>
>

Paul,

I've been avoiding giving each student an individual login role, but it
might be worth it to consider for a future term.

I've followed your (and Charles') advice to:

REVOKE CREATE ON SCHEMA public FROM public;
ALTER ROLE gus_faculty
  SET search_path = scratch,public,tiger;

It also occurred to me that I don't want anyone changing data in
spatial_ref_sys. I think I should revoke everything *except* SELECT and
REFERENCES, and make this the default for new objects created in public
schema:

ALTER DEFAULT PRIVILEGES IN SCHEMA scratch
REVOKE INSERT, UPDATE, DELETE, TRUNCATE, TRIGGER ON TABLES
FROM public;
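
One caveat worth noting: ALTER DEFAULT PRIVILEGES only affects objects created
later (and, without a FOR ROLE clause, only objects created by the role that
runs it). spatial_ref_sys already exists, so locking it down would take a
direct statement as well. A sketch (these are harmless no-ops if the
privileges were never granted to PUBLIC in the first place):

REVOKE INSERT, UPDATE, DELETE, TRUNCATE, TRIGGER
  ON TABLE public.spatial_ref_sys FROM PUBLIC;
-- Leave read access and foreign-key references in place:
GRANT SELECT, REFERENCES ON TABLE public.spatial_ref_sys TO PUBLIC;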

Please let me know if this is inadvisable or violates accepted practice.

Best,
--Lee


[GENERAL] Extensions and privileges in public schema

2016-12-04 Thread Lee Hachadoorian
This question is specifically motivated by my use of the PostGIS extension,
but since other extensions create functions and other supporting objects in
public schema, I believe it is more general.

I'm teaching a university-level class using PostGIS. I have created a
scratch schema for students to create objects in. At the end of the term I
can drop scratch and start fresh the following term.

Students of course can also create objects in public schema, and often do
unintentionally because they forget to schema-qualify their CREATE TABLE
statements. This complicates things because I can't drop public schema
without dropping various PostGIS (and other) tables and functions.
Additionally, while I doubt the students would do something like drop a
public function or supporting table (like spatial_ref_sys), it nonetheless
seems like a poor idea for these database objects to be vulnerable.

What is considered best practice in this case? Should the PostGIS extension be
kept in its own schema (as was suggested when I asked about this on GIS.SE)?
If I do so, can I treat public schema the way I have been using scratch
schema, i.e. could I drop and recreate clean public schema at end of term?
Should I leave extensions in public but limit rights of public role in that
schema (so that they don't unintentionally create tables there, or
accidentally delete other objects)? Or do Postgres DBA's just not worry
about the objects in public schema, and rely upon applications and login
roles to interact with the database intelligently?
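
For concreteness, the own-schema option would look something like the sketch
below (untested here; the database name is made up, and I have not checked how
the tiger geocoder or other PostGIS accessories behave with a non-default
schema):

CREATE SCHEMA postgis;
CREATE EXTENSION postgis SCHEMA postgis;
-- Make the extension's functions and spatial_ref_sys visible without
-- qualification ("gisclass" is a placeholder database name):
ALTER DATABASE gisclass SET search_path = "$user", public, postgis;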

To be clear, the primary goal is to keep student-created objects in one schema
which can be dropped at the end of the term. But the question of preventing
accidental creation/deletion of objects in public schema is possibly
related, and the overall database organization might address both concerns.

Best,
--Lee


-- 
Lee Hachadoorian
Assistant Professor of Instruction, Geography and Urban Studies
Assistant Director, Professional Science Master's in GIS
Temple University


Re: [GENERAL] Replication Recommendation

2016-09-12 Thread Lee Hachadoorian
On Mon, Sep 12, 2016 at 5:12 PM, Adrian Klaver
<adrian.kla...@aklaver.com> wrote:
> On 09/12/2016 12:46 PM, Lee Hachadoorian wrote:
>>
>> There are a wide variety of Postgres replication solutions, and I
>> would like advice on which one would be appropriate to my use case.
>>
>> * Small (~half dozen) distributed workforce using a file sharing
>> service, but without access to direct network connection over the
>> internet
>> * Database is updated infrequently, when new government agency data
>> releases replace old data
>> * Because database is updated infrequently, workforce can come
>> together for LAN-based replication as needed
>> * Entire database is on the order of a few GB
>>
>> Given this, I am considering the super low-tech "replication" solution
>> of updating "master" and doing a full database drop and restore on the
>> "slaves". But I would like to know which of the other (real)
>> replication solutions might work for this use case.
>
>
> If I follow correctly the layout is?:
>
> Main database <--- Govt. data
>        |
>        v
>    File share
>        |
>        v
>   DB      DB      DB      DB      DB      DB
>
> User 1  User 2  User 3  User 4  User 5  User 6
>
>
>
> For your simple scenario you might want to look at:
>
> https://www.postgresql.org/docs/9.5/static/app-pgbasebackup.html
>

That diagram is what I am proposing.

pg_basebackup looks interesting. My initial impression is that the
main gain would be for a cluster containing multiple databases. Are there
other advantages to using it in preference to a full DB dump and restore
if all of our data will be in a single database?

Best,
--Lee


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Replication Recommendation

2016-09-12 Thread Lee Hachadoorian
There are a wide variety of Postgres replication solutions, and I
would like advice on which one would be appropriate to my use case.

* Small (~half dozen) distributed workforce using a file sharing
service, but without access to direct network connection over the
internet
* Database is updated infrequently, when new government agency data
releases replace old data
* Because database is updated infrequently, workforce can come
together for LAN-based replication as needed
* Entire database is on the order of a few GB

Given this, I am considering the super low-tech "replication" solution
of updating "master" and doing a full database drop and restore on the
"slaves". But I would like to know which of the other (real)
replication solutions might work for this use case.

Regards,
--Lee

-- 
Lee Hachadoorian
Assistant Professor of Instruction, Geography and Urban Studies
Assistant Director, Professional Science Master's in GIS
Temple University


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres case insensitive searches

2013-06-29 Thread Lee Hachadoorian

  
  
On 06/29/2013 09:02 AM, bhanu udaya wrote:
> Hello,
> I agree that it is just a search condition. But, in a 2.5 million
> record table search, the upper function is not that fast. The
> expectation is to get the query retrieved in 100 ms... with all
> indexes used.
>
> I tried with upper and Citext, but the result set was more than a
> second.
>
> The OS server we are using is Linux 64 bit.
>
> Thanks and Regards
> Radha Krishna
>
>> Subject: Re: [pgadmin-support] [GENERAL] Postgres case insensitive searches
>> From: haram...@gmail.com
>> Date: Sat, 29 Jun 2013 09:37:51 +0200
>> CC: laurenz.a...@wien.gv.at; pgsql-general@postgresql.org; pgadmin-supp...@postgresql.org
>> To: udayabhanu1...@hotmail.com
>>
>> On Jun 29, 2013, at 3:59, bhanu udaya udayabhanu1...@hotmail.com wrote:
>>
>>> Thanks. But, I do not want to convert into upper and show the result.
>>
>> Why not? It won't modify your results, just the search condition:
>>
>> SELECT id, val FROM t WHERE upper(val) LIKE 'AB%' ORDER BY val;
>>
>> Or:
>>
>> SELECT id, val FROM t WHERE upper(val) LIKE 'AB%' ORDER BY upper(val), val;
>>
>>> Example, if I have records as below:
>>> id type
>>> 1. abcd
>>> 2. Abcdef
>>> 3. ABcdefg
>>> 4. aaadf
>>>
>>> The below query should report all the above:
>>>
>>> select * from table where type like 'ab%'. It should get all above 3
>>> records. Is there a way the database itself can be made case-insensitive
>>> with UTF8 characterset. I tried with character type and collation POSIX,
>>> but it did not really help.
>>
>> I was under the impression this would work, but ISTR that not every OS
>> has this capability (Postgres makes use of the OS collation mechanics).
>> So, what OS are you running the server on?

  


Duplicate the column with an upper or lowercase version and run all
queries against that.

CREATE TABLE foo (
 id serial PRIMARY KEY,
 val text,
 val_lower text
);

Index val_lower. Use triggers to keep val and val_lower in sync and
discard all attempts to write directly to val_lower. Then all
queries would be of the form

SELECT id, val
FROM foo
WHERE val_lower LIKE 'ab%';
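
A minimal sketch of that trigger (function, trigger, and index names are made
up; the text_pattern_ops opclass is there so the LIKE 'ab%' prefix search can
actually use the index):

CREATE OR REPLACE FUNCTION foo_sync_val_lower() RETURNS trigger AS $$
BEGIN
    -- Always derive val_lower from val; this also discards any attempt
    -- to write val_lower directly.
    NEW.val_lower := lower(NEW.val);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_foo_val_lower
BEFORE INSERT OR UPDATE ON foo
FOR EACH ROW EXECUTE PROCEDURE foo_sync_val_lower();

CREATE INDEX foo_val_lower_idx ON foo (val_lower text_pattern_ops);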

Wouldn't want to write every table like this, but if (a) query speed
trumps all other requirements and (b) a functional index, CITEXT, etc.
have all been rejected as not fast enough, it's an option.

--Lee


-- 
Lee Hachadoorian
Assistant Professor in Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu

  




[GENERAL] SQL keywords are suddenly case sensitive

2013-04-16 Thread Lee Hachadoorian

List,

SQL seems to be behaving in a case-sensitive manner:

universe=# select 1;
 ?column?
--
1
(1 row)

universe=# SELECT 1;
ERROR:  syntax error at or near "SELECT 1"
LINE 1: SELECT 1;
^
I cannot figure out how this happened, and Google is not helping because 
all I'm getting is information about case-sensitivity in identifiers or 
in string comparison. I didn't even think this was possible, as I've 
always switched between lower case and upper case keywords, usually 
using lower case while testing and upper case to prettify scripts that I 
will be saving for reuse.


version() = PostgreSQL 9.1.8 on x86_64-unknown-linux-gnu, compiled by 
gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit


Regards,
--Lee

--
Lee Hachadoorian
Assistant Professor in Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SQL keywords are suddenly case sensitive

2013-04-16 Thread Lee Hachadoorian


On 04/16/2013 07:31 PM, Adrian Klaver wrote:

On 04/16/2013 02:46 PM, Lee Hachadoorian wrote:

List,

SQL seems to be behaving in a case-sensitive manner:

universe=# select 1;
  ?column?
--
 1
(1 row)

universe=# SELECT 1;
ERROR:  syntax error at or near "SELECT 1"
LINE 1: SELECT 1;
 ^
I cannot figure out how this happened, and Google is not helping because
all I'm getting is information about case-sensitivity in identifiers or
in string comparison. I didn't even think this was possible, as I've
always switched between lower case and upper case keywords, usually
using lower case while testing and upper case to prettify scripts that I
will be saving for reuse.


Interesting.

Does this behavior survive logging out and then back into a session?


It survives complete restart. (This is a laptop that I use for 
development and analysis, not a high-availability server, so the first 
thing I did when I realized my scripts started failing was reboot.)




Do you have any other client using the database that exhibits this 
behavior?


Same behavior in both psql and pgAdmin.

Regards,
--Lee

--
Lee Hachadoorian
Assistant Professor in Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SQL keywords are suddenly case sensitive

2013-04-16 Thread Lee Hachadoorian


On 04/16/2013 07:34 PM, Adrian Klaver wrote:

On 04/16/2013 02:46 PM, Lee Hachadoorian wrote:

List,

SQL seems to be behaving in a case-sensitive manner:

universe=# select 1;
  ?column?
--
 1
(1 row)

universe=# SELECT 1;
ERROR:  syntax error at or near "SELECT 1"
LINE 1: SELECT 1;
 ^
I cannot figure out how this happened, and Google is not helping because
all I'm getting is information about case-sensitivity in identifiers or
in string comparison. I didn't even think this was possible, as I've
always switched between lower case and upper case keywords, usually
using lower case while testing and upper case to prettify scripts that I
will be saving for reuse.


Another question:

Are the psql and Postgres versions the same?


Both appear to be 9.1.8.

lee@tycho ~ $ psql -d universe
psql (9.1.8)
Type "help" for help.

universe=# select version();

version
--
PostgreSQL 9.1.8 on x86_64-unknown-linux-gnu, compiled by gcc 
(Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit


Regards,
--Lee

--
Lee Hachadoorian
Assistant Professor in Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SQL keywords are suddenly case sensitive

2013-04-16 Thread Lee Hachadoorian


On 04/16/2013 08:23 PM, Tom Lane wrote:

"SELECT 1" as all one token.  Also, if you transcribed this accurately,
it looks like the error cursor is pointing to the second character of
the SELECT not the first (could you confirm that?).  Which is even
more bizarre.


No, that must have been an email formatting thing. In psql, the caret is 
under the S.


Looking at the other issues you raised, but just wanted to provide a 
quick answer to that.


Best,
--Lee

--
Lee Hachadoorian
Assistant Professor in Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SQL keywords are suddenly case sensitive

2013-04-16 Thread Lee Hachadoorian


On 04/16/2013 08:23 PM, Tom Lane wrote:

Lee Hachadoorian lee.hachadooria...@gmail.com writes:

SQL seems to be behaving in a case-sensitive manner:
universe=# select 1;
   ?column?
--
  1
(1 row)
universe=# SELECT 1;
ERROR:  syntax error at or near "SELECT 1"
LINE 1: SELECT 1;
  ^

That's really bizarre, but I don't think it's a case sensitivity problem
as such.  Watch what I get from a syntax error on a normally-functioning
system:

$ psql
psql (9.1.9)
Type "help" for help.

regression=# select 1;
  ?column?
--
 1
(1 row)

regression=# SELECT 1;
  ?column?
--
 1
(1 row)

regression=# xELECT 1;
ERROR:  syntax error at or near "xELECT"
LINE 1: xELECT 1;
 ^

See the differences?  The error message indicates that your parser saw
"SELECT 1" as all one token.  Also, if you transcribed this accurately,
it looks like the error cursor is pointing to the second character of
the SELECT not the first (could you confirm that?).  Which is even
more bizarre.  I'm not sure what's going on, but I think it's more
likely to be something to do with whitespace not being really whitespace
than it is with case as such.  Consider the possibility that you're
somehow typing a non-breaking space or some such character.  One thing
that might be useful is to examine the error report in the postmaster
log using an editor that will show you any non-printing characters.

regards, tom lane


Tom,

Yes, nonbreaking spaces was the problem. Sorry for the red herring re: 
case sensitivity, and thanks for figuring it out. Adrian, thanks for 
your input as well.
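
For anyone who hits the same thing: a quick way to check a suspect statement
from inside psql is to paste it as a string literal and look for U+00A0
(a sketch; chr(160) is the non-breaking space in a UTF8 database):

-- 0 means no non-breaking space was found in the pasted text.
SELECT position(chr(160) IN 'SELECT 1') AS first_nbsp_at;

-- Or make any that are present visible:
SELECT replace('SELECT 1', chr(160), '<NBSP>');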


Best,
--Lee

--
Lee Hachadoorian
Assistant Professor in Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Current Schema for Functions called within other Functions

2013-04-01 Thread Lee Hachadoorian
Merlin,

Perfect. Thank you.

Best,
--Lee


On Mon, Apr 1, 2013 at 10:04 AM, Merlin Moncure mmonc...@gmail.com wrote:

 On Sun, Mar 31, 2013 at 5:32 PM, Lee Hachadoorian
 lee.hachadooria...@gmail.com wrote:
  I'm working on some PL/pgSQL functions to generate dynamic SQL. The
  functions live in the public schema, but the SQL generated will vary
  depending on what schema they are called from. Something is going on
 which I
  cannot figure out. I am defining variables by creating a getter
 function
  within each schema. This getter function is then called by the dynamic
 SQL
  function. But this works once, and then the value seems to persist.
 
  ```SQL
  CREATE SCHEMA var1;
  CREATE SCHEMA var2;
 
  SET search_path = public;
 
  /*This function generates dynamic SQL, here I have it just returning a
  string
  with the current schema and the value from the getter function.*/
  DROP FUNCTION IF EXISTS sql_dynamic();
  CREATE FUNCTION sql_dynamic() RETURNS text AS $function$
  DECLARE
  sql TEXT := '';
  BEGIN
  sql := current_schema() || ',' || get_var();
  RETURN sql;
  END;
  $function$ LANGUAGE plpgsql;
 
  SET search_path = var1, public;
 
  SELECT get_var(); --Fails
  SELECT sql_dynamic(); --Fails
 
  DROP FUNCTION IF EXISTS get_var();
  CREATE FUNCTION get_var() RETURNS text AS $get_var$
  BEGIN
  RETURN 'var1';
  END;
  $get_var$ LANGUAGE plpgsql;
 
  SELECT get_var();
  SELECT sql_dynamic();
 
  SET search_path = var2, public;
 
  SELECT get_var(); --Fails
  SELECT sql_dynamic(); --Returns current_schema and *old* get_var() value
  from wrong schema!
 
  DROP FUNCTION IF EXISTS get_var();
  CREATE FUNCTION get_var() RETURNS text AS $get_var$
  BEGIN
  RETURN 'var2';
  END;
  $get_var$ LANGUAGE plpgsql;
 
  SELECT get_var(); --Succeeds
  SELECT sql_dynamic(); --Still returns *old* get_var() value from wrong
  schema!
 
  ```
 
  At this point, if I DROP/CREATE sql_dynamic() in public, get_var() run
 in
  the var2 schema works, but if I change the search_path back to var1,
  sql_dynamic() returns var1,var2.
 
  I also tried using a table to store the variable. I created a table var
  (with one field also named var) in each schema, then altered
 sql_dynamic()
  to return current_schema() and the value of var.var (unqualified, so that
  expected when search_path includes var1 it would return var1.var.var),
 but I
  ran into the same persistence problem. Once initialized in one schema,
  changing search_path to the other schema returns the correct
 current_schema
  but the value from the table in the *other* schema (e.g. var2,var1).
 
  What am I missing?

 in plpgsql, all functions and tables that are not schema qualified
 become schema qualified when the function is invoked and planned the
 first time.  The line:
 sql := current_schema() || ',' || get_var();

 attaches a silent "var1." to get_var() so it will forever be stuck for
 that connection.  The solution is to use EXECUTE.

 merlin
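
For the archives, a sketch of the EXECUTE variant Merlin describes, so that
the unqualified get_var() is resolved against the caller's search_path on
every call instead of being baked in at first planning:

CREATE OR REPLACE FUNCTION sql_dynamic() RETURNS text AS $function$
DECLARE
    sql TEXT := '';
BEGIN
    -- EXECUTE plans the statement at run time, so get_var() is looked up
    -- in whatever search_path the calling session has set.
    EXECUTE 'SELECT current_schema() || '','' || get_var()' INTO sql;
    RETURN sql;
END;
$function$ LANGUAGE plpgsql;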




-- 
Lee Hachadoorian
Asst Professor of Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu/


[GENERAL] Current Schema for Functions called within other Functions

2013-03-31 Thread Lee Hachadoorian
I'm working on some PL/pgSQL functions to generate dynamic SQL. The
functions live in the public schema, but the SQL generated will vary
depending on what schema they are called from. Something is going on which
I cannot figure out. I am defining variables by creating a getter
function within each schema. This getter function is then called by the
dynamic SQL function. But this works once, and then the value seems to
persist.

```SQL
CREATE SCHEMA var1;
CREATE SCHEMA var2;

SET search_path = public;

/*This function generates dynamic SQL, here I have it just returning a
string
with the current schema and the value from the getter function.*/
DROP FUNCTION IF EXISTS sql_dynamic();
CREATE FUNCTION sql_dynamic() RETURNS text AS $function$
DECLARE
sql TEXT := '';
BEGIN
sql := current_schema() || ',' || get_var();
RETURN sql;
END;
$function$ LANGUAGE plpgsql;

SET search_path = var1, public;

SELECT get_var(); --Fails
SELECT sql_dynamic(); --Fails

DROP FUNCTION IF EXISTS get_var();
CREATE FUNCTION get_var() RETURNS text AS $get_var$
BEGIN
RETURN 'var1';
END;
$get_var$ LANGUAGE plpgsql;

SELECT get_var();
SELECT sql_dynamic();

SET search_path = var2, public;

SELECT get_var(); --Fails
SELECT sql_dynamic(); --Returns current_schema and *old* get_var() value
from wrong schema!

DROP FUNCTION IF EXISTS get_var();
CREATE FUNCTION get_var() RETURNS text AS $get_var$
BEGIN
RETURN 'var2';
END;
$get_var$ LANGUAGE plpgsql;

SELECT get_var(); --Succeeds
SELECT sql_dynamic(); --Still returns *old* get_var() value from wrong
schema!

```

At this point, if I DROP/CREATE sql_dynamic() in public, get_var() run in
the var2 schema works, but if I change the search_path back to var1,
sql_dynamic() returns var1,var2.

I also tried using a table to store the variable. I created a table var
(with one field also named var) in each schema, then altered sql_dynamic()
to return current_schema() and the value of var.var (unqualified, so that
expected when search_path includes var1 it would return var1.var.var), but
I ran into the same persistence problem. Once initialized in one schema,
changing search_path to the other schema returns the correct current_schema
but the value from the table in the *other* schema (e.g. var2,var1).

What am I missing?

Thanks,
--Lee

-- 
Lee Hachadoorian
Asst Professor of Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu/


[GENERAL] Check table storage parameters

2012-11-16 Thread Lee Hachadoorian
How can I read the current storage parameters for an existing table?
Specifically interested in autovacuum_enabled.

Sorry to ask such basic question, but I can't find this in the docs,
and every search I've tried ends up taking me to how to *set* the
parameter with CREATE TABLE and ALTER TABLE.

Thanks,
--Lee

-- 
Lee Hachadoorian
Asst Professor of Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu/


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Check table storage parameters

2012-11-16 Thread Lee Hachadoorian
 On Fri, Nov 16, 2012 at 10:11 AM, Lee Hachadoorian
 lee.hachadooria...@gmail.com wrote:

 How can I read the current storage parameters for an existing table?
 Specifically interested in autovacuum_enabled.

On Fri, Nov 16, 2012 at 12:08 PM, Mike Blackwell mike.blackw...@rrd.com wrote:
 Try pg_class.reloptions?

That was it.

Interestingly, if the table uses the server default, reloptions is NULL.
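
For anyone searching the archives later, the query is just (table name is a
placeholder):

SELECT relname, reloptions
FROM pg_class
WHERE relname = 'my_table';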

Best,
--Lee

-- 
Lee Hachadoorian
Asst Professor of Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu/


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_wrapper error

2012-10-26 Thread Lee Hachadoorian
On Fri, Oct 26, 2012 at 9:37 AM, Kevin Grittner kgri...@mail.com wrote:
 José Pedro Santos wrote:

 When I try to use the command line tool shp2pgsql in the shell I
 have this error:

 Error: pg_wrapper: invalid command name

 I already read some information in the Debian lists but I don't
 understand the problem/relation with PostgreSQL.

 shp2pgsql converts ESRI Shape files into SQL suitable for insertion
 into a PostGIS/PostgreSQL database. You are not using PostgreSQL
 until you take the resulting file and run it against the database
 using some client application, like psql. pg_wrapper is an Ubuntu
 helper program that runs a PostgreSQL client program like psql
 after looking up which version to use for the requested database
 cluster.

 http://manpages.ubuntu.com/manpages/lucid/man1/shp2pgsql.1.html

 http://manpages.ubuntu.com/manpages/lucid/man1/pg_wrapper.1.html

 It appears that you don't have pg_wrapper installed, but are trying
 to use it (either directly or by running something which tries to use
 it).

 -Kevin

No, pg_wrapper is installed. If it weren't, you would get

pg_wrapper: command not found

The issue is that it seems raster2pgsql and shp2pgsql are not
installed. You cannot enable them by making them each a link to
pg_wrapper.

You might want to bring this back to the PostGIS list, because I think
this is a problem of an incorrect PostGIS installation.

--Lee

-- 
Lee Hachadoorian
Asst Professor of Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu/


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] dblink.sql and Linux

2012-05-15 Thread Lee Hachadoorian

  
  

On 05/14/2012 09:19 PM, Mike Christensen wrote:

  I just installed Ubuntu 12.04 today.  Postgres was not listed in the
Ubuntu Software Center, so I downloaded the apt installer from:

http://www.openscg.com/se/oscg_home_download.jsp

Mike

On Mon, May 14, 2012 at 6:10 PM, Devrim GÜNDÜZ dev...@gunduz.org wrote:

  
Hi,

On Mon, 2012-05-14 at 18:05 -0700, Mike Christensen wrote:


  I've become a big fan of DBLink lately, but I'm curious where it lives
on Linux installations.



Which Linux? Which package/installer?

It mostly ships with the -contrib package.


  

Full list of Ubuntu packages is available at packages.ubuntu.com.
Postgres 9.1 (default) and 8.4 are in the repository. The Ubuntu
Software Center hides stuff from you. If you search for postgres,
you won't find it unless you click a not-very-obvious "Show [x]
technical items" link at the bottom of the window. I would recommend
using apt (command line) or Synaptic (graphical) instead of Software
Center.
sudo apt-get install postgresql postgresql-contrib

should get you started.

--Lee
    -- 
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu

  



[GENERAL] Multiple COPY statements

2012-05-10 Thread Lee Hachadoorian
Does anyone have experience or advice on how to efficiently issue a
large number of COPY statements? The data (US Census) comes in > 100
segments (each will be copied to its own database tables) for each
state (51), for a total of > 5000 text files. I can generate the COPY
statements with a script.

The two specific question I can think of (but I'm sure there's more
that I'm not thinking of) are:

1) COPY is fastest when used within the same transaction as an
earlier CREATE TABLE or TRUNCATE command. In such cases no WAL needs
to be written, because in case of an error, the files containing the
newly loaded data will be removed anyway. Would I be able to take
advantage of this if I:

BEGIN;
TRUNCATE import_table;
COPY import_table FROM 'file1';
COPY import_table FROM 'file2';
...
COPY import_table FROM 'file51';
END;

2) Is there a performance hit to doing a COPY to more than one table
in the same transaction?

Any other advice will be appreciated.

Regards,
--Lee

-- 
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Multiple COPY statements

2012-05-10 Thread Lee Hachadoorian
On Thu, May 10, 2012 at 2:42 PM, Andy Colson a...@squeakycode.net wrote:
 On 5/10/2012 1:10 PM, Lee Hachadoorian wrote:

 2) Is there a performance hit to doing a COPY to more than one table
 in the same transaction?


 No, I don't think so.  I assume you are the only user hitting the
 import_table, so holding one big transaction won't hurt anything.

Actually what I mean is that there are multiple import tables,
import_table1 ... import_table100. But it is true that I would be the
only user hitting the import tables.

 Any other advice will be appreciated.


 To really speed it up, you'd need to run multiple concurrent connections
 each doing COPY's.  Maybe up to the number of cores you have.  (of course
 you don't want each connection to fire off truncates, but concurrent should
 trump "skip WAL" in terms of speed).

 If import_table is just a temp holding spot you can look into temp and/or
 unlogged tables.

Yes, it is a staging table; data needs to be manipulated before
shunting to its desired destination. I think unlogged tables will be
helpful, and if I understand correctly then I wouldn't need to use the
BEGIN; TRUNCATE; COPY...; END; trick. And would unlogged + concurrent
connections work together?
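
For what it's worth, a sketch of that combination (table and file names are
made up): each staging table is UNLOGGED and each concurrent connection loads
its own table, so the two ideas don't conflict.

-- UNLOGGED skips WAL entirely, so the BEGIN/TRUNCATE/COPY trick isn't needed;
-- the trade-off is that the table is emptied automatically after a crash.
CREATE UNLOGGED TABLE import_table1 (
    logrecno integer,
    field1   bigint,
    field2   bigint
);

COPY import_table1 FROM '/path/to/state01_seq0001.txt' WITH (FORMAT csv);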

--Lee

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] huge price database question..

2012-03-21 Thread Lee Hachadoorian
On Tue, Mar 20, 2012 at 11:28 PM, Jim Green
student.northwest...@gmail.com wrote:

 On 20 March 2012 22:57, John R Pierce pie...@hogranch.com wrote:

  avg() in the database is going to be a lot faster than copying the data
 into
  memory for an application to process.

 I see..


As an example, I ran average on a 700,000 row table with 231 census
variables reported by state. Running average on all 231 columns grouping by
state inside Postgres beat running it in R by a factor of 130, NOT COUNTING
an additional minute or so to pull the table from Postgres to R. To be
fair, these numbers are not strictly comparable, because it's running on
different hardware. But the setup is not atypical: Postgres is running on a
heavy hitting server while R is running on my desktop.

SELECT state, avg(col1), avg(col2), [...] avg(col231)
FROM some_table
GROUP BY state;

5741 ms

aggregate(dfSomeTable, by = list(dfSomeTable$state), FUN = mean, na.rm =
TRUE)

754746 ms

--Lee


Re: [GENERAL] huge price database question..

2012-03-21 Thread Lee Hachadoorian
On Wed, Mar 21, 2012 at 12:57 PM, Andy Colson a...@squeakycode.net wrote:

 On 3/21/2012 11:45 AM, Lee Hachadoorian wrote:



On 20 March 2012 22:57, John R Pierce pie...@hogranch.com
mailto:pie...@hogranch.com wrote:

  avg() in the database is going to be a lot faster than copying
the data into
  memory for an application to process.


 As an example, I ran average on a 700,000 row table with 231 census
 variables reported by state. Running average on all 231 columns grouping
  by state inside Postgres beat running it in R by a factor of 130, NOT
 COUNTING an additional minute or so to pull the table from Postgres to
 R. To be fair, these numbers are not strictly comparable, because it's
 running on different hardware. But the setup is not atypical: Postgres
 is running on a heavy hitting server while R is running on my desktop.

 SELECT state, avg(col1), avg(col2), [...] avg(col231)
 FROM some_table
 GROUP BY state;

 5741 ms

 aggregate(dfSomeTable, by = list(dfSomeTable$state), FUN = mean, na.rm =
 TRUE)

 754746 ms

 --Lee


 avg() might not be a good example though.  If you just want average,
 great, but what about when you want to do lots of different stats'y
 functions that PG does not have?  I'll bet R (not pl-R) can read a text
 file (a csv probably) pretty fast.


Don't get me wrong. I DO use R for all kinds of stats and graphics stuff. I
just tested avg() right now to test the claim made above.

I did some garbage collection and reran the aggregate() on half the data
and this time it only took 80 seconds. Much faster, though still an order
of magnitude slower than running it inside Postgres. The 700,000 row data
frame in an unclean R session swamped the 6 GB RAM on my desktop.
Maybe the original test was unfair. On the other hand, it does directly
speak to John's comment that "avg() in the database is going to be a lot
faster than copying the data into memory for an application to process."

If you were doing a really complex analysis I bet dumping a dataset out to
 csv first and then running R scripts over it would be fast and useful.


Interesting:

COPY statement (same table, 700,000 rows): 7761 ms
SFTP file copy from Postgres server to local machine: 21 seconds (didn't
time it and my FTP client didn't report it)
read.csv (R command): 62 seconds

Total ~ 91 seconds

R commands:
library(RPostgreSQL)
# declare the connection object conn here
dfSomeTable = dbGetQuery(conn, "SELECT * FROM some_table")

~ 85 seconds

So whether you export to CSV first or use RPostgreSQL, it's roughly
equivalent, though I would tend to favor RPostgreSQL just because it's one
step instead of three.

On the other hand:

sql = "SELECT state, avg(col1), [...] avg(col231) FROM
some_table GROUP BY state;"
dfGrouped = dbGetQuery(conn, sql)

~ 5 seconds

So aggregating inside Postgres, then pulling the smaller data set via
RPostgreSQL wins on two counts: speed of aggregation and speed of transfer.




  aggregate(dfSomeTable, by = list(dfSomeTable$state), FUN = mean, na.rm =
  TRUE)

 Is this a Postgres stored proc (pl-R)?  Or is that R itself?  If its plR
 then I wonder if its stepping through the recordset twice.  Depending on
 how the function is written, I can see the function firing off a query, PG
 collects the entire recordset, and hands it back to the function, which
 then iterates thru it again.  (vs. having the function called for each row
 as PG steps thru the recordset only once).


This is run in R. Data frame (dfSomeTable in this code snippet) is already
in R, so no trips to Postgres are taking up time in this process. The real
reason for the inflated time is pointed out above.

--Lee

-- 
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu/


Re: [GENERAL] DDL Triggers

2012-02-15 Thread Lee Hachadoorian
On Wed, Feb 15, 2012 at 12:42 PM, Bob Pawley rjpaw...@shaw.ca wrote:

 I have a table that is generated through ogr2ogr.

 To get ogr2ogr working the way I want, I need to use the -overwrite
 function. If I use the append function information is lost. Something to do
 with the way the switches work.

 Overwrite drops the existing table and also the attached trigger .

 I am attempting to have the dropped table trigger a function as it is
 being populated by insert commands.

 DDL trigger was mentioned as a possibility on the GDAL list.

 Any suggestions will be welcome.

 Bob


Are you sure ogr2ogr is the right tool for what you want to accomplish?
Reading between the lines, it *seems* like you intend to do an append (SQL
INSERT), but ogr2ogr isn't working the way you want, forcing you to use
overwrite (DROP/CREATE TABLE). Trying to use a DDL trigger seems like a
roundabout way to get what you want when the problem is ogr2ogr.

So, I would back up and ask, what are you trying to do, and what
information is being lost using -append?

Also, you mentioned asking this on the GDAL list, did you try the PostGIS
list?

--Lee

-- 
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu/


Re: [GENERAL] JOIN column maximum

2012-01-06 Thread Lee Hachadoorian
On Thu, Jan 5, 2012 at 10:19 PM, Darren Duncan dar...@darrenduncan.net
 wrote:


 This all being said, 23K values per row just sounds wrong, and I can't
 imagine any census forms having that many details.

 Do you, by chance, have multiple values of the same type that are in
 different fields, eg telephone_1, telephone_2 or child_1, child_2 etc?  You
 should take any of those and collect them into array-typed fields, or
 separate tables with just telephone or child columns.  Or do you say have a
 set of coordinates in separate fields?  Or you may have other kinds of
 redundancy within single rows that are best normalized into separate rows.

 With 23K values, these probably have many mutual associations, and you
 could split that table into a bunch of other ones where columns that relate
 more closely together are collected.

 My question is already answered, so this is mostly for anyone curious
about Census data, and if anyone wants to add to this, feel free.

You're right that no census form has that many details. The American
Community Survey replaced the old Census long form, so it does have more
details than the form sent out for the 2010 Census, but not 23,000
questions. It might ask, e.g. income, sex, and how you travel to work. But
the forms are private, so that data (the so-called microdata) is not
released in its entirety. What I am working with is called the summary
file, which presents the data in aggregate. That means you have an income
table with 16 income classes, plus a total column. Then you have 9 more
tables which show the same income classes by 9 racial and ethnic
categories, for a total of 153 more columns. Then you also have a table
which crosses 9 income classes by 5 mobility statuses (living in same
house, moved within county, moved within state, etc.) for a total of 55
columns. Then you have income classes crossed with sex, income classes
crossed with mode of transportation to work, sex crossed with mode of
transportation to work, etc. When all is said and done, you have 23,000
columns.

Believe me, I would all love to be working directly with the microdata. All
of these different ways of slicing and dicing the categories are basically
how the Census Bureau provides as much detail as possible without providing
so much detail that privacy would be compromised (i.e., enough information
is available that specific individuals could be identified). That
inevitably leads to a great deal of redundancy in the data, since the same
individuals are just being counted in different groups in different tables.

Given all that, one could still take the data that came from Census and try
to normalize it and organize it, but my main goal given the size of the
dataset is to keep it as similar as possible to the source, so that a
researcher familiar with the Census data can work with our database using
the Census documentation without our having to produce a new set of
documentation. The way I had done that was to store the sequences (which
are merely a data dissemination convenience, and have no relationship to
the logic of the data) in Postgres tables, and make the subject tables
(which do represent logical groupings of data) into views. I'm thinking
about making the sequences into array columns. The subject tables would
still be views.
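
A sketch of what that would look like (all names and array positions are
made up):

-- One table per sequence, with the sequence's variables packed into an array:
CREATE TABLE seq0002 (
    geoid text PRIMARY KEY,
    vals  bigint[]   -- one element per variable in the sequence
);

-- A subject table exposed as a view over slices of that array:
CREATE VIEW b01001 AS
SELECT geoid,
       vals[1]  AS total_population,
       vals[2]  AS male,
       vals[26] AS female
FROM seq0002;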

--Lee

-- 
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu/


[GENERAL] JOIN column maximum

2012-01-05 Thread Lee Hachadoorian
How is the number of columns in a join determined? When I combine somewhere
around 90 tables in a JOIN, the query returns:

ERROR: joins can have at most 32767 columns
SQL state: 54000

I'm sure most people will say "Why the hell are you joining 90 tables."
I've asked this list before for advice on how to work with the
approximately 23,000 column American Community Survey dataset, and based on
previous responses I am trying to combine 117 sequences (basically
vertical partitions of the dataset) into one table using array columns. Of
course, I can build this up by joining a few tables at a time, so the
question is mostly curiosity, but I haven't been able to find this
documented anywhere. Moreover, the 32767 limit doesn't map to any
immediately intuitive transformation of 90, like squaring (which is much
too low) or factorial (which is much too high).

Any insight?

Regards,
--Lee

-- 
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu/


Re: [GENERAL] JOIN column maximum

2012-01-05 Thread Lee Hachadoorian

On 01/05/2012 06:18 PM, Tom Lane wrote:



ERROR: joins can have at most 32767 columns

It's the sum of the number of columns in the base tables.

That makes sense. I totally misunderstood the message to be referring to 
the number of joined columns rather than table columns.



I've asked this list before for advice on how to work with the
approximately 23,000 column American Community Survey dataset,

Are there really 23000 populated values in each row?  I hesitate to
suggest an EAV approach, but it kinda seems like you need to go in that
direction.  You're never going to get decent performance out of a schema
that requires 100-way joins, even if you avoid bumping up against hard
limits.

Many of the smaller geographies, e.g. census tracts, do in fact have
data for the vast majority of the columns. I am trying to combine it all 
into one table to avoid the slowness of multiple JOINs (even though in 
practice I'm never joining all the tables at once). EAV sounds correct 
in terms of normalization, but isn't it usually better performance-wise 
to store write-once/read-many data in a denormalized (i.e. flattened) 
fashion? One of these days I'll have to try to benchmark some different 
approaches, but for now planning on using array columns, with each 
sequence (in the Census sense, not the Postgres sense) of 200+ 
variables in its own array rather than its own table.


--Lee

--
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Significant Digits in Floating Point Datatype

2011-11-20 Thread Lee Hachadoorian
I would like to store in a single array some data that is
conceptually related, but some of the data is floating point, and some 
of it is integer. Obviously the integer data *can* be stored as double 
precision, but I need to know about potential loss of precision. Double 
precision has a precision of at least 15 digits. I would assume that 
that would mean that for double precision, 15 digits of data would be 
faithfully preserved. But I found a question on the list where a value 
stored as 955.60 sometimes returns as 955.5998. 
(http://archives.postgresql.org/pgsql-general/2011-08/msg00144.php) If 
this is the case, what does a precision of at least [x] digits 
actually mean? And can I reliably retrieve the original integer by 
casting to int (or bigint) if the number of digits in the original 
integer is less than 15?


Regards,
--Lee

--
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Significant Digits in Floating Point Datatype

2011-11-20 Thread Lee Hachadoorian

On 11/20/2011 02:06 PM, Tom Lane wrote:

Lee Hachadoorian <lee.hachadoor...@gmail.com> writes:

And can I reliably retrieve the original integer by
casting to int (or bigint) if the number of digits in the original
integer is less than 15?

On IEEE-floating-point machines, I'd expect float8 to store integers
up to 2^52 (or maybe it's 2^53) exactly.  With other floating-point
formats the limit might be different, but it should still be exact for
reasonable-size integers.  This has nothing whatever to do with whether
decimal fractions are reproduced exactly (in general, they aren't, no
matter how many or few digits are involved).  So integers are fine,
bigints not so much.

regards, tom lane


Thank you, that clarification is extremely useful.  --Lee
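
A quick way to see the boundary Tom describes (a sketch; the values are chosen
purely for illustration):

-- 2^53 survives the round trip through float8; 2^53 + 1 is not representable
-- and silently rounds back down to 2^53.
SELECT (2::float8 ^ 53)::bigint       AS exact_at_2_53,
       ((2::float8 ^ 53) + 1)::bigint AS rounds_back_down;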

--
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large Rows

2011-10-26 Thread Lee Hachadoorian



On 10/26/2011 12:31 AM, David Johnston wrote:

On Oct 25, 2011, at 22:17, Lee Hachadoorian <lee.hachadoor...@gmail.com> wrote:


I need some advice on storing/retrieving data in large rows. Invariably someone 
points out that very long rows are probably poorly normalized, but I have to 
deal with how to store a dataset which cannot be changed, specifically the 
~23,000 column US Census American Community Survey.

The Census releases these data in 117 sequences of < 256 columns (in order to
be read by spreadsheet applications with a 256 column limit). I have previously stored each 
sequence in its own table, which is pretty straightforward.

My problem is that some of the demographic researchers I work with want a 
one-table dump of the entire dataset. This would primarily be for data 
transfer. This is of limited actual use in analysis, but nonetheless, that's 
what we want to be able to do.

Now, I can't join all the sequences in one SQL query for export because of the 
1600 column limit. So based on previous list activity (Tom Lane: Perhaps you 
could collapse multiple similar columns into an array column? 
http://archives.postgresql.org/pgsql-admin/2008-05/msg00211.php), I decided to 
try to combine all the sequences into one table using array columns. (This 
would actually make querying easier since the users wouldn't have to constantly 
JOIN the sequences in their queries.) Next problem: I run into the 8k row size 
limit once about half the columns are populated. As far as I can understand, 
even though a row theoretically supports a 1.6TB (!) row size, this only works 
for TOASTable data types (primarily text?). The vast majority of the 23k 
columns I'm storing are bigint.

Questions:

1) Is there any way to solve problem 1, which is to export the 23k columns from 
the database as it is, with 117 linked tables?
2) Is there any way to store the data all in one row? If numeric types are 
un-TOASTable, 23k columns will necessarily break the 8k limit even if they were 
all smallint, correct?

Regards,
--Lee

--
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

You can brute-force a 23k column CSV output file using a programming language 
but if you need to keep it in a database the fact we are talking about being 
over the numeric column limit by a factor of twenty means you are basically SOL 
with PostgreSQL.

Even if such a table were possible how it, in its entirety, would be useful is
beyond me.

It's not, as no one would ever analyze all the variables at once. Doing
this with a programming language is probably the way to go. But am I 
correct that using arrays to reduce the number of columns won't work 
because numeric data types aren't TOASTable?

There are few things that cannot be changed, and this requirement is unlikely 
to be one of those things.  Your problems are more political than technical and 
those are hard to provide advice for in an e-mail.

If you need technical solutions there may be another tool out there that can 
get you what you want but stock PostgreSQL isn't going to cut it.

Not having any idea what those 23k columns are doesn't help either; the census 
questionnaire isn't that big...

The vast majority of the columns represent population counts. Sometimes
it might represent a dollar amount (income or contract rent, for 
example). While a sample of individual questionnaires is released (the 
microdata), this question concerns the summary files, where the 
individual answers are categorized/bucketed and aggregated by various 
geographies. So a cell might represent number of people in a county 
(row) who commuted to work by bicycle (column). The number of rows grows 
when various categories are crossed with each other. Table B08519 - 
MEANS OF TRANSPORTATION TO WORK BY WORKERS' EARNINGS IN THE PAST 12 
MONTHS contains 6 transportation modes crossed by 8 income classes, for 
63 columns once subtotals are added. The complete list of variables is 
available at 
http://www2.census.gov/acs2009_5yr/summaryfile/Sequence_Number_and_Table_Number_Lookup.xls.



Instead of giving them what they think they want talk to them and then try to 
provide them what they actually need given the limitations of your current 
toolset, or resolve to find a more suitable tool if the needs are valid but 
cannot be met with the existing tools.

David J.

Regards,
--Lee

--
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large Rows

2011-10-26 Thread Lee Hachadoorian
On Wed, Oct 26, 2011 at 10:15 AM, Simon Riggs si...@2ndquadrant.com wrote:

 On Wed, Oct 26, 2011 at 2:57 PM, Lee Hachadoorian
 lee.hachadoor...@gmail.com wrote:

  Interesting. Although your example of one, 10-dimension array works,
  five hundred 2-dimension arrays does not work. I can do the SELECT, but
 the
  CREATE TABLE fails:
 
  ERROR: row is too big: size 9024, maximum size 8160
  SQL state: 54000
 
  David has already hit the nail on the head in terms of this being a
  political problem rather than a technology problem. I'm open to ideas,
 but
  I realize there might be no other answer than No one in their right mind
  should do this.

 No, this is a technology problem.

 Toast pointers are 20 bytes per column, so with 500 columns that is
  10,000 bytes - which will not fit in one block.

 If you wish to fit this in then you should use a 2 dimensional array,
 which will then be just 1 column and your data will fit.

 --
  Simon Riggs   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


Very useful to know. Thank you.

--Lee
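
For reference, a sketch of the single 2-D array column Simon suggests (names
are made up; note that Postgres multidimensional arrays must be rectangular,
so shorter sequences would need to be padded, e.g. with NULLs):

CREATE TABLE acs_all (
    geoid text PRIMARY KEY,
    seqs  bigint[][]   -- seqs[sequence_number][variable_position]
);

SELECT seqs[5][12]     -- sequence 5, variable 12
FROM acs_all
WHERE geoid = 'some_geoid';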

-- 
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu/


[GENERAL] Large Rows

2011-10-25 Thread Lee Hachadoorian
I need some advice on storing/retrieving data in large rows. Invariably 
someone points out that very long rows are probably poorly normalized, 
but I have to deal with how to store a dataset which cannot be changed, 
specifically the ~23,000 column US Census American Community Survey.


The Census releases these data in 117 sequences of < 256 columns (in
order to be read by spreadsheet applications with a 256 column limit). I 
have previously stored each sequence in its own table, which is pretty 
straightforward.


My problem is that some of the demographic researchers I work with want 
a one-table dump of the entire dataset. This would primarily be for data 
transfer. This is of limited actual use in analysis, but nonetheless, 
that's what we want to be able to do.


Now, I can't join all the sequences in one SQL query for export because 
of the 1600 column limit. So based on previous list activity (Tom Lane: 
Perhaps you could collapse multiple similar columns into an array 
column? 
http://archives.postgresql.org/pgsql-admin/2008-05/msg00211.php), I 
decided to try to combine all the sequences into one table using array 
columns. (This would actually make querying easier since the users 
wouldn't have to constantly JOIN the sequences in their queries.) Next 
problem: I run into the 8k row size limit once about half the columns 
are populated. As far as I can understand, even though a row 
theoretically supports a 1.6TB (!) row size, this only works for 
TOASTable data types (primarily text?). The vast majority of the 23k 
columns I'm storing are bigint.


Questions:

1) Is there any way to solve problem 1, which is to export the 23k 
columns from the database as it is, with 117 linked tables?
2) Is there any way to store the data all in one row? If numeric types 
are un-TOASTable, 23k columns will necessarily break the 8k limit even 
if they were all smallint, correct?


Regards,
--Lee

--
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] The first dedicated PostgreSQL forum

2010-11-15 Thread Lee Hachadoorian
If anyone's interested, I've started accessing the postgres list through
gmane.org (along with several other mailing lists I subscribe to). It
gives you the choice of reading the list as a threaded archive, a blog,
or through an NNTP newsreader or an RSS feed. Everyone chooses their
preferred interface, the community is not fractured by interface preference.

--Lee

On 11/15/2010 03:13 PM, Joshua D. Drake wrote:
 On Mon, 2010-11-15 at 21:06 +0100, Magnus Hagander wrote:
 
 I do think that the PostgreSQL lists are available there. And other
 than that, they are on markmail.org, Nabble, etc. AFAIK several of
 those allow both reading and posting. Is there actually something
 about these interfaces that people find *missing*, or can we easily
 solve this whole thing by more clearly telling people that these
 options exist?
 
 Mainly I think it is the user interface and the fact that they are
 external. They don't look, act or feel like forums. Shrug. Further they
 aren't part of postgresql.org so nobody knows the level of real support
 they are going to get.
 
 JD
 
 

-- 
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Connection timeouts from pgAdmin

2010-03-10 Thread Lee Hachadoorian
John,

Just wanted to reply that this seems to have been the right track.
Rather than change the firewall settings, our network administrator
was able to set postgres to send a keepalive to the client.
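
For anyone else who hits this: the knobs involved are the tcp_keepalives_*
settings, presumably set in postgresql.conf in our case, though they can also
be set per session (the values below are illustrative only):

SHOW tcp_keepalives_idle;
SET tcp_keepalives_idle = 300;      -- start probing after 5 minutes idle
SET tcp_keepalives_interval = 30;   -- seconds between unanswered probes
SET tcp_keepalives_count = 5;       -- connection considered dead after 5 lost probes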

Thanks,
--Lee

On Thu, Mar 4, 2010 at 5:26 PM, Lee Hachadoorian
lee.hachadoor...@gmail.com wrote:
 On Thu, Mar 4, 2010 at 5:01 PM, John R Pierce pie...@hogranch.com wrote:
 are you running pgadmin and postgres server on the same computer, or on
 different computers?

 Different computers.

 if different computers, is there any sort of connection tracking in between,
 such as a NAT router/gateway?

 15-20 mins sounds a lot like the typical NAT idle connection timeout...

 I will have to ask the network administrator and respond.

 Thanks,
 --Lee

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Connection timeouts from pgAdmin

2010-03-04 Thread Lee Hachadoorian
I use the SQL editor in pgAdmin for all of my database work. The
problem I'm having is that after a period of inactivity, pgAdmin loses
the connection to the server. I'm trying to figure out how to avoid
this happening. Not being a network administrator, I'm wondering if
this is related to the tcp_keepalives settings described at
http://www.postgresql.org/docs/8.4/static/runtime-config-connection.html.
Can someone confirm that I'm looking at the right settings, or is
there something else that might cause clients to lose the connection?
The connection seems to be being lost after ~ 15-20 minutes of
inactivity.

Thanks,
--Lee

-- 
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Connection timeouts from pgAdmin

2010-03-04 Thread Lee Hachadoorian
On Thu, Mar 4, 2010 at 5:01 PM, John R Pierce pie...@hogranch.com wrote:
 are you running pgadmin and postgres server on the same computer, or on
 different computers?

Different computers.

 if different computers, is there any sort of connection tracking in between,
 such as a NAT router/gateway?

 15-20 mins sounds a lot like the typical NAT idle connection timeout...

I will have to ask the network administrator and respond.

Thanks,
--Lee

-- 
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Array columns vs normalized table

2010-03-02 Thread Lee Hachadoorian
I work with state labor data which is reported to us in the form

industry, year, quarter1, quarter2, quarter3, quarter4

where each quarter represents an employment count. Obviously, this can
be normalized to

industry, year, quarter, employment

Can anyone comment on, or point to me to an article or discussion
regarding, why one would use an array column instead of normalizing
the data? That is, would there be any benefit to storing it as

industry int, year smallint, employment int[ ]

where the last column would be a four element array with data for the
four quarters.

Thanks,
--Lee

--
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Array columns vs normalized table

2010-03-02 Thread Lee Hachadoorian
Pavel, the idea of using arrays to store long time-series data sounds
good, but I take your point that normalized tables might be better and
are easier to query. I suppose the choice will be between normalizing
or using the denormalized

industry int,
year    smallint,
emp_q1  int,
emp_q2  int,
emp_q3  int,
emp_q4  int

as suggested by Peter and Scott. We're mostly actually interested in
annual numbers, but need to preserve the quarterly data for
verification and for unusual analyses. So perhaps storing denormalized
with an additional emp_annual int field, and a view that keeps the
quarterly data out of sight.
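
A sketch of that arrangement (table and view names are made up):

-- The base table keeps the quarterly columns plus the derived annual figure;
-- the view exposes only what most queries need.
CREATE VIEW employment_annual AS
SELECT industry, year, emp_annual
FROM employment_by_quarter;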

Thanks for your replies. Please feel free to comment if you think of
anything else.

Best,
--Lee

--
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Client Encoding and Latin characters

2009-11-24 Thread Lee Hachadoorian
My database is encoded UTF8. I recently was uploading (via COPY) some
census data which included place names with ñ, é, ü, and other such
characters. The upload choked on the Latin characters. Following the
docs, I was able to fix this with:

SET CLIENT_ENCODING TO 'LATIN1';
COPY table FROM 'filename';

After which I

SET CLIENT_ENCODING TO 'UTF8';

I typically use COPY FROM to bulk load data. My question is, is there
any disadvantage to setting the default client_encoding as LATIN1? I
expect to never be dealing with Asian languages, or most of the other
LATINx languages. If I ever try to COPY FROM data incompatible with
LATIN1, the command will just choke, and I can pick an appropriate
encoding and try again, right?

Thanks,
--Lee

-- 
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Client Encoding and Latin characters

2009-11-24 Thread Lee Hachadoorian
 Uh, no.  You can pretty much assume that LATIN1 will take any random
 byte string; likewise for any other single-byte encoding.  UTF8 as a
 default is a bit safer because it's significantly more likely that it
 will be able to detect non-UTF8 input.

                        regards, tom lane


So, IIUC, the general approach is:

*Leave the default client_encoding = server_encoding (in this case UTF8)
*Rely on the client to change client_encoding on a session basis only

Thanks,
--Lee

-- 
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] PostgreSQL won't start

2008-03-12 Thread Lee Hachadoorian
Last week I set up Postgres 8.3 on a WindowsXP machine.  Had it up and
running and imported data.  Now when I try to start the server (after
a machine restart) I get the message:


pg_ctl: another server might be running; trying to start server anyway
PANIC:  could not open control file "global/pg_control": Permission denied

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
server starting


In the archives I've seen suggestions to make sure to start using the
postgres account (I am), make sure postgres is configured as a service
(it is), and make sure global/pg_control and the rest of the
PostgreSQL directory has read/write access (it does).  What else can I
try to start the server?

Thanks,
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL won't start

2008-03-12 Thread Lee Hachadoorian
It was far stupider than that.  I had been playing around with a
couple of different data clusters before doing a complete reinstall of
PostgreSQL.  I just realized I was trying to start a cluster that I
was no longer using and the postgres account didn't have appropriate
permissions for.  It's one of those things where once you ask the
question, you realize that the answer is implicit in the question.

But a useful followup question is, how do I make this start itself
when Windows starts?  The service is set up to run as postgres and
execute

"C:\Program Files\PostgreSQL\8.3\bin\pg_ctl.exe" runservice -w -N
"pgsql-8.3" -D "C:\Program Files\PostgreSQL\8.3\data\"

This *is* pointing to the right data cluster (which I'm able to start
successfully from the command line), but it's not starting
automatically (even though it's configured to) and when I try to start
it manually within the Component Services Manager, it generates the
following error:

Error 1069: The service did not start due to a logon failure.

Thanks,
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center

On Wed, Mar 12, 2008 at 2:05 PM, Scott Marlowe [EMAIL PROTECTED] wrote:
 On Wed, Mar 12, 2008 at 10:49 AM, Lee Hachadoorian
  [EMAIL PROTECTED] wrote:

  Last week I set up Postgres 8.3 on a WindowsXP machine.  Had it up and
running and imported data.  Now when I try to start the server (after
a machine restart) I get the message:
  

 
pg_ctl: another server might be running; trying to start server anyway
 PANIC:  could not open control file "global/pg_control": Permission denied

  Or it could just be that you do already have another postmaster up and
  running already.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general