Re: [GENERAL] Recurring events

2011-06-08 Thread Troy Rasiah

On 6/06/2011 8:59, Thomas Guettler wrote:

Hi,

how do you store recurring events in a database?

Selecting all events in a week/month should be fast (comming from an index).

My solution looks like this:

Table event:

Columns: id, name, recurring, start_datetime, end_datetime

recurring is weekly, monthly, yearly or NULL.

end_datetime can be NULL (open end).

Can you create an indexed view with infinite rows? I only want to index
the last three year and the next three years.

An other solution would be to fill a table with serialized events. The 
recurring
events would be created and inserted into a table. This can only be done in a 
time frame
like above (last three year, next three years). If a recurring event gets 
altered,
all its serialized events need to be updated.

Any feedback?


I have had success using the instructions at

https://github.com/bakineggs/recurring_events_for


--
Troy Rasiah


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Best Practices - Securing an Enterprise application using JBOSS Postgres

2011-06-08 Thread eyal edri
What settings would you recommend for using postgres in an enterprise
application together with jboss?

there are numerous auth options (from the documentation):

19.3.1. Trust 
authenticationhttp://www.postgresql.org/docs/8.4/interactive/auth-methods.html#AUTH-TRUST
19.3.2. Password
authenticationhttp://www.postgresql.org/docs/8.4/interactive/auth-methods.html#AUTH-PASSWORD
19.3.3. GSSAPI 
authenticationhttp://www.postgresql.org/docs/8.4/interactive/auth-methods.html#GSSAPI-AUTH
19.3.4. SSPI 
authenticationhttp://www.postgresql.org/docs/8.4/interactive/auth-methods.html#SSPI-AUTH
19.3.5. Kerberos
authenticationhttp://www.postgresql.org/docs/8.4/interactive/auth-methods.html#KERBEROS-AUTH
19.3.6. Ident-based
authenticationhttp://www.postgresql.org/docs/8.4/interactive/auth-methods.html#AUTH-IDENT
19.3.7. LDAP 
authenticationhttp://www.postgresql.org/docs/8.4/interactive/auth-methods.html#AUTH-LDAP
19.3.8. Certificate
authenticationhttp://www.postgresql.org/docs/8.4/interactive/auth-methods.html#AUTH-CERT
19.3.9. PAM 
authenticationhttp://www.postgresql.org/docs/8.4/interactive/auth-methods.html#AUTH-PAM
and i'm not sure that the jdbc driver even supports all of them when jboss
needs to access the db.
currently i've chosen MD5 as the auth, but is that the best option?
thanks
Eyal.


Re: [GENERAL] Best Practices - Securing an Enterprise application using JBOSS Postgres

2011-06-08 Thread John R Pierce

On 06/08/11 12:18 AM, eyal edri wrote:


currently i've chosen MD5 as the auth, but is that the best option?



thats the usual choice for JDBC apps.


--
john r pierceN 37, W 122
santa cruz ca mid-left coast


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Best Practices - Securing an Enterprise application using JBOSS Postgres

2011-06-08 Thread Craig Ringer

On 8/06/2011 3:18 PM, eyal edri wrote:

What settings would you recommend for using postgres in an enterprise
application together with jboss?


Most such applications have the database servers on an isolated network 
only accessible to the app server, not to the wider world. In these 
cases you'd usually limit the IP range(s) the database servers will 
accept connections from, firewall them off, and use a decent auth scheme 
like md5 or Kerberos. I suspect that most configurations use md5 auth 
for simplicity, and it's a reasonable choice.


Kerberos is certainly stronger and should be used if your database 
server and app server are not on the same machine and your network has 
Kerberos infrastructure already deployed. I wouldn't bother rolling out 
Kerberos just for PostgreSQL and PgJDBC.


In smaller configurations the database is often on the same machine as 
the appserver and set to only listen on the loopback address 
(127.0.0.1). In this case md5 auth is more than sufficient.


Because most app servers use a single username and password to connect 
to the database and provide a pool of connections, there isn't much 
advantage to using LDAP or other directory auth schemes. It's really 
intended for situations where you already have a user directory and you 
want users in it to all have direct logins to the database system. In an 
application server you'd usually configure the *app* *server* to auth 
users against LDAP, using fixed credentials unrelated to the logged in 
user for its database connections behind the scenes.


Certificate auth with SSL is useful, but probably not necessary or 
worthwhile for an app server environment.


I'd stick to md5 unless you're already used to Kerberos and have 
Kerberos infrastructure.


--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Best Practices - Securing an Enterprise application using JBOSS Postgres

2011-06-08 Thread Radosław Smogura

On Wed, 8 Jun 2011 10:18:23 +0300, eyal edri wrote:

What settings would you recommend for using postgres in an enterprise
application together with jboss?

there are numerous auth options (from the documentation):

 19.3.1. Trust authentication [1]19.3.2. Password authentication [2]
19.3.3. GSSAPI authentication [3]19.3.4. SSPI authentication [4]
19.3.5. Kerberos authentication [5]19.3.6. Ident-based
authentication [6] 19.3.7. LDAP authentication
[7]19.3.8. Certificate authentication [8] 19.3.9. PAM authentication
[9]
and im not sure that the jdbc driver even supports all of them when
jboss needs to access the db.
currently ive chosen MD5 as the auth, but is that the best option?
thanks
Eyal.

Links:
--
[1]

http://www.postgresql.org/docs/8.4/interactive/auth-methods.html#AUTH-TRUST
[2]

http://www.postgresql.org/docs/8.4/interactive/auth-methods.html#AUTH-PASSWORD
[3]

http://www.postgresql.org/docs/8.4/interactive/auth-methods.html#GSSAPI-AUTH
[4]

http://www.postgresql.org/docs/8.4/interactive/auth-methods.html#SSPI-AUTH
[5]

http://www.postgresql.org/docs/8.4/interactive/auth-methods.html#KERBEROS-AUTH
[6]

http://www.postgresql.org/docs/8.4/interactive/auth-methods.html#AUTH-IDENT
[7]

http://www.postgresql.org/docs/8.4/interactive/auth-methods.html#AUTH-LDAP
[8]

http://www.postgresql.org/docs/8.4/interactive/auth-methods.html#AUTH-CERT
[9] 
http://www.postgresql.org/docs/8.4/interactive/auth-methods.html#AUTH-PAM



It doesn't matter so much. I actually prefer storing user 
names/password in not a system accounts (so password in db, or LDAP - 
simpler to migrate settings). Certificate maybe hard to configure on 
JBoss and other application servers (this includes if something will go 
wrong long time to restore), but I think it's most secure, as your key 
store may be encrypted.


You should actually only consider safty of storing of such passwords in 
database. If with md5 the password isn't digested like in DIGEST HTTP 
auth, and only md5 shortcut is transfferd it has no meaning if you will 
transfer over network clear password or md5 password (ok has if you use 
same password in at least two services both storing password with md5). 
On higher level you may note that MD5 is little bit out-dated and it's 
not considered secure, currently I think only SHA-256 is secure.


If you suspect that someone on your network may sniff password use cert 
auth or kerberos or one of it mutations.


Regards,
Radek

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres or Greenplum

2011-06-08 Thread Leonardo Francalanci
On 07/06/2011 23.52, Tom Lane wrote: 
 Very fast on a very narrow set of use cases ...   

Can you explain a little (if possible)?

Thank you

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] what is the best way of storing text+image documents in postgresql

2011-06-08 Thread Arash pajoohande
hello
i have a lot of files in microsoft word format. each file consists of text
and images (other text formatting like font is not important).
i want to store this documents in Postgresql, and documents must display on
web page when corresponding user requests occurs.

it seems theres 2 way to do that:
1. save .doc documents in bytea columns. and show them with a word reader in
web page (disadvantage: it needs a proper .doc reader installed on user
computer)
2. convert document to html format and store html code as string
(disadvantage: images will remain in file system)

what is your offer? any help highly would be appreciated

thanks in advance
arash


Re: [GENERAL] Recurring events

2011-06-08 Thread hubert depesz lubaczewski
On Mon, Jun 06, 2011 at 12:59:44PM +0200, Thomas Guettler wrote:
 how do you store recurring events in a database?

Check this:
http://www.justatheory.com/computers/databases/postgresql/recurring_events.html

Best regards,

depesz

-- 
The best thing about modern society is how easy it is to avoid contact with it.
 http://depesz.com/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres or Greenplum

2011-06-08 Thread Gabriele Bartolini

Hi Radoslaw,

On Wed, 08 Jun 2011 07:30:33 +0200, Radosław Smogura 
rsmog...@softperience.eu wrote:


But, I think GreenPlum is share nothing, isn't it?


Yes, indeed. In very simple words Greenplum is a parallel processing 
database solution that implements the shared-nothing architecture. One 
master server is responsible for managing data distribution and query 
processing among several segments - usually residing on multiple servers 
(that do not share any physical resource, but the network). The shared 
nothing architecture allows the system to be linearly scalable by adding 
commodity hardware to the cluster.


There is a special version of Greenplum (Greenplum Community Edition) 
that can be used for testing and development (the license allows usage 
in production environments as well under certain circumstances - for 
instance research). Greenplum is owned by EMC and is not open-source.


Greenplum typical usage is for data warehousing purposes, as main 
source for business intelligence and analytics query. If you are 
interested in topics like this, I suggest that you participate to 
Char(11) (www.char11.org), the second edition of the Clustering, High 
Availability and Replication conference.


Cheers,
Gabriele

--
 Gabriele Bartolini - 2ndQuadrant Italia
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it - www.2ndQuadrant.it

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Abnormal long SELECT query under postgresql 9.0.4

2011-06-08 Thread Gaëtan Allart
Hi everyone,

We've just moved our website database from pg 8.4 to pg 9.0 and we found
out a very long query (that wasn't that long under 8.4).
And I actually can't explain why it's taking so much timeŠ



Here it is :

EXPLAIN ANALYZE  SELECT articles_article.id,
articles_article.name, articles_article.post_date FROM
articles_article WHERE (articles_article.permis = true  AND
articles_article.status = 2  AND articles_article.flux_id = 3107
AND NOT (articles_article.id = 784923 )) ORDER BY
articles_article.post_date DESC LIMIT 4;
   
QUERY PLAN 
   
---
---
---
 Limit  (cost=0.00..7737.95 rows=4 width=53) (actual
time=440296.783..448149.439 rows=4 loops=1)
   -  Index Scan Backward using index_articles_post on articles_article
(cost=0.00..7233050.00 rows=3739 width=53) (actual
time=440296.783..448149.437 rows=4 loops=1)
 Filter: (permis AND (id  784923) AND (status = 2) AND (flux_id
= 3107))
 Total runtime: 448149.477 ms
(4 rows)






Then, if I remove the ORDER BY clause, this is much much better :

EXPLAIN ANALYZE  SELECT articles_article.id,
articles_article.name, articles_article.post_date FROM
articles_article WHERE (articles_article.permis = true  AND
articles_article.status = 2  AND articles_article.flux_id = 3107
AND NOT (articles_article.id = 784923 )) LIMIT 4;
QUERY PLAN 
   
---
---
---
 Limit  (cost=0.00..8.25 rows=4 width=53) (actual time=1.472..74.679
rows=4 loops=1)
   -  Index Scan using articles_article_flux_id on articles_article
(cost=0.00..7710.04 rows=3739 width=53) (actual time=1.468..74.672 rows=4
loops=1)
 Index Cond: (flux_id = 3107)
 Filter: (permis AND (id  784923) AND (status = 2))
 Total runtime: 74.779 ms
(5 rows)




Same query under 8.4 :

EXPLAIN ANALYZE SELECT articles_article.id, articles_article.name,
articles_article.post_date FROM articles_article WHERE
(articles_article.permis = true AND articles_article.status = 2
AND articles_article.flux_id = 3107 AND NOT (articles_article.id =
784923 )) ORDER BY articles_article.post_date DESC LIMIT 4;
   QUERY PLAN
   
---
---
-- Limit  (cost=693.11..693.12 rows=4 width=53) (actual
time=12.617..12.619 rows=4 loops=1)   -  Sort  (cost=693.11..693.94
rows=330 width=53) (actual time=12.615..12.616 rows=4 loops=1)
Sort Key: post_date Sort Method:  top-N heapsort  Memory: 25kB
-  Index Scan using articles_article_flux_id on articles_article
(cost=0.00..688.16 rows=330 width=53) (actual time=0.034..12.255 rows=1072
loops=1)
   Index Cond: (flux_id = 3107)
   Filter: (permis AND (id  784923) AND (status = 2))
 Total runtime: 12.672 ms
(8 rows)






Would anyone know where this might come from? 448 secondes for a basic
SELECT sounds amazing oO

Thanks,

Gaëtan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] what is the best way of storing text+image documents in postgresql

2011-06-08 Thread Craig Ringer

On 06/08/2011 06:13 PM, Arash pajoohande wrote:


1. save .doc documents in bytea columns. and show them with a word
reader in web page (disadvantage: it needs a proper .doc reader
installed on user computer)


1a: Convert the .doc files to a standard format like PDF that most 
browsers can display. That's what I'd do.



2. convert document to html format and store html code as string
(disadvantage: images will remain in file system)


You could always store images as 'bytea' fields on a subtable.

CREATE TABLE doc (
  id serial primary key,
  doc_html text,
  ...
);

CREATE TABLE doc_images (
  id serial primary key,
  doc_id integer references doc(id),
  image_data bytea not null
);

There are advantages and disadvantages to storing files in the database 
vs in the file system. It's been discussed to death recently, so see the 
mailing list archives.


--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [bulk] Re: [GENERAL] Non returning Transactions/Many Locks in Postgres 9.0.4 and 9.0.1

2011-06-08 Thread Tarabas
Hello,

the problem just resurfaced and the Wiki page dows not really help
very much.

When i look into the pg_stat_activity, i only see that the connections
all state IDLE in transaction.

Is there any way to find out what the transaction is doing exactly to
be able to debug the Problem?

Best regards
Manuel


Tuesday, June 7, 2011, 5:50:13 PM, you wrote:

T Hi Scott,

SM Snip.  Those are ALL either AccessShareLock (which is very low level
SM and non-blocking) or virtual tx locks, which again don't block
SM anything but their own transaction.  Nothing there screams locks!
SM for a better view of locks and how they're blocking things you can use
SM the queries from here: http://wiki.postgresql.org/wiki/Lock_Monitoring

T Thank you, I will try that. I am currently using Hibernate 3.2 as a
T Database-Layer but with an older JDBC for 8.2, i will try updating
T these components to more recent versions.

T If the problem arises again i will hopefully be able to better
T analyze, where the locking in my application occurs. It is in fact a
T multi-threaded environment.

T Best regards
T Manuel



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [bulk] Re: [GENERAL] Non returning Transactions/Many Locks in Postgres 9.0.4 and 9.0.1

2011-06-08 Thread Merlin Moncure
On Wed, Jun 8, 2011 at 9:32 AM, Tarabas tara...@tarabas.de wrote:
 Hello,

 the problem just resurfaced and the Wiki page dows not really help
 very much.

 When i look into the pg_stat_activity, i only see that the connections
 all state IDLE in transaction.

 Is there any way to find out what the transaction is doing exactly to
 be able to debug the Problem?

The transactions aren't doing anything.  Your applciation began
transactions in your application but didn't commit them.  Classic
causes of this are:
1) straight up bugs (code branches that do not commit)
2) multi threaded code, especially when threads share connections
(basically a factory for #1 above)
3) busted connection poolers (php_pconnect)

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Abnormal long SELECT query under postgresql 9.0.4

2011-06-08 Thread Radosław Smogura
Gaëtan Allart gae...@nexylan.com Wednesday 08 of June 2011 14:59:05
 Hi everyone,
 
 We've just moved our website database from pg 8.4 to pg 9.0 and we found
 out a very long query (that wasn't that long under 8.4).
 And I actually can't explain why it's taking so much timeŠ
 
 
 
 Here it is :
 
 EXPLAIN ANALYZE  SELECT articles_article.id,
 articles_article.name, articles_article.post_date FROM
 articles_article WHERE (articles_article.permis = true  AND
 articles_article.status = 2  AND articles_article.flux_id = 3107
 AND NOT (articles_article.id = 784923 )) ORDER BY
 articles_article.post_date DESC LIMIT 4;
 
 QUERY PLAN
 
 ---
 ---
 ---
  Limit  (cost=0.00..7737.95 rows=4 width=53) (actual
 time=440296.783..448149.439 rows=4 loops=1)
-  Index Scan Backward using index_articles_post on articles_article
 (cost=0.00..7233050.00 rows=3739 width=53) (actual
 time=440296.783..448149.437 rows=4 loops=1)
  Filter: (permis AND (id  784923) AND (status = 2) AND (flux_id
 = 3107))
  Total runtime: 448149.477 ms
 (4 rows)
 
 
 
 
 
 
 Then, if I remove the ORDER BY clause, this is much much better :
 
 EXPLAIN ANALYZE  SELECT articles_article.id,
 articles_article.name, articles_article.post_date FROM
 articles_article WHERE (articles_article.permis = true  AND
 articles_article.status = 2  AND articles_article.flux_id = 3107
 AND NOT (articles_article.id = 784923 )) LIMIT 4;
 QUERY PLAN
 
 ---
 ---
 ---
  Limit  (cost=0.00..8.25 rows=4 width=53) (actual time=1.472..74.679
 rows=4 loops=1)
-  Index Scan using articles_article_flux_id on articles_article
 (cost=0.00..7710.04 rows=3739 width=53) (actual time=1.468..74.672 rows=4
 loops=1)
  Index Cond: (flux_id = 3107)
  Filter: (permis AND (id  784923) AND (status = 2))
  Total runtime: 74.779 ms
 (5 rows)
 
 
 
 
 Same query under 8.4 :
 
 EXPLAIN ANALYZE SELECT articles_article.id, articles_article.name,
 articles_article.post_date FROM articles_article WHERE
 (articles_article.permis = true AND articles_article.status = 2
 AND articles_article.flux_id = 3107 AND NOT (articles_article.id =
 784923 )) ORDER BY articles_article.post_date DESC LIMIT 4;
QUERY PLAN
 
 ---
 ---
 -- Limit  (cost=693.11..693.12 rows=4 width=53) (actual
 time=12.617..12.619 rows=4 loops=1)   -  Sort  (cost=693.11..693.94
 rows=330 width=53) (actual time=12.615..12.616 rows=4 loops=1)
 Sort Key: post_date Sort Method:  top-N heapsort  Memory: 25kB
 -  Index Scan using articles_article_flux_id on articles_article
 (cost=0.00..688.16 rows=330 width=53) (actual time=0.034..12.255 rows=1072
 loops=1)
Index Cond: (flux_id = 3107)
Filter: (permis AND (id  784923) AND (status = 2))
  Total runtime: 12.672 ms
 (8 rows)
 
 
 
 
 
 
 Would anyone know where this might come from? 448 secondes for a basic
 SELECT sounds amazing oO
 
 Thanks,
 
 Gaëtan

Did You ran analyze or vacuum after migration?

Best regards,
Radek

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL]

2011-06-08 Thread Callum Scott
http://alkiosco.com/lindex02.html

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Best Practices - Securing an Enterprise application using JBOSS Postgres

2011-06-08 Thread Isak Hansen
On Wed, Jun 8, 2011 at 11:43 AM, Radosław Smogura
rsmog...@softperience.eu wrote:

 You should actually only consider safty of storing of such passwords in
 database. If with md5 the password isn't digested like in DIGEST HTTP auth,
 and only md5 shortcut is transfferd it has no meaning if you will transfer
 over network clear password or md5 password (ok has if you use same password
 in at least two services both storing password with md5). On higher level
 you may note that MD5 is little bit out-dated and it's not considered
 secure, currently I think only SHA-256 is secure.

 If you suspect that someone on your network may sniff password use cert auth
 or kerberos or one of it mutations.

While MD5 is considered broken for certain applications, it's still
perfectly valid for auth purposes.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Converting uuid primary key column to serial int

2011-06-08 Thread Mike Christensen
Hi all -

For most of my database I use UUIDs as primary keys because, well, I
just like it better and like being able to generate a key in the
middle tier when I create new data.  However, I have one table that
has a very fixed and immutable set of data with a few thousand
ingredients in it.  This set very rarely changes (in fact when I
change it, I have to restart the whole website)..  The UUIDs for each
ingredient are kinda a hassle to manage though, and I'm thinking in
this case I might just want to use a 32bit integer for the primary
key..

Almost every other table in the database has FK relationships with
these ingredients, and there's all sorts of JOINs against it.  So,
making this change is gonna be fairly costly and require all sorts of
schema changes to everything.  I have a few related questions on this
topic..

1) What's the best way to convert an existing table from UUIDs to a
serial INT column?  I want to have Postgres just assign sequential
numbers starting at 1 to all the rows.

2) Once I do this, I obviously need to fix up all the foreign keys and
convert them from the old UUIDs to the new generated numeric value.
Is there an easy way to manage this sort of thing?  I'm thinking about
keeping the old UUID as a non-PK column to use as a mapping as I
migrate the rest of the schema over.  Is that a good approach?

3) I'm also worried about backup/restores/replication.  I often add
new ingredients to my production database, and every so often backup
this database and restore it to my dev box so I can work with
production data.  I need to make sure all these ingredient IDs stay
constant, just as the UUIDs did.  Does pg_dump generate SQL that will
re-create the existing keys and set the identity watermark
correctly?  If someday I have replication enabled, do the same IDs get
replicated consistantly?  Any other gotchas here?

Mike

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] what is the best way of storing text+image documents in postgresql

2011-06-08 Thread John R Pierce

On 06/08/11 6:06 AM, Craig Ringer wrote:

1. save .doc documents in bytea columns. and show them with a word
reader in web page (disadvantage: it needs a proper .doc reader
installed on user computer)


1a: Convert the .doc files to a standard format like PDF that most 
browsers can display. That's what I'd do. 


thats harder to integrate with a website in the sense that the PDF 
documents are hard page formatted, and can at best be displayed in an 
iframe within your site, and half the time, only displayed in an 
external PDF viewer since browser-pdf integration remains flakey and 
bugridden after all these years.   PDF text won't flow to fit your page 
layout, etc etc.


one approach to conversion might be to save the documents as an RTF type 
format, and run that through a preprocessor that reencodes them as a 
clean HTML or similar metalanguage that you can deal with intelligently. 
   MS Word's own HTML converter creates wretched HTML with tons of 
extra bizarro-world tags which likely will trip up your page formatting 
if you display these in context in your pages.


as Craig suggested, images can be stored as bytea objects, and the image 
links converted to point to a CGI that fetches them from the database 
and presents them to the client browser.


--
john r pierceN 37, W 122
santa cruz ca mid-left coast


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Converting uuid primary key column to serial int

2011-06-08 Thread David Johnston
 -Original Message-
 From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-
 ow...@postgresql.org] On Behalf Of Mike Christensen
 Sent: Wednesday, June 08, 2011 2:57 PM
 To: pgsql-general@postgresql.org
 Subject: [GENERAL] Converting uuid primary key column to serial int

 for each ingredient are kinda a hassle to manage though, and I'm thinking
in
 this case I might just want to use a 32bit integer for the primary key..
 
 2) Once I do this, I obviously need to fix up all the foreign keys and
convert
 them from the old UUIDs to the new generated numeric value.
 Is there an easy way to manage this sort of thing?  I'm thinking about
keeping
 the old UUID as a non-PK column to use as a mapping as I migrate the
rest
 of the schema over.  Is that a good approach?
 

What specific hassles are you encountering?  Seems like a lot of effort to
go through to make the data less-unique.  Since you already have the field
setup for PK/FK use try to leave that alone and just add a secondary
identifier that you use for queries.  Like you said in #2 leave the UUID in
place as a non-PK (but still unique/not-null) field and leave all the
foreign keys in place as well.  Add your integer identity column as the
new Primary Key and use that field when you want to specify a record.  Then,
for those cases where you have or want to use the integer PK but the field
being compared is the UUID you can write a simple function to return the
UUID associated with the given integer.

Otherwise you probably should just add the integer field to EVERY TABLE and
establish relational links.  Add the field as null-able, perform the update
using the UUID, change to not null, add FK constraint.  As you then attempt
to remove the UUID field from the tables the system will tell you where
different function and views are relying upon the UUID field and you can
change the object to use the new integer field instead.

The main risk really is in the application since the database will not be
able to enforce consistency.

Serial is implemented via DEFAULT and sequences; and if you dump/restore a
sequence it is done consistently.

David J.










-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Converting uuid primary key column to serial int

2011-06-08 Thread Mike Christensen
On Wed, Jun 8, 2011 at 1:06 PM, David Johnston pol...@yahoo.com wrote:
 -Original Message-
 From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-
 ow...@postgresql.org] On Behalf Of Mike Christensen
 Sent: Wednesday, June 08, 2011 2:57 PM
 To: pgsql-general@postgresql.org
 Subject: [GENERAL] Converting uuid primary key column to serial int

 for each ingredient are kinda a hassle to manage though, and I'm thinking
 in
 this case I might just want to use a 32bit integer for the primary key..

 2) Once I do this, I obviously need to fix up all the foreign keys and
 convert
 them from the old UUIDs to the new generated numeric value.
 Is there an easy way to manage this sort of thing?  I'm thinking about
 keeping
 the old UUID as a non-PK column to use as a mapping as I migrate the
 rest
 of the schema over.  Is that a good approach?


 What specific hassles are you encountering?  Seems like a lot of effort to
 go through to make the data less-unique.  Since you already have the field
 setup for PK/FK use try to leave that alone and just add a secondary
 identifier that you use for queries.  Like you said in #2 leave the UUID in
 place as a non-PK (but still unique/not-null) field and leave all the
 foreign keys in place as well.  Add your integer identity column as the
 new Primary Key and use that field when you want to specify a record.  Then,
 for those cases where you have or want to use the integer PK but the field
 being compared is the UUID you can write a simple function to return the
 UUID associated with the given integer.

 Otherwise you probably should just add the integer field to EVERY TABLE and
 establish relational links.  Add the field as null-able, perform the update
 using the UUID, change to not null, add FK constraint.  As you then attempt
 to remove the UUID field from the tables the system will tell you where
 different function and views are relying upon the UUID field and you can
 change the object to use the new integer field instead.

 The main risk really is in the application since the database will not be
 able to enforce consistency.

 Serial is implemented via DEFAULT and sequences; and if you dump/restore a
 sequence it is done consistently.

All very good points.  A few reasons why I'm considering this:

1) On some pages, the user can pass in a search query (keywords to
find, ingredients to exclude, etc)..  I have a Base64 representation
of this search query, and it gets pretty long if a bunch of UUIDs are
included.  Making them ints would make for much shorter URLs.  There's
a few other pages that have an ingredient ID URL parameter directly.
I think URLs with UUIDs are kinda ugly.

2) I deal with a lot of this data through internal web based admin
tools as well as Excel.  There's a lot of places where I need to link
some metadata to an existing ingredient.  I'd like to just be able to
refer to eggs as, say, 53, rather than some UUID for eggs.  Sure,
the tools could have auto-complete and auto-lookup stuff but I haven't
had much time to work on these tools so they're super hacky right now.

Both of these requirements, as you said, could be satisfied by using a
non-primary key though.  In fact, there's then no reason to change any
other tables - internally the database would link to the ingredient
UUIDs.

I'm assuming I can still have a Serial column that is NOT a primary
key, and it'll incremement just the same as I add rows?  If that's the
case, I think that's a superior approach..

BTW, this table is too small to worry about disk space of UUIDs and/or
perhaps any sort of performance benefits to using int over uuid (if
there are any)..

Mike

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Converting uuid primary key column to serial int

2011-06-08 Thread David Johnston
 -Original Message-
 From: Mike Christensen [mailto:m...@kitchenpc.com]
 Sent: Wednesday, June 08, 2011 4:26 PM
 To: David Johnston; pgsql-general@postgresql.org
 Subject: Re: [GENERAL] Converting uuid primary key column to serial int


 I'm assuming I can still have a Serial column that is NOT a primary key,
and
 it'll incremement just the same as I add rows?  If that's the case, I
think that's
 a superior approach..
 
 BTW, this table is too small to worry about disk space of UUIDs and/or
 perhaps any sort of performance benefits to using int over uuid (if there
are
 any)..
 
 Mike

 CREATE TABLE t ( field serial );  is simply short-hand for  CREATE TABLE
t (field integer DEFAULT nextval('t_seq') );  where the sequence t_seq is
automatically created and linked to the table.

Nothing more and nothing less.  Whether you add NOT NULL or, by extension,
PRIMARY KEY, to the field as well as the serial datatype depends on
whether you want to have those other properties.

Have you considered giving the row for eggs the PK of eggs?  You did say
you have multiple thousands of records but neither the UUID or the integer
is going to stop you from then having 2+ records with eggs as the name.
If you want to see how many recipes use eggs what would you do to make
sure you are not missing any?  Even if you decide to keep the UUID and/or
Integer as UNIQUE indices you should try and have something in the data
itself that can be defined as UNIQUE.  Since you are dealing with discreet
items, without any kind of time property, it should be possible to do so.

From an implementation perspective you will want to create the sequence and
all using serial but allow NULL for the field.  Once you've assigned all
the existing records an ID (probably via the row() window function) you can
setup the sequence to begin with the next available number.  See docs for
syntax.

David J.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] what is the best way of storing text+image documents in postgresql

2011-06-08 Thread Tomas Vondra
Dne 8.6.2011 21:37, John R Pierce napsal(a):
 On 06/08/11 6:06 AM, Craig Ringer wrote:
 1. save .doc documents in bytea columns. and show them with a word
 reader in web page (disadvantage: it needs a proper .doc reader
 installed on user computer)

 1a: Convert the .doc files to a standard format like PDF that most
 browsers can display. That's what I'd do. 
 
 thats harder to integrate with a website in the sense that the PDF
 documents are hard page formatted, and can at best be displayed in an
 iframe within your site, and half the time, only displayed in an
 external PDF viewer since browser-pdf integration remains flakey and
 bugridden after all these years.   PDF text won't flow to fit your page
 layout, etc etc.

OTOH, the probability that the visitor has a PDF reader is much higher.
Plus the PDF is usually much easier to index etc. But yes, using this to
build a website is PITA.

 one approach to conversion might be to save the documents as an RTF type
 format, and run that through a preprocessor that reencodes them as a
 clean HTML or similar metalanguage that you can deal with intelligently.
MS Word's own HTML converter creates wretched HTML with tons of extra
 bizarro-world tags which likely will trip up your page formatting if you
 display these in context in your pages.

You could as well run htmltidy or something like that on the HTML. But
in both cases, this will seriously damage the formatting. So if the OP
wants to preserve it, the only viable solution is to keep the .doc
format or convert it to a .pdf and pray there's a working viewer.

Or you can convert the PDF into images (convert from imagemagick can
do that quite easily), display those images on the web and offer the PDF
for download.

regards
Tomas

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Converting uuid primary key column to serial int

2011-06-08 Thread Mike Christensen
 I'm assuming I can still have a Serial column that is NOT a primary key,
 and
 it'll incremement just the same as I add rows?  If that's the case, I
 think that's
 a superior approach..

 BTW, this table is too small to worry about disk space of UUIDs and/or
 perhaps any sort of performance benefits to using int over uuid (if there
 are
 any)..

 Mike

  CREATE TABLE t ( field serial );  is simply short-hand for  CREATE TABLE
 t (field integer DEFAULT nextval('t_seq') );  where the sequence t_seq is
 automatically created and linked to the table.

 Nothing more and nothing less.  Whether you add NOT NULL or, by extension,
 PRIMARY KEY, to the field as well as the serial datatype depends on
 whether you want to have those other properties.

 Have you considered giving the row for eggs the PK of eggs?  You did say
 you have multiple thousands of records but neither the UUID or the integer
 is going to stop you from then having 2+ records with eggs as the name.
 If you want to see how many recipes use eggs what would you do to make
 sure you are not missing any?  Even if you decide to keep the UUID and/or
 Integer as UNIQUE indices you should try and have something in the data
 itself that can be defined as UNIQUE.  Since you are dealing with discreet
 items, without any kind of time property, it should be possible to do so.

 From an implementation perspective you will want to create the sequence and
 all using serial but allow NULL for the field.  Once you've assigned all
 the existing records an ID (probably via the row() window function) you can
 setup the sequence to begin with the next available number.  See docs for
 syntax.

Yea, there's already a unique index on the ingredient name, so there
can only be 1 eggs row..  I'm not too sure on using the name as the
primary key..

The serial implementation makes sense..  If that's all that it does, I
should be able to just update all rows and set the numeric value to
nextval('t_seq') directly to order all my existing rows, then set the
column to NOT NULL when I'm done, and set the sequence to the next
available number..  Worst case I'll just write a little function that
loops through my rows and numbers them all.  I'll checkout the docs on
this, thanks!

Mike

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Converting uuid primary key column to serial int

2011-06-08 Thread Adrian Klaver
On Wednesday, June 08, 2011 1:40:35 pm David Johnston wrote:
  -Original Message-
  From: Mike Christensen [mailto:m...@kitchenpc.com]
  Sent: Wednesday, June 08, 2011 4:26 PM
  To: David Johnston; pgsql-general@postgresql.org
  Subject: Re: [GENERAL] Converting uuid primary key column to serial int
  
  
  I'm assuming I can still have a Serial column that is NOT a primary
  key,
 
 and
 
  it'll incremement just the same as I add rows?  If that's the case, I
 
 think that's
 
  a superior approach..
  
  BTW, this table is too small to worry about disk space of UUIDs and/or
  perhaps any sort of performance benefits to using int over uuid (if there
 
 are
 
  any)..
  
  Mike
 
  CREATE TABLE t ( field serial );  is simply short-hand for  CREATE
 TABLE t (field integer DEFAULT nextval('t_seq') );  where the sequence
 t_seq is automatically created and linked to the table.
 

Actually per the docs and for completeness sake:

http://www.postgresql.org/docs/9.0/interactive/datatype-numeric.html#DATATYPE-
SERIAL

CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;
So the NOT NULL is already included.


 
 David J.

-- 
Adrian Klaver
adrian.kla...@gmail.com


Re: [GENERAL] Best Practices - Securing an Enterprise application using JBOSS Postgres

2011-06-08 Thread Craig Ringer
On 09/06/11 03:07, Isak Hansen wrote:

 While MD5 is considered broken for certain applications, it's still
 perfectly valid for auth purposes.

MD5 rainbow tables can be calculated quickly using services easily
available to anyone (eg: EC2) and rainbow tables for passwords up to 8
chars have been successfully used in demo and real attacks several times
in the last year. It's looking pretty shakey.

That said, _properly_ _salted_ md5 is still likely to be strong enough
for most people's likely attack scenarios for quite some time to come.
It's only unsalted md5 that's dangerously stupid to use now - and it was
never exactly a good idea.

If you do your own user/password storage with a users table in the
database or whatever, make sure you salt the passwords for encryption.

--
Craig Ringer


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general