Hi,
~
I am trying to get dups from some data from files whose md5sums I
previously calculated
~
Here is my mere mortal SQL
~
SELECT md5, COUNT(md5) AS md5cnt
FROM jdk1_6_0_07_txtfls_md5
WHERE (md5cnt > 1)
GROUP BY md5
ORDER BY md5cnt DESC;
~
and this is what I get:
~
jpk=# SELECT md5,
On Thu, Aug 28, 2008 at 7:50 PM, Adrian Klaver [EMAIL PROTECTED] wrote:
Define easily.
~
OK, let me try to outline the approach I would go for:
~
I think COPY FROM CSV should have three options, namely:
~
1) the way we have used it in which you create the table first
~
2) another way in which
On Sat, Aug 30, 2008 at 08:23:25AM -0400, Albretch Mueller wrote:
OK, let me try to outline the approach I would go for:
~
I think COPY FROM CSV should have three options, namely:
I think you're confusing postgresql with a spreadsheet program. A
database is designed to take care of your data
Also I know there is a DISTINCT keyword, but I also need to know how
many times the particular data in the column is repeated if it is,
that is why I need to go:
~
SELECT md5, COUNT(md5) AS md5cnt
FROM jdk1_6_0_07_txtfls_md5
WHERE (md5cnt > 1)
GROUP BY md5
ORDER BY md5cnt DESC;
~
Thanks
I think you're confusing postgresql with a spreadsheet program.
~
I wonder what makes you think so
~
There are client programs which will do this for you, perhaps you want one of
those?
~
Well, then obviously there is the need for it and you were not
successful enough at convincing these
Albretch Mueller wrote:
Hi,
~
I am trying to get dups from some data from files whose md5sums I
previously calculated
~
Here is my mere mortal SQL
~
SELECT md5, COUNT(md5) AS md5cnt
FROM jdk1_6_0_07_txtfls_md5
WHERE (md5cnt > 1)
GROUP BY md5
ORDER BY md5cnt DESC;
I think you are looking for
On Aug 30, 2008, at 6:26 AM, Albretch Mueller wrote:
Well, then obviously there is the need for it and you were not
successful enough at convincing these developers that they were
confusing postgresql with a spreadsheet program
The behavior you are looking for is typical of a spreadsheet,
On Saturday 30 August 2008 5:23:25 am Albretch Mueller wrote:
On Thu, Aug 28, 2008 at 7:50 PM, Adrian Klaver [EMAIL PROTECTED] wrote:
Define easily.
~
OK, let me try to outline the approach I would go for:
~
I think COPY FROM CSV should have three options, namely:
~
1) the way we have
On Aug 30, 2008, at 9:19 AM, Christophe wrote:
On Aug 30, 2008, at 6:26 AM, Albretch Mueller wrote:
Well, then obviously there is the need for it and you were not
successful enough at convincing these developers that they were
confusing postgresql with a spreadsheet program
The behavior
thank you Stefan your SQL worked, but still; I am just asking and my
programming bias will certainly show, but aren't you effectively
calling count on the table three times if you go:
~
SELECT md5, COUNT(md5)
FROM jdk1_6_0_07_txtfls_md5
GROUP BY md5
HAVING COUNT(md5) > 1
ORDER BY COUNT(md5) DESC;
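
For anyone following along, here is a minimal sketch of the GROUP BY / HAVING form above run against a tiny in-memory table. It uses Python's sqlite3 module purely as a self-contained stand-in for PostgreSQL, and the `path` column and the sample rows are made up for illustration; only the table name and the `md5` column come from the thread.

```python
import sqlite3

# In-memory database as a stand-in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jdk1_6_0_07_txtfls_md5 (path TEXT, md5 TEXT)")

# Hypothetical sample data: 'aaa' appears three times, 'bbb' twice, 'ccc' once.
rows = [
    ("a.txt", "aaa"),
    ("b.txt", "bbb"),
    ("c.txt", "aaa"),
    ("d.txt", "ccc"),
    ("e.txt", "aaa"),
    ("f.txt", "bbb"),
]
conn.executemany("INSERT INTO jdk1_6_0_07_txtfls_md5 VALUES (?, ?)", rows)

# HAVING filters on the aggregate after grouping, which WHERE cannot do.
dups = conn.execute(
    """
    SELECT md5, COUNT(md5)
    FROM jdk1_6_0_07_txtfls_md5
    GROUP BY md5
    HAVING COUNT(md5) > 1
    ORDER BY COUNT(md5) DESC
    """
).fetchall()

print(dups)  # [('aaa', 3), ('bbb', 2)]
```

Only the md5 values that occur more than once are returned, each with its repeat count, which is what DISTINCT alone cannot give.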
~
Albretch Mueller [EMAIL PROTECTED] writes:
thank you Stefan your SQL worked, but still; I am just asking and my
programming bias will certainly show, but aren't you effectively
calling count on the table three times if you go:
The system is smart enough to only do the count() once.
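
If writing COUNT(md5) several times reads awkwardly, one option is to name the count once in a subquery and filter on that name in the outer query. The sketch below assumes Python's sqlite3 module as a runnable stand-in for PostgreSQL; the sample values and the subquery alias `counted` are hypothetical.

```python
import sqlite3

# In-memory database as a stand-in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jdk1_6_0_07_txtfls_md5 (md5 TEXT)")
conn.executemany(
    "INSERT INTO jdk1_6_0_07_txtfls_md5 VALUES (?)",
    [("aaa",), ("bbb",), ("aaa",), ("ccc",), ("aaa",), ("bbb",)],
)

# The inner query computes the count once and names it md5cnt;
# the outer query can then use that name in WHERE and ORDER BY.
query = """
SELECT md5, md5cnt
FROM (SELECT md5, COUNT(md5) AS md5cnt
      FROM jdk1_6_0_07_txtfls_md5
      GROUP BY md5) AS counted
WHERE md5cnt > 1
ORDER BY md5cnt DESC
"""
result = conn.execute(query).fetchall()

print(result)  # [('aaa', 3), ('bbb', 2)]
```

The outer WHERE works here because md5cnt is an ordinary column of the subquery's result by the time it is evaluated, rather than an aggregate that has not been computed yet.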
spreadsheet programs (generally; I'm sure there are exceptions) don't have
the notion of a schema; each cell can hold its own particular type.
~
Oh, now I see what Martin meant!
~
that's not a traditional part of a database engine.
~
well, yeah! I would totally agree with you, but since I
The system is smart enough to only do the count() once.
~
But not smart enough to make a variable you declare point to that
internal variable so that things are clearer/ easier ;-)
~
Thanks
lbrtchx
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your
On Saturday 30 August 2008 9:42:19 am Adrian Klaver wrote:
On Saturday 30 August 2008 5:23:25 am Albretch Mueller wrote:
On Thu, Aug 28, 2008 at 7:50 PM, Adrian Klaver [EMAIL PROTECTED]
wrote:
Define easily.
~
OK, let me try to outline the approach I would go for:
~
I think COPY
On Sat, Aug 30, 2008 at 01:36:25PM -0400, Albretch Mueller wrote:
The system is smart enough to only do the count() once.
~
But not smart enough to make a variable you declare point to that
internal variable so that things are clearer/ easier ;-)
The SQL standard has pretty clear rules
On Aug 30, 2008, at 10:33 AM, Albretch Mueller wrote:
well, yeah! I would totally agree with you, but since I doubt very
much COPY FROM CSV is part of the SQL standard to begin with, why
not spice it up a little more?
I'd guess that coming up with a general algorithm to guess the type
On Thu, Aug 28, 2008 at 7:45 PM, Matthew Dennis [EMAIL PROTECTED] wrote:
Another question though. Since I could potentially start transaction, drop
indexes/checks, replace function, create indexes/checks, commit transaction
could I deal with the case of the constant folding into the cached
... are times local or UTC
~
this is a semantic rather than a syntactic issue, one that some code could
NOT decide based on the data it reads
~
Should we assume integer or float?
~
is a dot anywhere in the data you read in for that particular column? ...
~
Varchar or text?
~
Is the length of the
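
To make the type-guessing idea under discussion concrete, here is a rough sketch of what a client-side guesser could look like. This is not something COPY does (COPY loads into a table that already exists); the function name `guess_column_type` and the three-way integer/float/text choice are assumptions made only for illustration.

```python
def guess_column_type(values):
    """Guess 'integer', 'float' or 'text' for a list of raw CSV strings.

    Hypothetical helper: tries the narrowest type first and widens
    only when some value fails to parse.
    """

    def all_parse(caster):
        # True when every value can be converted by the given caster.
        for value in values:
            try:
                caster(value)
            except ValueError:
                return False
        return True

    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "float"
    return "text"


print(guess_column_type(["1", "2", "3"]))       # integer
print(guess_column_type(["1", "2.5", "3"]))     # float
print(guess_column_type(["1", "abc", "3"]))     # text
print(guess_column_type(["20080830"]))          # integer
```

The last call shows the semantic problem raised above: a value such as 20080830 parses cleanly as an integer, and nothing in the data says whether it was meant as a date, and if so whether local or UTC.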
You have made clear to me why my attempt for a RFE for COPY FROM CSV
has found some technical resistance/disagreement, but I still think my
idea even if not so popular for concrete and cultural reasons makes at
least sense to some people
It's a perfectly reasonable problem to want to solve;