Re: Find duplicates in a column then print Information to a file

Martin Moore Sun, 31 Dec 2017 10:25:22 -0800

 

From: Sherman Willden <[email protected]>
Date: Sunday, 31 December 2017 at 18:19
To: <[email protected]>
Subject: Find duplicates in a column then print Information to a file


 

Development Platform: Ubuntu 17.10 mainly command line work

Tools: perl 5.26 and postgresql 9.6

Goal: Display duplicate aria titles on screen and to a local file

Database name: arias

Table name: aria_precis

csv delimiter: the # symbol

arias=# \d aria_precis
   Table "public.aria_precis"
   Column    | Type | Modifiers
-------------+------+-----------
 id          | text | not null
 aria        | text |
 artist      | text |
 a_artist    | text |
 album_title | text |
Indexes:
    "aria_precis_pkey" PRIMARY KEY, btree (id)

Problems:
  1. Can't connect aria title to id
  2. Can't write discovered information to file

I know about this link but I probably didn't understand what I read there.
https://www.postgresql.org/docs/current/static/sql-select.html#SQL-WITH

What I know displays the 46 duplicate aria titles of 413 entries but not the id 
or artist:
SELECT aria FROM aria_precis WHERE aria IN (SELECT aria FROM aria_precis GROUP 
BY aria  HAVING COUNT(aria)>1);

When I perform the following I get (0 rows):
SELECT aria FROM aria_precis WHERE aria IN (SELECT aria FROM aria_precis GROUP 
BY id, aria  HAVING COUNT(aria)>1);
 aria
------
(0 rows)

After I get the above information how do I write that information to a file? 
The only thing I know writes the entire database to a file:
 \COPY aria_precis TO '/home/sherman/aria_precis_2.csv' WITH DELIMITER '#,' CSV 
HEADER;

Thank you;

Sherman

========================================

 

 

On the cmdline just do:

 

psql arias <pwd> -c ‘SELECT aria FROM aria_precis WHERE aria IN (SELECT aria 
FROM aria_precis GROUP BY aria  HAVING COUNT(aria)>1)‘   > outfile.txt

 

 

Martin.

Re: Find duplicates in a column then print Information to a file

Reply via email to