Thank you so much! After running it, I see it only finds the duplicates in
~ itself. I need to find the duplicates across all the directories under home.
After looking at the man page and searching for "recu", it seems it recurses
by default, unless I am reading it wrong.
I tried the uniq command, but got:

 uniq -c -d -w list.of.files
 uniq: list.of.files: invalid number of bytes to compare

Isn't uniq used to find the differences between two files? I have
a very rudimentary understanding of Linux, so I'm sure I'm wrong.
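
On the uniq error above: -w takes a number, namely how many leading
characters of each line to compare, so uniq read "list.of.files" as that
number and complained. An md5 hash is 32 hex characters, so 32 is the natural
value here. Also, uniq works on one sorted file and collapses adjacent
repeated lines; comparing two files is the job of diff or comm. A minimal
sketch, assuming GNU uniq (for -w and --all-repeated) and using made-up
hashes and paths:

```shell
# Build a tiny sample in the same "md5sum  filename" layout as list.of.files.
# The hashes and paths below are invented for illustration only.
tmp=$(mktemp -d)
cat > "$tmp/list.of.files" <<'EOF'
11111111111111111111111111111111  ./a/photo1.ARW
11111111111111111111111111111111  ./b/photo1.ARW
22222222222222222222222222222222  ./a/notes.txt
EOF

# -w 32: compare only the first 32 characters of each line (the md5 hash).
# -d: print one line per group of duplicates; -c: prefix it with the count.
dups=$(uniq -c -d -w 32 "$tmp/list.of.files")
echo "$dups"

# GNU uniq only: print every member of each duplicate group,
# with a blank line between groups.
all=$(uniq -w 32 --all-repeated=separate "$tmp/list.of.files")
echo "$all"

rm -rf "$tmp"
```

The first form shows only one file name per duplicated hash; the second
shows all of them, which is probably what you want before deleting anything.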

All the files in list.of.files are hidden files (prefixed with a
period).
And isn't there a way to sort things by column (column 1 is the
md5sum, column 2 the file name)?
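
On sorting by column: sort can do this with its -k option. A small sketch,
again with made-up hashes and file names; LC_ALL=C is only there to make the
ordering predictable across locales:

```shell
# Sample data in "md5sum  filename" layout (values invented for illustration).
tmp=$(mktemp -d)
printf '%s\n' \
  'bbbb  ./z.txt' \
  'aaaa  ./m.txt' \
  'cccc  ./a.txt' > "$tmp/list.of.files"

# Sort on column 1 only (the checksum).
by_hash=$(LC_ALL=C sort -k1,1 "$tmp/list.of.files")
echo "$by_hash"

# Sort on column 2 through the end of the line (the file name,
# including any spaces it contains).
by_name=$(LC_ALL=C sort -k2 "$tmp/list.of.files")
echo "$by_name"

rm -rf "$tmp"
```

Note that uniq still needs the list sorted by checksum, so sort by file name
only for reading, not before piping into uniq.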

On Mon, Sep 30, 2024 at 2:56 AM Rusty Carruth via PLUG-discuss <
plug-discuss@lists.phxlinux.org> wrote:

>
> On 9/28/24 21:06, Michael via PLUG-discuss wrote:
> > About a year ago I messed up by accidentally copying a folder with other
> > folders into another folder. I'm running out of room and need to find that
> > directory tree and get rid of it. All I know for certain is that it is
> > somewhere in my home directory. I THINK it is my pictures directory with
> > ARW files.
> > ChatGPT told me to use fdupes but it told me to use an exclude option
> > (which I found out it doesn't have) to avoid config files (and I was
> > planning on adding to that as I discovered other stuff I didn't want). Then
> > it told me to use find but I got an error which leads me to believe it
> > doesn't know what it's talking about!
> > Could someone help me out?
> >
> First, someone said you need to run updatedb before running find.  No,
> sorry, updatedb is for using locate, not find.  Find actively walks the
> directory tree.  Locate searches the text (I think) database built by
> updatedb.
>
>
> Ok, now to answer the question.  I've got a similar situation, but in
> spades.  Every time I did a backup, I did an entire copy of everything,
> so I've got ... oh, 10, 20, 30 copies of many things. I'm working on
> scripts to help reduce that, but for now doing it somewhat manually, I
> suggest the following command:
>
>
> cd (the directory of interest, possibly your home dir) ; find . -type f
> -print0 | xargs -0 md5sum | sort > list.of.files
>
> this will create a list of files, sorted by their md5sum.  If you want
> to be lazy and not search that file for duplicate md5sums, consider
> uniq.  Like this:
>
> uniq -c -d -w list.of.files
>
>
> This will print the list of files which are duplicates.  For example,
> out of a list of 42,279 files in a certain directory on my computer,
> here's the result:
>
>        2 73d249df037f6e63022e5cfa8d0c959b  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160321-223138.png
>        5 9b162ac35214691461cc0f0104fb91ce  _files/melissa/Documents/EPHESUS/Office Stuff/SPD/SPD SUMMER 2016 (1).pdf
>        3 b396af67f2cd75658397efd878a01fb8  _files/dads_zipdisks/2003-1/CLASS at VBC Sp-03/CLASS BKUP - Music Reading & Sight Singing Class/C  & D Major & Minor Scales & Chords.mct
>        2 cd83094e0c4aeb9128806b5168444578  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160318-222051.png
>        2 d1a5a1bec046cc85a3a3fd53a8d5be86  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160410-145331.png
>        2 fa681c54a2bd7cfa590ddb8cf6ca1cea  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160312-113340.png
>
> Originally the _files directory had MANY duplicates, now I've managed to
> get that down to the above list...
>
> Anyway, there you go.  Happy scripting.
>
> ---------------------------------------------------
> PLUG-discuss mailing list: PLUG-discuss@lists.phxlinux.org
> To subscribe, unsubscribe, or to change your mail settings:
> https://lists.phxlinux.org/mailman/listinfo/plug-discuss
>


-- 
:-)~MIKE~(-: