Thanks for your help... I don't have access to the computer now, but I'll let
you know if it works out when I do.

On Mon, Sep 30, 2024, 1:16 PM Rusty Carruth via PLUG-discuss <
plug-discuss@lists.phxlinux.org> wrote:

> Oops, you are correct: the uniq command should have "-w 34 list.of.files",
> not "-w list.of.files".  Sorry!  (Here's what I'd typed, and what I should
> have cut/pasted:
>
> root@rusty-MS-7851:/backups1/backup_system_v2# uniq -c -d -w 34
> sorted.new_filesA.md5|less ; wc -l sorted.new_filesA.md5
> 42279 sorted.new_filesA.md5
> root@rusty-MS-7851:/backups1/backup_system_v2# uniq -c -d -w 34
> sorted.new_filesA.md5
>
> sorry again!)
>
>
> Also, if you want to get a list of files and their MD5 sums from 'higher
> up' in the directory tree, just change the starting directory in your
> find command to that higher-up location. However, you might need to run
> the entire find and md5sum sequence as root if the directories (and
> files) you care about don't have read permission for you.  So, to find
> ALL files everywhere on your computer, change the ~ to /.  You'll
> certainly get lots of 'permission denied' errors if you do that as
> yourself and not as root.  But starting at / will traverse ALL
> directories on your computer, including /dev and others you probably
> don't care about.  There are some useful options to find (like not
> descending into a different filesystem) that you might want to use; see
> the find man page to find them ;-)
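A minimal sketch of those find options, using made-up paths under /tmp:
-xdev keeps find on the starting filesystem, and -prune skips a subtree
you don't care about.

```shell
# Hypothetical layout for illustration only.
rm -rf /tmp/findtest
mkdir -p /tmp/findtest/keep /tmp/findtest/skipme
touch /tmp/findtest/keep/a.txt /tmp/findtest/skipme/b.txt

# -xdev: stay on the starting filesystem (don't descend into other mounts).
# -path ... -prune: skip the named subtree entirely; the -o -type f -print
# branch prints regular files everywhere else.
find /tmp/findtest -xdev -path /tmp/findtest/skipme -prune -o -type f -print
# prints /tmp/findtest/keep/a.txt
```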
>
> On 9/30/24 07:05, Michael via PLUG-discuss wrote:
> > Thank you so much! After running it, I find it only finds the duplicates
> > in ~. I need to find the duplicates across all the directories under
> > home. After looking at the man page and searching for 'recu', it seems
> > it recurses by default, unless I am reading it wrong.
> > I tried the uniq command but:
> >
> >   uniq -c -d -w list.of.files
> >   uniq: list.of.files: invalid number of bytes to compare
> >
> > Isn't uniq used to find the differences between two files? I have
> > a very rudimentary understanding of Linux, so I'm sure I'm wrong.
> >
> > All the files in list.of.files are invisible files (prefaced with a
> > period). And isn't there a way to sort things depending on their column
> > (column 1: md5sum, column 2: file name)?
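An aside on the column question above: sort can key on a given column with
-k. A small sketch with a hypothetical two-column file in md5sum's format
(hash, then file name; the hashes below are invented):

```shell
# Made-up hashes and names, just to show the column switch.
cat > /tmp/sort.demo <<'EOF'
ffffffffffffffffffffffffffffffff  apple.txt
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa  zebra.txt
EOF

sort -k 1,1 /tmp/sort.demo   # by column 1, the md5sum (zebra.txt line first)
sort -k 2 /tmp/sort.demo     # by column 2, the file name (apple.txt line first)
```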
> >
> > On Mon, Sep 30, 2024 at 2:56 AM Rusty Carruth via PLUG-discuss <
> > plug-discuss@lists.phxlinux.org> wrote:
> >
> >> On 9/28/24 21:06, Michael via PLUG-discuss wrote:
> >>> About a year ago I messed up by accidentally copying a folder with
> >>> other folders into another folder. I'm running out of room and need to
> >>> find that directory tree and get rid of it. All I know for certain is
> >>> that it is somewhere in my home directory. I THINK it is my pictures
> >>> directory with ARW files.
> >>> chatgpt told me to use fdupes, but it told me to use an exclude option
> >>> (which I found out it doesn't have) to avoid config files (and I was
> >>> planning on adding to that as I discovered other stuff I didn't want).
> >>> Then it told me to use find, but I got an error, which leads me to
> >>> believe it doesn't know what it's talking about!
> >>> Could someone help me out?
> >>>
> >> First, someone said you need to run updatedb before running find.  No,
> >> sorry: updatedb is for using locate, not find.  Find actively walks the
> >> directory tree; locate searches the (text, I think) database built by
> >> updatedb.
> >>
> >>
> >> Ok, now to answer the question.  I've got a similar situation, but in
> >> spades.  Every time I did a backup, I did an entire copy of everything,
> >> so I've got ... oh, 10, 20, 30 copies of many things.  I'm working on
> >> scripts to help reduce that, but for now I'm doing it somewhat
> >> manually, and I suggest the following command:
> >>
> >>
> >> cd (the directory of interest, possibly your home dir) ; find . -type f
> >> -print0 | xargs -0 md5sum | sort > list.of.files
> >>
> >> This will create a list of files, sorted by their md5sum.  If you want
> >> to be lazy and not search that file for duplicate md5sums by hand,
> >> consider uniq.  Like this:
> >>
> >> uniq -c -d -w list.of.files
> >>
> >>
> >> This will print the list of files which are duplicates.  For example,
> >> out of a list of 42,279 files in a certain directory on my computer,
> >> here's the result:
> >>
> >>         2 73d249df037f6e63022e5cfa8d0c959b  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160321-223138.png
> >>         5 9b162ac35214691461cc0f0104fb91ce  _files/melissa/Documents/EPHESUS/Office Stuff/SPD/SPD SUMMER 2016 (1).pdf
> >>         3 b396af67f2cd75658397efd878a01fb8  _files/dads_zipdisks/2003-1/CLASS at VBC Sp-03/CLASS BKUP - Music Reading & Sight Singing Class/C  & D Major & Minor Scales & Chords.mct
> >>         2 cd83094e0c4aeb9128806b5168444578  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160318-222051.png
> >>         2 d1a5a1bec046cc85a3a3fd53a8d5be86  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160410-145331.png
> >>         2 fa681c54a2bd7cfa590ddb8cf6ca1cea  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160312-113340.png
> >>
> >> Originally the _files directory had MANY duplicates; now I've managed
> >> to get that down to the above list...
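The whole sequence above can be tried end-to-end on a throwaway directory
(the paths and file names below are invented). An md5 hash is 32 hex
characters, so -w 32 is enough to compare just the hash column:

```shell
# Build a tiny tree with one duplicated file.
rm -rf /tmp/duptest
mkdir -p /tmp/duptest/a /tmp/duptest/b
echo same      > /tmp/duptest/a/one.txt
echo same      > /tmp/duptest/b/copy.txt
echo different > /tmp/duptest/a/two.txt

cd /tmp/duptest
find . -type f -print0 | xargs -0 md5sum | sort > /tmp/dup.list

# -c counts each group, -d keeps only duplicated groups,
# -w 32 compares only the first 32 characters (the hash).
uniq -c -d -w 32 /tmp/dup.list
# prints one line: a count of 2 and the first entry of the duplicate pair
```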
> >>
> >> Anyway, there you go.  Happy scripting.
> >>
> >> ---------------------------------------------------
> >> PLUG-discuss mailing list: PLUG-discuss@lists.phxlinux.org
> >> To subscribe, unsubscribe, or to change your mail settings:
> >> https://lists.phxlinux.org/mailman/listinfo/plug-discuss
> >>
> >
> >
>