[gentoo-user] [OT] Question about duplicate lines in file
Hi folks,

I have batched a bunch of servers in my hosts file to block, for ads and all that crap. I got them from several different places, some I have found too, and am sure there are dups in there, same server but pasted from several sources.

I am not a programmer at all and don't even really know what to search for. I would like to remove the duplicate entries and then put them in alphabetical order if I could. I would gladly then make this available if someone wanted to host it. I don't have a place to host it. Oh, there are 15,000 entries in my hosts file. O_O

Could someone tell me how this is done? May even learn something here. If I can do this, I'm sure I will.

Thanks.

Dale :-) :-)

-- gentoo-user@gentoo.org mailing list
Re: [gentoo-user] [OT] Question about duplicate lines in file
On Tuesday, 13 June 2006 2:12, Teresa and Dale wrote:
> I would like to remove the duplicate entries and then put them in alphabetical order if I could.

'uniq' and 'sort' should do what you're after, check out the man pages.

-- Raymond Lewis Rebbeck
Re: [gentoo-user] [OT] Question about duplicate lines in file
Raymond Lewis Rebbeck wrote:
> 'uniq' and 'sort' should do what you're after, check out the man pages.

Thanks, read the man page, it was short so it didn't take long. I tried this:

    uniq -u /home/dale/Desktop/hosts /home/dale/Desktop/hostsort

It doesn't look like it did anything but copy the same thing over. There are only 2 lines missing. Do spaces count? Some put in a lot of spaces between the localhost and the web address. Maybe that has an effect?

Thanks for the help. I had never seen that command before. I had heard of sort, never used it though. I do have those on my desktop. I'm playing with copies instead of my real hosts file. Thanks again.

Dale :-) :-)
Re: [gentoo-user] [OT] Question about duplicate lines in file
On 6/12/06, Teresa and Dale wrote:
> Thanks, read the man page, it was short so it didn't take long. I tried this:
>
>     uniq -u /home/dale/Desktop/hosts /home/dale/Desktop/hostsort

I think that you need to run sort on the file first, then uniq.

HTH,
Matt
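A side note on the command quoted above: as far as the uniq man page goes, `-u` prints only the lines that are not repeated, so it is not the flag for collapsing duplicates down to one copy. A small sketch with made-up input (the host names are placeholders, not taken from anyone's hosts file):

```shell
# Sample input, already sorted so the duplicates sit next to each other.
# The host names are hypothetical.
input='127.0.0.1 ads.example.com
127.0.0.1 ads.example.com
127.0.0.1 tracker.example.net'

# Plain uniq keeps one copy of each run of identical lines.
collapsed=$(printf '%s\n' "$input" | uniq)

# uniq -u keeps only lines that are not repeated,
# so both copies of the duplicated entry disappear.
only_unique=$(printf '%s\n' "$input" | uniq -u)

printf 'uniq:\n%s\n\nuniq -u:\n%s\n' "$collapsed" "$only_unique"
```

So for a "one copy of each entry" result, plain uniq (after sorting) is probably what is wanted, not `uniq -u`.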
Re: [gentoo-user] [OT] Question about duplicate lines in file
On Mon, 12 Jun 2006 12:19:46 -0500, Teresa and Dale wrote:
>     uniq -u /home/dale/Desktop/hosts /home/dale/Desktop/hostsort

uniq only removes consecutive duplicate lines, so you need to use sort first:

    sort file | uniq > newfile

or, possibly, depending on the format of your file:

    sort -u file > newfile

-- Neil Bothwick
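To try both forms safely, something like the following could be run against a throwaway copy. This is only a sketch: the scratch directory, the file names `hosts.copy`, `hosts.sorted` and `hosts.sorted2`, and the sample entries are all made up for illustration.

```shell
# Work in a scratch directory so no real hosts file is touched.
workdir=$(mktemp -d)

# Hypothetical sample: the duplicate lines are not adjacent.
cat > "$workdir/hosts.copy" <<'EOF'
127.0.0.1 tracker.example.net
127.0.0.1 ads.example.com
127.0.0.1 tracker.example.net
EOF

# Two-step form: sort brings duplicates together, uniq collapses them.
sort "$workdir/hosts.copy" | uniq > "$workdir/hosts.sorted"

# One-step form: sort -u sorts and drops duplicate lines itself.
sort -u "$workdir/hosts.copy" > "$workdir/hosts.sorted2"

# Read the results back, then clean up the scratch directory.
two_step=$(cat "$workdir/hosts.sorted")
one_step=$(cat "$workdir/hosts.sorted2")
rm -rf "$workdir"

printf '%s\n' "$two_step"
```

On plain lines like these the two forms should give the same result; the output is written to a new file each time, so the input copy is never overwritten.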
Re: [gentoo-user] [OT] Question about duplicate lines in file
On Tuesday, 13 June 2006 2:49, Teresa and Dale wrote:
> It doesn't look like it did anything but copy the same thing over. There are only 2 lines missing. Do spaces count? Some put in a lot of spaces between the localhost and the web address. Maybe that has an effect?

Yes, the spaces matter. You could possibly use 'tr' to turn all repeated spaces into a single space:

    $ tr -s ' ' < filename

That should do it, then you can pipe it through sort and uniq and do whatever else you want with it.

-- Raymond Lewis Rebbeck
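Putting the whitespace cleanup and the sorting together might look roughly like this. It is a sketch on made-up entries; the extra step that turns tabs into spaces is an assumption on top of the suggestion above, on the guess that some of the pasted lists use tabs instead of spaces.

```shell
# Hypothetical sample: the same entry written with different spacing,
# once with a run of spaces, once with a single space, once with a tab.
raw=$(printf '127.0.0.1     ads.example.com\n127.0.0.1 ads.example.com\n127.0.0.1\tads.example.com\n')

# Turn tabs into spaces, squeeze runs of spaces down to one,
# then sort and drop the duplicates in one go.
cleaned=$(printf '%s\n' "$raw" | tr '\t' ' ' | tr -s ' ' | sort -u)

printf '%s\n' "$cleaned"
```

Note that tr only reads standard input, so for a real file the data has to be fed in with `<` or a pipe, for example `tr -s ' ' < filename | sort -u > newfile`.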
Re: [gentoo-user] [OT] Question about duplicate lines in file
On Monday 12 June 2006 18:19, Teresa and Dale wrote:
> Thanks, read the man page, it was short so it didn't take long. I tried this:

sort would be more appropriate. I don't believe uniq will find matches anywhere in the file, i.e.

    192
    195
    192

wouldn't get shortened, but

    192
    192
    195

would.

-- Mike Williams
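The 192/195 example above is easy to check from a shell. A minimal sketch, using printf to stand in for the file:

```shell
# Duplicates separated by another line: uniq leaves all three lines alone.
separated=$(printf '192\n195\n192\n' | uniq)

# Duplicates next to each other: uniq collapses them into one line.
adjacent=$(printf '192\n192\n195\n' | uniq)

# Sorting first makes the duplicates adjacent, so uniq can catch them.
sorted_first=$(printf '192\n195\n192\n' | sort | uniq)

printf 'separated:\n%s\n\nadjacent:\n%s\n\nsorted first:\n%s\n' \
    "$separated" "$adjacent" "$sorted_first"
```

The first case comes back with three lines, the other two with two lines each, which matches the behaviour described above.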