[gentoo-user] [OT] Question about duplicate lines in file

2006-06-12 Thread Teresa and Dale
Hi folks,

I have batched a bunch of servers in my hosts file to block, for ads and
all that crap.  I got them from several different places, some I have
found too, and am sure there are dups in there, same server but pasted
from several sources.  I am not a programmer at all and don't even really
know what to search for.  I would like to remove the duplicate entries
and then put them in alphabetical order if I could.  I would gladly then
make this available if someone wanted to host it.  I don't have a place
to host it. 

Oh, there are 15,000 entries in my hosts file.  O_O

Could someone tell me how this is done?  May even learn something here. 
If I can do this, I'm sure I will. 

Thanks.

Dale
:-)  :-)
-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] [OT] Question about duplicate lines in file

2006-06-12 Thread Raymond Lewis Rebbeck
On Tuesday, 13 June 2006 2:12, Teresa and Dale wrote:
> Hi folks,
>
> I have batched a bunch of servers in my hosts file to block, for ads and
> all that crap.  I got them from several different places, some I have
> found too, and am sure there are dups in there, same server but pasted
> from several sources.  I am not a programmer at all and don't even really
> know what to search for.  I would like to remove the duplicate entries
> and then put them in alphabetical order if I could.  I would gladly then
> make this available if someone wanted to host it.  I don't have a place
> to host it.
>
> Oh, there are 15,000 entries in my hosts file.  O_O
>
> Could someone tell me how this is done?  May even learn something here.
> If I can do this, I'm sure I will.
>
> Thanks.
>
> Dale
>
> :-)  :-)

'uniq' and 'sort' should do what you're after, check out the man pages.

-- 
Raymond Lewis Rebbeck



Re: [gentoo-user] [OT] Question about duplicate lines in file

2006-06-12 Thread Teresa and Dale
Raymond Lewis Rebbeck wrote:

> 'uniq' and 'sort' should do what you're after, check out the man pages.

Thanks, read the man page, it was short so it didn't take long.  I tried
this:

uniq -u /home/dale/Desktop/hosts /home/dale/Desktop/hostsort

It doesn't look like it did anything but copy the same thing over. 
There are only 2 lines missing.  Do spaces count?  Some put in a lot
of spaces between the localhost and the web address.  Maybe that has an
effect??

Thanks for the help.  I had never seen that command before.  I had heard
of sort, never used it though.  I do have those on my desktop.  I'm
playing with copies instead of my real hosts file.

Thanks again.

Dale
:-)  :-)



Re: [gentoo-user] [OT] Question about duplicate lines in file

2006-06-12 Thread Matthew Cline

On 6/12/06, Teresa and Dale [EMAIL PROTECTED] wrote:


> Thanks, read the man page, it was short so it didn't take long.  I tried
> this:
>
> uniq -u /home/dale/Desktop/hosts /home/dale/Desktop/hostsort



I think that you need to run sort on the file first, then uniq.
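
The -u flag is probably part of the puzzle too: it prints only the lines that have no adjacent duplicate and drops every copy of the ones that do, which would fit the two missing lines. A small sketch, assuming made-up sample data (the hostnames and variable names below are invented for illustration, not taken from the real hosts file):

```shell
# Made-up sample: the first entry appears twice in a row.
sample='127.0.0.1 ads.example.com
127.0.0.1 ads.example.com
127.0.0.1 tracker.example.net'

# Plain uniq keeps one copy of each run of identical adjacent lines.
keep_one=$(printf '%s\n' "$sample" | uniq)

# uniq -u prints only lines that are NOT repeated, so both copies
# of the duplicated entry disappear from the output.
only_unrepeated=$(printf '%s\n' "$sample" | uniq -u)

printf 'uniq:\n%s\n\nuniq -u:\n%s\n' "$keep_one" "$only_unrepeated"
```

So for removing duplicates while keeping one copy of each entry, plain uniq (after sorting) is likely what is wanted, not uniq -u.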

HTH,

Matt



Re: [gentoo-user] [OT] Question about duplicate lines in file

2006-06-12 Thread Neil Bothwick
On Mon, 12 Jun 2006 12:19:46 -0500, Teresa and Dale wrote:

> uniq -u /home/dale/Desktop/hosts /home/dale/Desktop/hostsort

uniq only removes consecutive duplicate lines, so you need to use sort first:

sort file | uniq > newfile

or, possibly, depending on the format of your file:

sort -u file > newfile
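
To try both forms safely, something like the following sketch could be run on a throwaway copy. The scratch directory, file names, and sample entries are all invented for illustration, and LC_ALL=C is my own addition to force plain byte-order sorting so the two forms agree regardless of locale:

```shell
# Work in a scratch directory so the real hosts file is never touched.
workdir=$(mktemp -d)
cat > "$workdir/hosts" <<'EOF'
127.0.0.1 tracker.example.net
127.0.0.1 ads.example.com
127.0.0.1 tracker.example.net
EOF

# Two-step form: sort brings identical lines together, uniq collapses them.
LC_ALL=C sort "$workdir/hosts" | uniq > "$workdir/hosts.piped"

# One-step form: sort -u sorts and drops duplicate lines in one go.
LC_ALL=C sort -u "$workdir/hosts" > "$workdir/hosts.sorted"

# cmp -s is silent and succeeds only if the two files are identical.
if cmp -s "$workdir/hosts.piped" "$workdir/hosts.sorted"; then
    same=yes
else
    same=no
fi
cat "$workdir/hosts.sorted"
```

Note that the output goes to a new file name each time. Redirecting onto the input file itself (sort file > file) would truncate it before sort gets to read it.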


-- 
Neil Bothwick

Few women admit their age. Few men act theirs.




Re: [gentoo-user] [OT] Question about duplicate lines in file

2006-06-12 Thread Raymond Lewis Rebbeck
On Tuesday, 13 June 2006 2:49, Teresa and Dale wrote:
> Thanks, read the man page, it was short so it didn't take long.  I tried
> this:
>
> uniq -u /home/dale/Desktop/hosts /home/dale/Desktop/hostsort
>
> It doesn't look like it did anything but copy the same thing over.
> There are only 2 lines missing.  Do spaces count?  Some put in a lot
> of spaces between the localhost and the web address.  Maybe that has an
> effect??
>
> Thanks for the help.  I had never seen that command before.  I had heard
> of sort, never used it though.  I do have those on my desktop.  I'm
> playing with copies instead of my real hosts file.
>
> Thanks again.
>
> Dale
>
> :-)  :-)

Yes, the spaces matter. You could use 'tr' to squeeze each run of repeated
spaces into a single space:

$ tr -s ' ' < filename

That should do it; then you can pipe it through sort and uniq and do whatever
else you want with it.
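
Put together, the whole clean-up might look like the sketch below. The function name normalize_hosts, the scratch file, and the sample entries are hypothetical, and converting tabs to spaces first is an extra assumption in case some sources used tabs as the separator:

```shell
# Hypothetical helper: normalise whitespace, then sort and dedupe.
normalize_hosts() {
    # Turn tabs into spaces, squeeze runs of spaces down to one space,
    # then sort and drop the lines that are now identical.
    tr '\t' ' ' < "$1" | tr -s ' ' | LC_ALL=C sort -u
}

# Made-up sample: the same entry with three different separators.
workdir=$(mktemp -d)
printf '127.0.0.1 ads.example.com\n127.0.0.1      ads.example.com\n127.0.0.1\tads.example.com\n' > "$workdir/hosts"

result=$(normalize_hosts "$workdir/hosts")
printf '%s\n' "$result"
```

This only evens out spacing between fields. Trailing spaces at the end of a line would still make two otherwise identical entries look different.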

-- 
Raymond Lewis Rebbeck



Re: [gentoo-user] [OT] Question about duplicate lines in file

2006-06-12 Thread Mike Williams
On Monday 12 June 2006 18:19, Teresa and Dale wrote:
> Thanks, read the man page, it was short so it didn't take long.  I tried
> this:

sort would be more appropriate. I don't believe uniq will find matches
unless they are on adjacent lines, i.e.

192
195
192

wouldn't get shortened, but

192
192
195

would.
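
The two cases above can be checked with a quick sketch like this one (the variable names are just for illustration):

```shell
# The first sequence from above: the two 192s are not adjacent.
unsorted=$(printf '192\n195\n192\n')

# uniq alone leaves all three lines, since no neighbouring lines match.
uniq_only=$(printf '%s\n' "$unsorted" | uniq)

# Sorting first puts the 192s next to each other, so uniq can merge them.
sorted_then_uniq=$(printf '%s\n' "$unsorted" | sort | uniq)

printf 'uniq alone: %s lines\n' "$(printf '%s\n' "$uniq_only" | wc -l | tr -d ' ')"
printf 'sort | uniq: %s lines\n' "$(printf '%s\n' "$sorted_then_uniq" | wc -l | tr -d ' ')"
```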

-- 
Mike Williams
