Re: how to find similar lines [56215]

Vlastimil Brom Tue, 22 Mar 2011 17:12:24 -0700

djehres:
--------------------------------------------------------------------------------
EXEC MANUFACTURERS('09094','ABC INC','UNITED STATES');
EXEC MANUFACTURERS('09094','ABC, INC.','UNITED STATES');


I have a file with hundreds of entries like this.  Not all are '09094', just
these two, but there are others with different numbers and text. These two are
actually duplicates because of the number '09094'.

How do I FIND these and LIST them out to the search results?

I can FIND the first , but I want both lines LISTed out to the search results.

thanks.
--------------------------------------------------------------------------------


A possible semi-manual way could be to use two temporary files containing only
the starting part of each line, up to the id number:
use the "Copy" function in the find dialog, check [x] regular expression and use
a search phrase such as
EXEC MANUFACTURERS\('\d+',

Make another copy of the result list and sort both texts (Edit: Sort):
the first one without removing duplicates, the other one WITH [x] remove
duplicates checked.

After that you can compare both lists (activate one of them, right click on
the tab of the other one and select "Text diff with active tab").
You should see the deleted duplicates highlighted (or you can use the function
of the diff window "results processing: missing lines into new file").

After that you have to manually go through the original file with complete lines
and delete the unwanted lines for the IDs found.
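
If running a small script is an option, the steps above can be approximated
outside the editor. The following is only a sketch in Python, not a PSPad
feature: it assumes every relevant line starts with EXEC MANUFACTURERS('<digits>',
as in the sample, and the function name find_duplicate_ids as well as the
'12345' line are made up for illustration.

```python
import re
from collections import defaultdict

# Same idea as the search phrase above: match the start of the line up to the id.
ID_PATTERN = re.compile(r"EXEC MANUFACTURERS\('(\d+)',")


def find_duplicate_ids(lines):
    """Return {id: [(line_number, line), ...]} for ids found on more than one line."""
    groups = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        match = ID_PATTERN.match(line)
        if match:
            groups[match.group(1)].append((number, line.rstrip("\n")))
    # Keep only the ids that occur at least twice.
    return {key: hits for key, hits in groups.items() if len(hits) > 1}


# Two lines from the question plus one hypothetical line with a different id.
sample = [
    "EXEC MANUFACTURERS('09094','ABC INC','UNITED STATES');",
    "EXEC MANUFACTURERS('12345','XYZ LTD','CANADA');",
    "EXEC MANUFACTURERS('09094','ABC, INC.','UNITED STATES');",
]

for key, hits in find_duplicate_ids(sample).items():
    for number, line in hits:
        print(f"{number}: {line}")
```

To use it on a real file, pass the open file object (or a list of its lines)
instead of the sample list; the complete lines are listed together with their
line numbers, so they can be located in the original file.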

It would be easier if the original data could be sorted too.

(Unfortunately, the advanced setting of the sort dialog: 
[ ] specify column
doesn't seem to work for "remove duplicates"; otherwise, only sorting on columns
e.g. 1-27 would be enough for your task.) 
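
What a column-restricted sort with duplicate removal would do can be sketched
like this in Python. It is an illustration under assumptions, not a description
of PSPad's sort dialog: the helper name unique_by_columns is hypothetical, the
columns are 1-based and inclusive, and for each key the first line in the
original order is the one that is kept.

```python
def unique_by_columns(lines, start=1, end=27):
    """Sort lines on columns start..end and keep one line per distinct key."""

    def key(line):
        # Convert 1-based inclusive columns to a Python slice.
        return line[start - 1:end]

    seen = set()
    kept = []
    # sorted() is stable, so equal keys stay in their original order.
    for line in sorted(lines, key=key):
        if key(line) not in seen:
            seen.add(key(line))
            kept.append(line)
    return kept
```

With the two lines from the question, columns 1-27 cover exactly
EXEC MANUFACTURERS('09094', so both lines share one key and only the first
of them would remain.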

Well, it is a rather complicated process, but if there are enough entries, it
might be faster than doing all the work manually...


hth,
  vbr

-- 
<http://forum.pspad.com/read.php?2,56213,56215>
PSPad freeware editor http://www.pspad.com
