Hi Dave,


I have done something similar using a combination of MS Word and MS Excel. 
It's a little wonky, but it saves you from having to figure out any heavy 
coding or new software. I did this sentence by sentence rather than line by 
line, but you could probably adapt it to work line by line if you needed to.



Here are the steps I used:

  1.  Copy and paste manuscript 1 into a new document in MS Word. I used "Keep 
Text Only" as the paste setting, since I didn't need to compare formatting.
  2.  In MS Word, I used Find and Replace to double each paragraph break, then 
turn each sentence into a separate paragraph. To double the paragraph breaks:

Find: ^p

Replace: ^p^p

Replace all

(I did this part so that I would be able to see where new paragraphs actually 
start in the Excel comparison)

To turn each sentence into a separate paragraph:

Find: . (including the space after the period)

Replace: . ^p

Replace all

If your manuscript has any tabs in it, you will also want to get rid of them at 
this point, since they will mess up the Excel steps. To do this:

Find: ^t

Replace: (leave completely blank)

Replace all

If your manuscript has any tables in it, you will also want to remove them from 
this version, since they will mess up the Excel steps.

  3.  Copy and paste the reformatted manuscript 1 into a new Excel spreadsheet 
in column A. Each new sentence will be its own separate cell in a single 
column. There will be a blank cell between each paragraph.
  4.  Repeat steps 1 and 2 with manuscript 2. Then copy and paste manuscript 2 
into the same Excel spreadsheet in column B, next to manuscript 1.
  5.  In column C, you are going to create a formula that tells you if the text 
in columns A and B is an exact match. To do this, select row 1 column C, then 
enter:

=IF(EXACT(A1, B1), "Same", "Different")

After entering this formula in row 1, click the bottom right corner of this 
cell and drag to the bottom of the two manuscript columns to make all rows 
compare the two texts. You can now see how manuscript 1 and manuscript 2 
compare sentence-by-sentence in column C.
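
For anyone who would rather script this than click through it, the steps above 
can be sketched in Python roughly as follows. This is only a minimal sketch and 
not the Word/Excel method itself: the function names to_rows and compare are 
made up for illustration, and it assumes each manuscript is already a plain-text 
string with one paragraph per line.

```python
from itertools import zip_longest


def to_rows(text):
    """Return one sentence per row, with a blank row between paragraphs.

    Assumption: paragraphs are separated by newlines, and a sentence ends
    at a period followed by a space (the same rule as the Find/Replace step).
    """
    rows = []
    # Drop tabs and skip empty lines, like the clean-up steps in Word.
    paragraphs = [p.strip() for p in text.replace("\t", "").split("\n") if p.strip()]
    for i, paragraph in enumerate(paragraphs):
        if i > 0:
            rows.append("")  # blank row marks where a new paragraph starts
        pieces = paragraph.split(". ")
        last = len(pieces) - 1
        for j, piece in enumerate(pieces):
            # split() removes the ". " separator, so put the period back
            # on every piece except the final one in the paragraph.
            rows.append(piece + "." if j < last else piece)
    return rows


def compare(rows_a, rows_b):
    """Pair rows by position and label each pair, like column C in Excel."""
    return [
        (a, b, "Same" if a == b else "Different")  # case-sensitive, like EXACT
        for a, b in zip_longest(rows_a, rows_b, fillvalue="")
    ]


if __name__ == "__main__":
    first = "The cat sat. It purred.\nA new paragraph."
    second = "The cat sat. It hissed.\nA new paragraph."
    for a, b, label in compare(to_rows(first), to_rows(second)):
        print(f"{label:9} | {a} | {b}")
```

Unlike the Word step, this drops the trailing space after each period, which 
does not affect the comparison as long as both manuscripts are processed the 
same way.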



Some other things that can make this easier:

To make a quick visual comparison of which sentences are the same or different, 
you can also use Conditional Formatting to make column C change color based on 
whether it says "Same" or "Different." If you'd like some detailed instructions 
on how to do this, LMK.

If whole sentences have been added or removed, every row after that point will 
be out of alignment. You may want to shift some cells up or down to make them 
line up again.
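
If shifting cells by hand gets tedious, the realignment could in principle be 
automated. Here is a minimal sketch using Python's standard difflib module, 
assuming each manuscript is already a list of sentence strings; the function 
name align is hypothetical, and this is a swapped-in technique rather than 
anything Excel does.

```python
from difflib import SequenceMatcher


def align(rows_a, rows_b):
    """Pair up two sentence lists, padding with blanks at insertions/deletions.

    Returns (sentence_a, sentence_b, label) tuples, where label is "Same"
    for matching sentences and "Different" for everything else.
    """
    aligned = []
    matcher = SequenceMatcher(a=rows_a, b=rows_b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        left = list(rows_a[i1:i2])
        right = list(rows_b[j1:j2])
        if tag == "equal":
            aligned.extend((a, b, "Same") for a, b in zip(left, right))
        else:
            # "replace", "delete" or "insert": pad the shorter side with
            # blank cells so the rows that follow stay lined up.
            width = max(len(left), len(right))
            left += [""] * (width - len(left))
            right += [""] * (width - len(right))
            aligned.extend((a, b, "Different") for a, b in zip(left, right))
    return aligned


if __name__ == "__main__":
    version_1 = ["One.", "Two.", "Three."]
    version_2 = ["One.", "A new sentence.", "Two.", "Three."]
    for a, b, label in align(version_1, version_2):
        print(f"{label:9} | {a} | {b}")
```

The added sentence ends up on its own row next to a blank cell, so the 
sentences after it still compare against their counterparts.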



Hope this helps.



Best wishes,

Nora Weston, MSLS
Access & Reference Services Librarian (Contractor)
NIEHS Library - Your Partner in Research
[email protected]
984.287.3603
Pronouns: she/her/hers






-----Original Message-----
From: Code for Libraries [email protected] On Behalf Of David Erlandson
Sent: Tuesday, June 28, 2022 8:51 AM
To: [email protected]
Subject: [EXTERNAL] [CODE4LIB] Variora/Differences in Manuscripts



Hi all,



I have a colleague who is looking to track changes in text of a manuscript that 
has 4 revisions. Apparently there are pretty major changes to the content and 
it would be great to identify them.



I was thinking through tools I'm familiar with (generally line-by-line 
comparisons), but that would seem to have the pitfall of an early large revision 
throwing off the comparison for the rest of the text. Another silly thought was 
to start up a local wiki instance and overlay each version; use the built in 
compare tools... Has anyone worked on a project like this?  Or are there any 
tools built and ready to go? Any guidance would be appreciated.



Thanks,

Dave



_________________________________________________________________________

David Erlandson | Metadata Analyst | Rice University - Fondren Library

Email: [email protected] | Voice: 713.348.3727 | Fax: 713.348.5862
