Re: Scanning incomplete and updated documents

2020-04-17 Thread Frank McConnell via cctalk
On Apr 17, 2020, at 9:02, Alan Perry via cctalk wrote:
> As I noted, the two sets of appendices are mostly completely different. There 
> are 9 appendices that look like they are from the primary document and 5 that 
> look like they are from inserted pages. Both sets start numbering from "A". 
> There is one title shared between the two sets.
> 
> But the really interesting thing is that the copyright date on the "primary" 
> document is AFTER the date in the footers of the "inserted" pages.

Do not underestimate the powers of humans to make errors when creating or 
applying updates.

> FYI - the document is Computervision CADDStation System Software Installation.
> 
> So, any recommendations on what I should do?
> 
> Also, any recommendations on a software tool to slice and dice PDFs that is 
> inexpensive?

Depends on what you want to do.  On a Mac, I use Preview, which is the default 
PDF viewer; it allows me to cut, copy and paste scanned pages from one PDF to 
another.

I also do some things with Perl and CAM::PDF, which makes it fairly easy to 
write filter-like programs that do page-level operations.  For example, I have 
one that deletes the odd-numbered pages from a stdin-supplied PDF and writes 
the resulting PDF to stdout, and I think it’s about 12 lines total.  Anyone 
care to guess what my use case for this is?
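
For anyone who would rather not use Perl, here is a rough sketch of the same 
kind of filter in Python.  It is not the CAM::PDF script described above; it 
assumes the third-party pypdf package is installed, and the function names 
are made up for illustration.

```python
import io


def drop_odd_pages(page_numbers):
    """Return only the even page numbers from a 1-based page sequence."""
    return [n for n in page_numbers if n % 2 == 0]


def filter_pdf(data):
    """Take PDF bytes, return new PDF bytes without the odd-numbered pages."""
    # Assumption: pypdf is installed. It is imported here so that the pure
    # helper above still works on a machine without it.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(io.BytesIO(data))
    writer = PdfWriter()
    for number in drop_odd_pages(range(1, len(reader.pages) + 1)):
        writer.add_page(reader.pages[number - 1])  # pypdf indexes from 0
    out = io.BytesIO()
    writer.write(out)
    return out.getvalue()
```

To run it as a stdin-to-stdout filter, something like 
sys.stdout.buffer.write(filter_pdf(sys.stdin.buffer.read())) should do; the 
input is read fully into memory because stdin is not seekable.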

-Frank McConnell





Re: Scanning incomplete and updated documents

2020-04-17 Thread J. David Bryan via cctalk
On Friday, April 17, 2020 at 9:02, Alan Perry via cctech wrote:

> But the really interesting thing is that the copyright date on the 
> "primary" document is AFTER the date in the footers of the "inserted"
> pages.

Sometimes an update contains a replacement title page (containing the 
update date), and sometimes updates supersede other updates for the same 
manual.  I've seen ones that have different change dates at different 
points in the manual, with the title page reflecting the date of the last 
update.  Maybe you have something like that.


> So, any recommendations on what I should do?

If you can't clearly determine whether the pages you have belong to one 
manual print date or two, then it certainly wouldn't hurt simply to include 
both sets of pages in one PDF.  My personal goals for scanning, in 
decreasing priority, are:

 1. Preserve the exact representation of a manual.

 2. Preserve the information in a manual in a usable format.

 3. Preserve the pages of a manual.

Even if you have to fall back to #3, something might turn up later that 
would allow you to go back and reorganize the pages into a more useful 
arrangement or into an original format.

  -- Dave



Re: Scanning incomplete and updated documents

2020-04-17 Thread J. David Bryan via cctalk
On Friday, April 17, 2020 at 18:06, Antonio Carlini via cctech wrote:

> pdftk works well for managing pages and PDFs.

I'll second that.  It's been invaluable for combining PDFs (such as when a 
supplier insists on delivering its general catalog in individual sections), 
as well as extracting subsets of pages.

  https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/

I find that the free one works fine for my applications.
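
As a rough illustration of those two jobs, the sketch below builds pdftk 
command lines for combining whole PDFs and for extracting a page range.  The 
file names and function names are hypothetical; only the "cat" and "output" 
keywords and the first-last range form come from pdftk itself.

```python
import shlex


def combine_command(sections, output):
    """Build a pdftk command that concatenates whole PDFs in order."""
    return ["pdftk", *sections, "cat", "output", output]


def extract_command(source, first, last, output):
    """Build a pdftk command that keeps pages first..last (1-based, inclusive)."""
    if first < 1 or last < first:
        raise ValueError("page range must satisfy 1 <= first <= last")
    return ["pdftk", source, "cat", f"{first}-{last}", "output", output]


def show(command):
    """Render an argument list as a copy-and-paste shell command line."""
    return shlex.join(command)
```

The lists can be passed to subprocess.run() as they are, which avoids 
shell-quoting problems with file names that contain spaces.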

  -- Dave



Re: Scanning incomplete and updated documents

2020-04-17 Thread J. David Bryan via cctalk
On Thursday, April 16, 2020 at 23:53, Frank McConnell wrote:

> The folks who did the 3000 manuals didn’t do that very often.  What we
> got were almost always update packets that simply replaced pages in
> their manuals.

To be fair, the 1000 (DSD) manuals were updated by replacement pages also.
What was odd and frustrating about the LSD folks was that a given update
might have two replacement pages and fifteen instructions to modify
existing pages.

Interesting that each division seemed to have its own rules.


> The 3000 folks I think had to make some effort to get beyond paste-up
> to where they were doing their manuals and updates in something, and
> it didn’t happen until the not-so-early 1980s.

The pre-1980 1000 software manuals, and pretty much all of the hardware
manuals, appeared to be typeset.  From around 1980 through 1986 or so,
the software manuals had typeset headings and chain-line-printed body text.
After that, they appear to have been mastered on a laser printer.


> Sometimes I thought the thing was TDP/3000 and the camera-ready copy
> came out of a 2680A or 2688A printer.

I recall receiving one manual that was actually printed on a 2680...on
fanfold blank letter-size paper.  Had to de-perf it and punch holes for a
binder.

  -- Dave



Re: Scanning incomplete and updated documents

2020-04-17 Thread Antonio Carlini via cctalk

On 17/04/2020 17:02, Alan Perry via cctalk wrote:


> Looking at the document, my case looks more interesting.
>
> As I noted, the two sets of appendices are mostly completely 
> different. There are 9 appendices that look like they are from the 
> primary document and 5 that look like they are from inserted pages. 
> Both sets start numbering from "A". There is one title shared between 
> the two sets.
>
> But the really interesting thing is that the copyright date on the 
> "primary" document is AFTER the date in the footers of the "inserted" 
> pages.
>
> FYI - the document is Computervision CADDStation System Software 
> Installation.
>
> So, any recommendations on what I should do?



I have no particular recommendation as such, other than to scan every 
page and to keep them all together: if they started life as one folder 
then I'd suggest that they should continue that way.


If you do that and a "better organisation" or "standard" appears later, 
it should be easy enough to post-process your PDF to arrange the pages 
to meet that new standard.
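
A minimal sketch of that kind of post-processing: given the order in which 
the scanned pages should finally appear, collapse it into range strings of 
the form pdftk's "cat" operation accepts.  The function name and the example 
page order are made up for illustration.

```python
def page_ranges(order):
    """Collapse a 1-based page order into pdftk-style range strings."""
    ranges = []
    start = prev = None
    for page in order:
        if page < 1:
            raise ValueError("page numbers are 1-based")
        if prev is not None and page == prev + 1:
            prev = page          # extend the current ascending run
            continue
        if start is not None:
            ranges.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = page      # begin a new run
    if start is not None:
        ranges.append(str(start) if start == prev else f"{start}-{prev}")
    return ranges
```

For example, page_ranges([1, 2, 3, 10, 11, 4, 5]) gives "1-3", "10-11" and 
"4-5", which could then be used as: pdftk in.pdf cat 1-3 10-11 4-5 output 
out.pdf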





> Also, any recommendations on a software tool to slice and dice PDFs 
> that is inexpensive?



pdftk works well for managing pages and PDFs. I've used pdfshuffler too, 
when I want a "visual" tool.



Antonio



--
Antonio Carlini
anto...@acarlini.com



Re: Scanning incomplete and updated documents

2020-04-17 Thread Alan Perry via cctalk




On 4/16/20 9:32 PM, J. David Bryan via cctech wrote:

> On Thursday, April 16, 2020 at 8:52, Alan Perry via cctech wrote:
>
>> Should I create two different pdfs with different appendix sections or
>> create a single pdf with both sets?
>
> Where both old and new pages were present, and where they could be
> differentiated clearly, I made a separate PDF for each manual printing.
> That is, I'd have two PDFs with the same part number with different print
> dates -- one containing the old (original) pages, and the other containing
> the new (replacement) pages.  See, for example:
>
>    /pdf/hp/64000/hardware/64161-90901_Jan-1984.pdf
>    /pdf/hp/64000/hardware/64161-90901_May-1984.pdf
>
> and the update ("Manual Change Sheet"):
>
>    /pdf/hp/64000/hardware/64161-90901-MCS_May-1984.pdf
>
> ...from which the later manual was created at Bitsavers.



Looking at the document, my case looks more interesting.

As I noted, the two sets of appendices are mostly completely different. 
There are 9 appendices that look like they are from the primary document 
and 5 that look like they are from inserted pages. Both sets start 
numbering from "A". There is one title shared between the two sets.


But the really interesting thing is that the copyright date on the 
"primary" document is AFTER the date in the footers of the "inserted" pages.


FYI - the document is Computervision CADDStation System Software 
Installation.


So, any recommendations on what I should do?

Also, any recommendations on a software tool to slice and dice PDFs that 
is inexpensive?


alan


Re: Scanning incomplete and updated documents

2020-04-17 Thread J. David Bryan via cctalk
On Thursday, April 16, 2020 at 23:18, Frank McConnell wrote:

> Sometimes I have come across shrink-wrapped manuals and later updates,
> and scanned them as found.  I wouldn’t want to deny other people the
> opportunity to apply updates to manuals, you know?

Oh yes, especially the ones that would say, "Insert the six paragraphs
below between the third and fourth lines of page 23."  :-)

I used to despise HP when they would send updates that consisted of
instructions to modify existing manual pages instead of sending replacement
pages.  I recall one update, maybe for the HP 64000 logic station mainframe
service manual, that instructed me to change a dozen or so schematics --
simple things, like "replace feedback resistor R23 with the active filter
circuit shown below."  That one got stuck in the front of the outdated
manual, as I just couldn't bring myself to butcher the pages as they
required.

When it came to manual updates, the Logic Systems Division was aptly named.

  -- Dave



Re: Scanning incomplete and updated documents

2020-04-17 Thread Frank McConnell via cctalk
On Apr 16, 2020, at 21:32, J. David Bryan via cctech wrote:
> 
> On Thursday, April 16, 2020 at 8:52, Alan Perry via cctech wrote:
> 
>> 1. One document is a software installation manual in a loose leaf
>> binder with other documents. It has a title page, tables of contents,
>> etc., several chapters, and then it gets interesting. It has several
>> appendix sections (starting at A), an index, then more appendix
>> sections (starting at A as well), and then another index. The document
>> title and its font in the second set of appendix sections and
>> second index match those in the table of contents and chapters.
> 
> I've scanned roughly 450 manuals.  What you describe might be the result of 
> a manual update.  Some updates include replacement pages, with the intent 
> that the replaced pages are discarded.  I've encountered manuals, though, 
> where both the old and new pages were kept, perhaps to retain a record of 
> the changes.

TRVTH.  I used to do exactly this when HP sent updates.  I put replaced
pages at the back of the manual, and usually did not refer to them thereafter.
When the binder filled up, that’s when I might consider discarding them.

HP had the habit of printing the update date and sometimes update number near
the bottom of the updated pages.

>> Should I create two different pdfs with different appendix sections or 
>> create a single pdf with both sets?
> 
> Where both old and new pages were present, and where they could be 
> differentiated clearly, I made a separate PDF for each manual printing.  
> That is, I'd have two PDFs with the same part number with different print 
> dates -- one containing the old (original) pages, and the other containing 
> the new (replacement) pages.  See, for example:

Sometimes I have come across shrink-wrapped manuals and later updates, and
scanned them as found.  I wouldn’t want to deny other people the opportunity
to apply updates to manuals, you know?

-Frank McConnell



Re: Scanning incomplete and updated documents

2020-04-17 Thread Frank McConnell via cctalk
On Apr 16, 2020, at 23:37, J. David Bryan wrote:
> On Thursday, April 16, 2020 at 23:18, Frank McConnell wrote:
> 
>> Sometimes I have come across shrink-wrapped manuals and later updates,
>> and scanned them as found.  I wouldn’t want to deny other people the
>> opportunity to apply updates to manuals, you know?
> 
> Oh yes, especially the ones that would say, "Insert the six paragraphs
> below between the third and fourth lines of page 23."  :-)

The folks who did the 3000 manuals didn’t do that very often.  What we
got were almost always update packets that simply replaced pages in their
manuals.  Once I think I remember getting a sticker to be stuck over the
replaced text on a page.

> I used to despise HP when they would send updates that consisted of
> instructions to modify existing manual pages instead of sending replacement
> pages.  I recall one update, maybe for the HP 64000 logic station mainframe
> service manual, that instructed me to change a dozen or so schematics --
> simple things, like "replace feedback resistor R23 with the active filter
> circuit shown below."  That one got stuck in the front of the outdated
> manual, as I just couldn't bring myself to butcher the pages as they
> required.
> 
> When it came to manual updates, the Logic Systems Division was aptly named.

The 3000 folks I think had to make some effort to get beyond paste-up
to where they were doing their manuals and updates in something, and it didn’t
happen until the not-so-early 1980s.  Sometimes I thought the thing was
TDP/3000 and the camera-ready copy came out of a 2680A or 2688A printer.  I
don’t really know how they got there or how it worked, and by the end of the
1980s I think they had moved away from TDP/3000 to something else.

-Frank McConnell



Re: Scanning incomplete and updated documents

2020-04-16 Thread J. David Bryan via cctalk
On Thursday, April 16, 2020 at 8:52, Alan Perry via cctech wrote:

> 1. One document is a software installation manual in a loose leaf
> binder with other documents. It has a title page, tables of contents,
> etc., several chapters, and then it gets interesting. It has several
> appendix sections (starting at A), an index, then more appendix
> sections (starting at A as well), and then another index. The document
> title and its font in the second set of appendix sections and
> second index match those in the table of contents and chapters.

I've scanned roughly 450 manuals.  What you describe might be the result of 
a manual update.  Some updates include replacement pages, with the intent 
that the replaced pages are discarded.  I've encountered manuals, though, 
where both the old and new pages were kept, perhaps to retain a record of 
the changes.


> Should I create two different pdfs with different appendix sections or 
> create a single pdf with both sets?

Where both old and new pages were present, and where they could be 
differentiated clearly, I made a separate PDF for each manual printing.  
That is, I'd have two PDFs with the same part number with different print 
dates -- one containing the old (original) pages, and the other containing 
the new (replacement) pages.  See, for example:

  /pdf/hp/64000/hardware/64161-90901_Jan-1984.pdf
  /pdf/hp/64000/hardware/64161-90901_May-1984.pdf

and the update ("Manual Change Sheet"):

  /pdf/hp/64000/hardware/64161-90901-MCS_May-1984.pdf

...from which the later manual was created at Bitsavers.
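
The three file names above suggest a pattern of part number, an optional 
"-MCS" marker for a change sheet, and the print date.  A small sketch that 
generates names in that pattern follows; the pattern is only inferred from 
those three examples, and the function name and the abbreviations for the 
other months are assumptions.

```python
def printing_filename(part_number, month, year, change_sheet=False):
    """Name a scan as <part>[-MCS]_<Mon>-<year>.pdf, one file per printing."""
    # Assumption: three-letter English month abbreviations, as in the
    # "Jan" and "May" examples.
    months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    suffix = "-MCS" if change_sheet else ""
    return f"{part_number}{suffix}_{months[month - 1]}-{year}.pdf"
```

Calling printing_filename("64161-90901", 5, 1984, change_sheet=True) 
reproduces the change sheet name shown above.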


> 2. One document is missing the title page and table of contents.
> Should the pdf just be what I have or should I create those pages for
> the pdf? 

If you know how the title page and TOC should appear, I would add them.  If 
you wish, you could add a note, such as "(Reconstructed)", as a page footer 
on the added pages.  I've reconstructed missing front and back cover pages 
where the manual is part of a family of manuals that all share the same 
basic cover and title page designs.

I think PDFs, ideally, should allow one to reprint an extinct manual in its 
entirety.  So I include the blank pages, covers, inserts, etc. from the 
original when I make mine.


> Thanks, 

You're welcome.

  -- Dave