Source: dvdbackup
Version: 0.4.2-4
Severity: wishlist
Tags: patch

Hi,

I'm currently in the process of digitizing my rather large DVD
collection with dvdbackup.

When encountering errors, dvdbackup currently has three options, each
with their own downsides:

- "-r a" causes dvdbackup to abort. No data is lost in the conversion
  process, but this is usually also not very helpful.
- "-r b" causes dvdbackup to skip ahead to the first block after the one
  that first failed. This loses as little data as possible, but since
  trying to read an unreadable block takes several seconds, doing so
  will cause the attempt to take forever if there are many unreadable
  blocks.
- The default "-r m", causes dvdbackup to skip the rest of the region it
  was trying to read. Since dvdbackup tries to read 512 blocks at a
  time, in the worst case this may mean it will skip over 511 blocks.
  Often the number of unreadable blocks isn't that large, so this throws
  away more data than is necessary.

As a middle ground, I've hacked up a fourth option that tries to do a
binary search for the last block that can be read successfully. It isn't
perfect (it may be confused over multiple contiguous areas), but it
should be faster than the "-r b" mode while not losing as much data as
the "-r m" mode. At the same time, since it will read multiple blocks
per failure, it will still be slower than the "-r m" mode.

Patch series are attached. They should apply as-is to git HEAD.

-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 
'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.3.0-1-amd64 (SMP w/8 CPU cores)
Locale: LANG=nl_BE.UTF-8, LC_CTYPE=nl_BE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
>From e3c17b33b828087008334af2131775078cc3a9ed Mon Sep 17 00:00:00 2001
From: Wouter Verhelst <wou...@debian.org>
Date: Sun, 10 Jan 2016 19:11:08 +0100
Subject: [PATCH 1/4] Use lseek() rather than writing zeroes

The result will still be zeroes, but the file will be sparsely created.
This has the advantage that we use less space (if there are many failed
reads), but more importantly, also allows to go backwards in the file if
needs be.

To make this less ugly, move the operation also after the
offset/remaining reassignments.
---
 src/dvdbackup.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/dvdbackup.c b/src/dvdbackup.c
index 5888ce5..924d60e 100644
--- a/src/dvdbackup.c
+++ b/src/dvdbackup.c
@@ -887,14 +887,15 @@ static int DVDCopyBlocks(dvd_file_t* dvd_file, int destination, int offset, int
 				break;
 			}
 
-			if (write(destination, buffer_zero, numBlanks * DVD_VIDEO_LB_LEN) != numBlanks * DVD_VIDEO_LB_LEN) {
-				fprintf(stderr, _("Error writing %s (padding)\n"), filename);
-				return 1;
-			}
-
 			/* pretend we read what we padded */
 			offset += numBlanks;
 			remaining -= numBlanks;
+
+			if (lseek(destination, (off_t)(offset * DVD_VIDEO_LB_LEN), SEEK_SET)) {
+				fprintf(stderr, _("Error seeking %s\n"), filename);
+				return 1;
+			}
+
 		}
 
 		if(progress) {
-- 
2.7.0.rc3

>From 4d4dbfe30a047f2d34a74f961829c3398e617894 Mon Sep 17 00:00:00 2001
From: Wouter Verhelst <wou...@debian.org>
Date: Sun, 10 Jan 2016 19:37:20 +0100
Subject: [PATCH 2/4] Implement binary search for skipping over bad data

Currently, dvdbackup has three options:
- Abort when encountering errors. This is not helpful.
- When encountering errors, skip over whatever we were trying to read
  and move on to the next part. If the amount of unreadable data (in
  this part of the DVD) is small, and we were at the beginning of the
  buffer, this will throw away a lot of data.
- When encountering errors, assume it's just the current block that
  fails and try to read the next. Since most DVD readers will already
  retry reading blocks several times, which can take minutes with some
  hardware, this mode takes forever if there are a large amount of
  unreadable blocks.

This patch adds a binary search mode to reduce both the number of blocks
read as well as the amount of data lost.

It may still lose data if the buffer we're trying to read has two
contiguous areas of unreadable data. I've decided not to care about
that.
---
 src/dvdbackup.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 src/dvdbackup.h |  3 ++-
 2 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/src/dvdbackup.c b/src/dvdbackup.c
index 924d60e..aeef2fb 100644
--- a/src/dvdbackup.c
+++ b/src/dvdbackup.c
@@ -41,6 +41,8 @@
 #include <dvdread/dvd_reader.h>
 #include <dvdread/ifo_read.h>
 
+#include <errno.h>
+
 
 #define MAXNAME 256
 
@@ -817,6 +819,9 @@ static int DVDCopyBlocks(dvd_file_t* dvd_file, int destination, int offset, int
 	float totalMiB = (float)(total) / 512.0f; // total size in [MiB]
 	int to_read = BUFFER_SIZE;
 	int act_read; /* number of buffers actually read */
+	int binary_last_bad = 0;
+	int binary_last_good = 0;
+	int binary_orig_start = 0;
 
 	/* Write buffer */
 	unsigned char buffer[BUFFER_SIZE * DVD_VIDEO_LB_LEN];
@@ -885,17 +890,80 @@ static int DVDCopyBlocks(dvd_file_t* dvd_file, int destination, int offset, int
 				numBlanks = to_read - act_read;
 				fprintf(stderr, _("padding %d blocks\n"), numBlanks);
 				break;
+
+			case STRATEGY_SEARCH_BINARY:
+				/* There are three possibilities of us
+				 * getting here:
+				 * - We're here the first time. In that
+				 *   case, we should try to read the last
+				 *   block of to_read to see if any part
+				 *   of the blocks we were trying to
+				 *   read can actually be read and start
+				 *   the binary search.
+				 * - We're here the second time. In that
+				 *   case, we assume they're all
+				 *   unreadable and skip to the next
+				 *   part.
+				 * - In all other cases, we verify if
+				 *   the just-failed block is one below
+				 *   the most recent successful block.
+				 *   If so, we skip ahead to the end of the
+				 *   part. If not, we skip to a point halfway
+				 *   between the last successful point and this
+				 *   point and try again.
+				 */
+				if(binary_last_bad == 0) {
+					// first time
+					binary_last_bad = binary_orig_start = offset;
+					numBlanks = to_read - act_read - 1;
+					to_read = 1;
+					fprintf(stderr, _("trying %d blocks ahead\n"), numBlanks);
+				} else if(binary_last_good < binary_last_bad) {
+					// second time
+					fprintf(stderr, _("moving on, padding %d blocks\n"), offset - binary_last_bad);
+					binary_last_bad = 0;
+					to_read = BUFFER_SIZE;
+				} else {
+					binary_last_bad = offset;
+					if(binary_last_bad + 1 == binary_last_good) {
+						fprintf(stderr, _("first readable block is %d, moving on (total skipped: %d)\n"), offset + 1, offset + 1 - binary_orig_start);
+						binary_last_bad = binary_last_good = 0;
+						to_read = BUFFER_SIZE;
+						numBlanks = 1;
+					} else {
+						numBlanks = (binary_last_good - offset) / 2;
+						fprintf(stderr, _("trying %d blocks ahead (gap size: %d)\n"), numBlanks, binary_last_good - offset);
+					}
+				}
 			}
 
 			/* pretend we read what we padded */
 			offset += numBlanks;
 			remaining -= numBlanks;
 
-			if (lseek(destination, (off_t)(offset * DVD_VIDEO_LB_LEN), SEEK_SET)) {
-				fprintf(stderr, _("Error seeking %s\n"), filename);
+			if (lseek(destination, (off_t)offset * DVD_VIDEO_LB_LEN, SEEK_SET) < 0) {
+				fprintf(stderr, _("Error seeking %s: %s\n"), filename, strerror(errno));
 				return 1;
 			}
-
+		} else {
+			if(errorstrat == STRATEGY_SEARCH_BINARY && binary_last_bad != 0) {
+				offset -= act_read;
+				binary_last_good = offset;
+				if(binary_last_bad + 1 == binary_last_good) {
+					fprintf(stderr, _("first readable block is %d, moving on (total skipped: %d)\n"), offset, offset - binary_orig_start);
+					binary_last_bad = binary_last_good = 0;
+					to_read = BUFFER_SIZE;
+				} else {
+					int search = (binary_last_good - binary_last_bad) / 2;
+					offset -= search;
+					remaining += search;
+					fprintf(stderr, _("success, trying %d blocks back (gap size: %d)\n"), search, binary_last_good - binary_last_bad);
+					if(lseek(destination, (off_t)offset * DVD_VIDEO_LB_LEN, SEEK_SET) < 0) {
+						fprintf(stderr, _("Error seeking %s: %s\n"), filename, strerror(errno));
+						return 1;
+					}
+				}
+			}
 		}
 
 		if(progress) {
diff --git a/src/dvdbackup.h b/src/dvdbackup.h
index d981403..a303ef0 100644
--- a/src/dvdbackup.h
+++ b/src/dvdbackup.h
@@ -32,7 +32,8 @@ extern int progress;
 typedef enum {
 	STRATEGY_ABORT,
 	STRATEGY_SKIP_BLOCK,
-	STRATEGY_SKIP_MULTIBLOCK
+	STRATEGY_SKIP_MULTIBLOCK,
+	STRATEGY_SEARCH_BINARY,
 } read_error_strategy_t;
 
 int DVDDisplayInfo(dvd_reader_t*, char*);
-- 
2.7.0.rc3

>From ea07ddb445461907e1e44d546c8429cd57795270 Mon Sep 17 00:00:00 2001
From: Wouter Verhelst <wou...@debian.org>
Date: Sun, 10 Jan 2016 19:48:53 +0100
Subject: [PATCH 3/4] Allow to select the binary search mode from the command
 line

---
 src/main.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/main.c b/src/main.c
index 8369968..ed5a57e 100644
--- a/src/main.c
+++ b/src/main.c
@@ -101,8 +101,9 @@ Print a friendly, customizable greeting.\n"), stdout); */
   -n, --name=NAME          set the title (useful if autodetection fails)\n\
   -a, --aspect=0           to get aspect ratio 4:3 instead of 16:9 if both are\n\
                            present\n\
-  -r, --error={a,b,m}      select read error handling: a=abort, b=skip block,\n\
-                           m=skip multiple blocks (default)\n\
+  -r, --error={a,b,m,B}    select read error handling: a=abort, b=skip block,\n\
+			   m=skip multiple blocks (default),B=do a binary\n\
+			   search for the next readable block\n\
   -p, --progress           print progress information while copying VOBs\n\n"));
 
 	printf(_("\
@@ -284,6 +285,8 @@ int main(int argc, char* argv[]) {
 			errorstrat=STRATEGY_SKIP_BLOCK;
 		} else if(errorstrat_temp[0]=='m') {
 			errorstrat=STRATEGY_SKIP_MULTIBLOCK;
+		} else if(errorstrat_temp[0]=='B') {
+			errorstrat=STRATEGY_SEARCH_BINARY;
 		} else {
 			print_help();
 			exit(1);
-- 
2.7.0.rc3

>From 3a3538a9d78d59054705dd26315aa6d5721deef3 Mon Sep 17 00:00:00 2001
From: Wouter Verhelst <wou...@debian.org>
Date: Sun, 10 Jan 2016 19:49:27 +0100
Subject: [PATCH 4/4] Documentation update

---
 man/dvdbackup.1 | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/man/dvdbackup.1 b/man/dvdbackup.1
index 85118f7..175484e 100644
--- a/man/dvdbackup.1
+++ b/man/dvdbackup.1
@@ -76,11 +76,12 @@ print more information about progress
 .B \-a 0, \-\-aspect=0
 to get aspect ratio 4:3 instead of 16:9 if both are present
 .TP
-.B  \-r {a,b,m}, \-\-error={a,b,m}
+.B  \-r {a,b,m,B}, \-\-error={a,b,m,B}
 select read error handling:
 a=abort,
 b=skip block,
-m=skip multiple blocks (default)
+m=skip multiple blocks (default),
+B=do a binary search
 .TP
 .B \-p, \-\-progress
 print progress information while copying VOBs
-- 
2.7.0.rc3

Reply via email to