Summary: Saving viewed source loses spaces in the middle of lines.
Description: When viewing HTML source (\), if lines are wider than the
terminal window and have spaces in the right place, the spaces are lost
when the source is saved with the print functions. This is not the loss
of extra spaces or the conversion of tabs into spaces, but the loss of
the only space between words, HTML tag attributes, etc. It happens when
the last character displayed on a window's line is a space, and the first
character on the next line of the display (after the plus sign) is not a
space. For an 80 column window, the problem is seen when the HTML source
has a space in column 80 and a non-space character in column 81.
Depending on where the space is lost, it might only cause spelling errors
in text, but it will also cause HTML errors when a space is lost in the
middle of an HTML tag. If the HTML source has tabs, the problem columns
will vary due to tab expansion.
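The column condition described above can be sketched as a small check. This is only an illustration and not Lynx code: the function name space_lost_at_wrap, the tab stop of 8, and the decision to look only at the first wrap point are all assumptions.

```c
#include <stddef.h>

#define TAB_STOP 8              /* assumed tab width, not taken from Lynx */

/*
 * Hypothetical helper: returns 1 if the character that fills display
 * column `width` is a blank and the next character is not, which is the
 * situation the report describes for the first wrap point of a line.
 */
static int space_lost_at_wrap(const char *line, int width)
{
    int col = 0;                /* display columns consumed so far */
    size_t i;

    for (i = 0; line[i] != '\0' && line[i] != '\n'; i++) {
        char c = line[i];
        char next = line[i + 1];

        if (c == '\t')
            col = (col / TAB_STOP + 1) * TAB_STOP;  /* jump to next tab stop */
        else
            col++;

        if (col == width)
            return (c == ' ' || c == '\t')
                && next != '\0' && next != '\n'
                && next != ' ' && next != '\t';
        if (col > width)
            return 0;           /* a tab jumped past the wrap column */
    }
    return 0;                   /* line is shorter than the window */
}
```

With width 80, a line of 79 letters, one space, and another letter is flagged. A line of one tab, 71 letters, one space, and another letter is flagged as well, because the tab shifts the space onto column 80; the same line without the tab is not.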
Environment: The problem was first seen in Lynx 2.8.7dev.4 installed
on Debian Etch "Linux 2.6.18-4-686" with Debian's "lynx-cur" package.
The problem was reproduced on 2.8.7dev.8 compiled from source on Mandrake
9 "Linux 2.4.19-16mdk" with various configuration options tried, including
enabling and disabling prettysrc. Also seen on 2.8.7dev.7 previously
compiled from source. The problem was not seen with 2.8.5rel.1 compiled
from source with the same options used for 2.8.7dev.8.
Recreating: A test page (test2.html) is attached that has the problem in
various locations. Hopefully it is self-explanatory. The best use is to
view the page in lynx, switch to source view, print to a file, and compare
the saved version with the original. Best done with prettysrc disabled.
The problem was also seen when viewing the "source" of e-mails and plain
text files that have spaces at the window width.
Cause: It looks like the problem is caused by the new "TrimmedLength"
subroutine in "GridText.c". Looking through the CHANGES file, this was
probably added in 2.8.6dev.9 in response to Debian #204515. The routine
cannot tell whether the "trailing blanks" on a displayed line belong to a
continued HTML source line, in which case they are needed when the
displayed lines are joined and written to a file.
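To illustrate the suspected mechanism, here is a minimal sketch that cuts a long line into display rows, trims trailing blanks from each row, and joins the rows again. It is not the Lynx code path: trimmed_length and wrap_trim_join are made-up names, and the plus sign that marks continuation rows is ignored.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for trailing-blank trimming; not the Lynx function. */
static size_t trimmed_length(const char *s, size_t len)
{
    while (len > 0 && isspace((unsigned char) s[len - 1]))
        len--;
    return len;
}

/*
 * Cut `src` into rows of `width` columns, trim each row, and append the
 * rows back to back into `out`, roughly as a continued source line is
 * rejoined when written to a file.  `out` must hold strlen(src) + 1 bytes.
 */
static void wrap_trim_join(const char *src, size_t width, char *out)
{
    size_t remaining = strlen(src);
    size_t used = 0;

    if (width == 0)
        width = 1;              /* avoid looping forever on a zero width */

    while (remaining > 0) {
        size_t row = remaining < width ? remaining : width;
        size_t keep = trimmed_length(src, row);

        memcpy(out + used, src, keep);
        used += keep;
        src += row;
        remaining -= row;
    }
    out[used] = '\0';
}
```

With a width of 3, the input "ab cd" comes back as "abcd" because the only space sits in the last column of the first row, while "abc d" survives unchanged because its space starts the second row.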
Solution: My solution was to add a check for source mode shortly after
the original string length is determined. If in source mode, just
return the original length and skip all trim operations.
The added code is: if (HTisDocumentSource) return result;
Attached is a patch file created using the 2.8.7dev.8 code and
this command: LC_ALL=C TZ=UTC0 diff -Naur (Original_dir) (Patched_dir)
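The shape of the change can be shown with a simplified, self-contained stand-in for the function. This is a sketch only: in_source_mode is a hypothetical flag standing in for HTisDocumentSource, whose declaration is not shown in this post, and the real routine also skips special attribute characters, which is left out here.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for Lynx's HTisDocumentSource; assumed to be a flag. */
static int in_source_mode = 0;

/* Simplified sketch of a trimmed-length routine with the source-mode check. */
static int trimmed_length_sketch(const char *string)
{
    int result = (int) strlen(string);
    int adjust = result;

    /* In source mode, keep trailing blanks: they may be the only space
     * between two words of a continued source line. */
    if (in_source_mode)
        return result;

    while (adjust > 0 && isspace((unsigned char) string[adjust - 1]))
        adjust--;
    return adjust;
}
```

With the flag clear, "abc  " has a trimmed length of 3; with the flag set, the full length of 5 is returned and nothing is trimmed.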
Caveat: I tested the "Save to a local file", "Print to the screen", and
"View formatted" Printing Options, but didn't test "Mail the file" or
"Print out on a printer attached ...". My change might break something I
didn't test. The problem might also show up with things viewed in normal
(non-source) mode, so the test for HTisDocumentSource wouldn't fix them,
but I didn't find any. There also might be a more elegant solution that
I didn't find.
Mike Knight, just a user
Title: Test wrapping in source view
Test a long line with spaces between words. Shifted to different columns for testing on different window widths. Where there is a problem, there will be two words merged together.
Lynx will report this as bad HTML due to samples taken out of context.
Sample with one letter and one space.
A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A
A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A
Same with two letters and two spaces.
BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB
B BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB
BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB
BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB BB
Same with three letters and three spaces.
CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC
CC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC
C CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC
CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC
CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC
CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC CCC
Samples from lynx documentation that have problems when viewed with lynx as source on an 80 column wide window and the source is saved with "Save to a local file" or viewed in an editor with "View formatted".
Sample lines from lynx_help/Lynx_users_guide.html
(If color support is instead provided by a color-capable curses
description to determine whether color mode is possible, and
Sample line from lynx_help/lynx-dev.html
[ Lynx-Dev Archive |
Sample line from /WWW/FreeofCharge.html. It is a comment, so nothing is displayed in the rendered view.
Samples from other web pages that have problems when viewed with lynx as source on an 80 column wide window and the source is saved with "Save to a local file" or viewed in an editor with "View formatted".
Sample lines from Google cached version of a Smart Computing article
| These search terms have been highlighted: | recovery | console | to | rebuild | 400 | million | spontaneously | fixboot | bootcfg |
Sample lines from TV listings
Sample line from a newsletter web page
diff -Naur lynx2-8-7dev.8/src/GridText.c lynx2-8-7dev.8-fix_source/src/GridText.c
--- lynx2-8-7dev.8/src/GridText.c 2008-02-17 22:00:58.000000000 +0000
+++ lynx2-8-7dev.8-fix_source/src/GridText.c 2008-02-21 20:55:17.000000000 +0000
@@ -7849,6 +7849,9 @@
     int adjust = result;
     unsigned ch;
+    if (HTisDocumentSource)
+        return result;
+
     while (adjust > 0) {
         ch = UCH(string[adjust - 1]);
         if (isspace(ch) || IsSpecialAttrChar(ch)) {
