[Rd] [WISH / PATCH] possibility to split string literals across multiple lines

Andreas Kersting Wed, 14 Jun 2017 03:59:34 -0700

Hi,

I would really like to have a way to split long string literals across multiple lines in R.

Currently, if a string literal spans multiple lines, there is no way to inhibit the introduction of newline characters:


> "aaa
+ bbb"
[1] "aaa\nbbb"


If a line ends with a backslash, the backslash is simply ignored and the newline is still kept:

> "aaa\
+ bbb"
[1] "aaa\nbbb"

We could use this fact to implement string splitting in a fairly backward-compatible way: such trailing backslashes should rarely be used today, since they have no effect. The attached patch makes the parser ignore a newline character directly following a backslash:


> "aaa\
+ bbb"
[1] "aaabbb"

I personally would also prefer if leading blanks (spaces and tabs) in the second line are ignored to allow for proper indentation:


>   "aaa \
+    bbb"
[1] "aaa bbb"

>   "aaa\
+    \ bbb"
[1] "aaa bbb"

This is also implemented by this patch.
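
To make the intended rules explicit, here is a minimal stand-alone sketch in C. It is not the parser code from the patch: `join_continued_lines` is a hypothetical helper that works on an already-read string body, and it only models the three rules described above (backslash-newline is dropped, blanks at the start of the continued line are dropped, a bare newline is kept). All other escapes are passed through untouched, which is a simplification.

```c
/* Hypothetical helper, not part of R: copies the body of a string literal
 * from src to dst while applying the rules proposed above.
 * dst must have room for at least strlen(src) + 1 bytes. */
static void join_continued_lines(const char *src, char *dst)
{
    int skip_blanks = 0;   /* set right after a backslash-newline pair */

    while (*src != '\0') {
        char c = *src++;

        if (skip_blanks) {
            if (c == ' ' || c == '\t')
                continue;              /* drop indentation of the continued line */
            skip_blanks = 0;
        }
        if (c == '\\' && *src != '\0') {
            char next = *src++;
            if (next == '\n') {        /* backslash-newline: drop both characters */
                skip_blanks = 1;
                continue;
            }
            if (next == ' ') {         /* "\ " is an escaped blank */
                *dst++ = ' ';
                continue;
            }
            *dst++ = c;                /* any other escape is copied unchanged */
            *dst++ = next;
            continue;
        }
        *dst++ = c;                    /* ordinary character, including a bare newline */
    }
    *dst = '\0';
}
```

With this sketch, the C string `"aaa \\\n   bbb"` (that is, `aaa `, a backslash, a newline, three spaces, `bbb`) comes out as `aaa bbb`, while `"aaa\nbbb"` without a backslash keeps its newline.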


An alternative approach could be to have something like

("aaa "
"bbb")

or

("aaa ",
"bbb")

be interpreted as "aaa bbb".
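
For comparison only: the first variant is how C behaves, where adjacent string literals are concatenated by the compiler. The snippet below is just an illustration of that C rule (the variable name `joined` is made up); it says nothing about how R's parser would have to implement it.

```c
/* Adjacent string literals are merged by the C compiler,
 * so no newline or separator ends up in the resulting string. */
static const char *const joined =
    "aaa "
    "bbb";
```

Here `joined` points to the single string `aaa bbb`.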

I don't know the ins and outs of R's parser (hence: please review the attached patch very carefully), but I guess this would be more work to implement.

What do you think? Is there anybody else who is missing this feature in the first place?


Regards,
Andreas

Index: src/main/gram.c
===================================================================
--- src/main/gram.c	(revision 72789)
+++ src/main/gram.c	(working copy)
@@ -4646,10 +4646,17 @@
     int wcnt = 0;
     ucs_t wcs[10001];
     Rboolean oct_or_hex = FALSE, use_wcs = FALSE, currtext_truncated = FALSE;
+    Rboolean backslash_was_newline = FALSE, ignore_next_blank = FALSE;
 
     CTEXT_PUSH(c);
     while ((c = xxgetc()) != R_EOF && c != quote) {
 	CTEXT_PUSH(c);
+	if (ignore_next_blank) {
+	    if (c == ' ' || c == '\t')
+	        continue;
+	    else
+	        ignore_next_blank = FALSE;
+	} 
 	if (c == '\n') {
 	    xxungetc(c); CTEXT_POP();
 	    /* Fix suggested by Mark Bravington to allow multiline strings
@@ -4657,6 +4664,7 @@
 	     * return ERROR;
 	     */
 	    c = '\\';
+	    backslash_was_newline = TRUE;
 	}
 	if (c == '\\') {
 	    c = xxgetc(); CTEXT_PUSH(c);
@@ -4815,8 +4823,14 @@
 		case '\'':
 		case '`':
 		case ' ':
+		    break;
 		case '\n':
-		    break;
+		    if (backslash_was_newline) {
+		        backslash_was_newline = FALSE;
+		        break;
+		    }
+		    ignore_next_blank = TRUE;
+		    continue;
 		default:
 		    *ct = '\0';
 		    errorcall(R_NilValue, _("'\\%c' is an unrecognized escape in character string starting \"%s\""), c, currtext);
Index: src/main/gram.y
===================================================================
--- src/main/gram.y	(revision 72789)
+++ src/main/gram.y	(working copy)
@@ -2308,10 +2308,17 @@
     int wcnt = 0;
     ucs_t wcs[10001];
     Rboolean oct_or_hex = FALSE, use_wcs = FALSE, currtext_truncated = FALSE;
+    Rboolean backslash_was_newline = FALSE, ignore_next_blank = FALSE;
 
     CTEXT_PUSH(c);
     while ((c = xxgetc()) != R_EOF && c != quote) {
 	CTEXT_PUSH(c);
+	if (ignore_next_blank) {
+	    if (c == ' ' || c == '\t')
+	        continue;
+	    else
+	        ignore_next_blank = FALSE;
+	} 
 	if (c == '\n') {
 	    xxungetc(c); CTEXT_POP();
 	    /* Fix suggested by Mark Bravington to allow multiline strings
@@ -2319,6 +2326,7 @@
 	     * return ERROR;
 	     */
 	    c = '\\';
+	    backslash_was_newline = TRUE;
 	}
 	if (c == '\\') {
 	    c = xxgetc(); CTEXT_PUSH(c);
@@ -2477,8 +2485,14 @@
 		case '\'':
 		case '`':
 		case ' ':
+		    break;
 		case '\n':
-		    break;
+		    if (backslash_was_newline) {
+		        backslash_was_newline = FALSE;
+		        break;
+		    }
+		    ignore_next_blank = TRUE;
+		    continue;
 		default:
 		    *ct = '\0';
 		    errorcall(R_NilValue, _("'\\%c' is an unrecognized escape in character string starting \"%s\""), c, currtext);

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
