All,

I think I've found a bug in Gtk+2 in gtk_text_buffer_insert() of a '\n' following an existing '\r' in a GTK_TEXT_BUFFER (to convert end-of-line from `CR` (Mac pre-OS X) to `CRLF`). When I locate the existing '\r' end-of-line with gtk_text_iter_forward_to_line_end() and gtk_text_iter_forward_char() and check that the next char is NOT '\n', I simply want to insert a '\n' to make the conversion. However, gtk_text_buffer_insert() inserts '\r\n' (0x0d 0x0a) instead of '\n', making the line-end '\r\r\n'. I don't say "I think I've found a bug" lightly, and I can already hear the "You screwed up buddy..." coming, but bear with me.

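For reference, the byte-level result being asked for can be sketched without GTK at all. The helper below is purely illustrative: the name cr_to_crlf() and its signature are assumptions made up for this sketch, it is not a GTK function, and it is not the code from the application discussed here. It only shows what "insert a '\n' after a lone '\r'" should produce on a plain NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of GTK): copy the NUL-terminated string
 * `in` to `out`, adding '\n' after every '\r' that is not already
 * followed by '\n'.  Returns the length written (excluding the '\0'),
 * or -1 if `out` (capacity `outcap` bytes) is too small. */
long cr_to_crlf (const char *in, char *out, size_t outcap)
{
    size_t len, i, n = 0;

    assert (in != NULL && out != NULL);
    if (outcap == 0)
        return -1;

    len = strlen (in);
    for (i = 0; i < len; i++) {
        /* in[len] is the terminating '\0', so in[i + 1] is always readable */
        int lone_cr = (in[i] == '\r' && in[i + 1] != '\n');
        size_t need = lone_cr ? 2 : 1;

        if (n + need + 1 > outcap)      /* + 1 leaves room for '\0' */
            return -1;

        out[n++] = in[i];
        if (lone_cr)
            out[n++] = '\n';            /* CR becomes CRLF, never CR CR LF */
    }
    out[n] = '\0';
    return (long) n;
}
```

With this sketch, "my\rdog\r" becomes "my\r\ndog\r\n": every 0d turns into 0d 0a, and an existing 0d 0a is left alone. That is the output the text-buffer conversion is expected to match.
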
By simply changing the code to call gtk_text_buffer_backspace() to remove the existing '\r' and then gtk_text_buffer_insert() to insert '\r\n', the end-of-line is correctly converted from '\r' to '\r\n'. There is no reason I should have to backspace over '\r' and then insert '\r\n' instead of just inserting '\n'. Thus the bug is in how gtk_text_buffer_insert() handles inserting a single '\n' following an existing '\r'.

This is a summary of the question (details follow afterwards). Why does the following fail?

    if (gtk_text_iter_get_char (&iter) != '\n') {
        gtk_text_buffer_insert (buffer, &iter, app->eolstr[LF], -1);
    }

When the following works?

    if (gtk_text_iter_get_char (&iter) != '\n') {
        if (gtk_text_buffer_backspace (buffer, &iter, FALSE, TRUE)) {
            gtk_text_buffer_get_iter_at_mark (buffer, &iter, app->last_pos);
            gtk_text_buffer_insert (buffer, &iter, app->eolstr[CRLF], -1);
        }
    }

This is either a bug, or Gtk+2 considers CRLF a single char and will NOT allow manual creation of '\r\n' by inserting a '\n' following an existing '\r' in a buffer (which is probably where the bug (or issue) is, but I don't know where to begin to look in the Gtk+2 source). I can remove the existing '\r' and insert '\r\n' and everything is fine, but I cannot place a single '\n' after an existing '\r' in the buffer and have it work.

Here are the details for two test cases whittled down to readable examples. First, the boilerplate declarations and initialization of the EOL string and struct values:

    #define EOL_LF     "\n"
    #define EOL_CR     "\r"
    #define EOL_CRLF   "\r\n"
    #define EOL_NO     3

    #define EOLNM_LF   "LF"
    #define EOLNM_CR   "CR"
    #define EOLNM_CRLF "CRLF"

    enum eolorder { LF, CRLF, CR };  /* global constants for LF, CRLF and CR */

Declaration of the struct holding the values:

    typedef struct {
        ...
        gint eol;                   /* end-of-line */
        gchar *eolnm[EOL_NO];       /* ptrs to eol names */
        gchar *eolstr[EOL_NO];      /* ptrs to eol strings */
        ...
        GtkTextMark *last_pos;      /* position of last match in buf */
        ...
    } kwinst;

Initializations of the struct values passed through app (the struct instance is named 'app'):

    kwinst *app = NULL;             /* replaced GtkWidget *window */
    app = g_slice_new (kwinst);     /* allocate mem for struct */
    context_init (app);             /* initialize struct values */

Within context_init (app), you have:

    #ifndef HAVEMSWIN
        app->eol = LF;              /* default line end LF */
    #else
        app->eol = CRLF;            /* default line end CRLF */
    #endif
    app->eolstr[0] = EOL_LF;        /* eol ending strings */
    app->eolstr[1] = EOL_CRLF;
    app->eolstr[2] = EOL_CR;
    app->eolnm[0] = EOLNM_LF;       /* eol string names */
    app->eolnm[1] = EOLNM_CRLF;
    app->eolnm[2] = EOLNM_CR;

The test file for the conversion loaded into the buffer is a 'CR' delimited file, e.g.

    $ hexdump -C eol_cr.txt
    00000000  6d 79 0d 64 6f 67 0d 20  20 68 61 73 0d 20 20 66  |my.dog.  has.  f|
    00000010  6c 65 61 73 0d 61 20 6c  6f 74 0d 20 20 20 20 6f  |leas.a lot.    o|
    00000020  66 20 66 6c 65 61 73 0d                           |f fleas.|
    00000028

In human readable form:

    $ cat eol_cr.txt
    my
    dog
      has
      fleas
    a lot
        of fleas

On file open, the 'CR' end-of-line is properly detected and app->eol is set to CR. From a menu, the user can choose between CR, CRLF and LF line ends. The relevant parts of the function called to change from CR to CRLF, which exposed the problem, are:

    void buffer_convert_eol (kwinst *app)
    {
        GtkTextBuffer *buffer = GTK_TEXT_BUFFER(app->buffer);
        GtkTextIter iter;
        ...
        /* get iter at start of buffer */
        gtk_text_buffer_get_start_iter (buffer, &iter);

        /* set app->last_pos Mark to start, and move on each iteration */
        app->last_pos = gtk_text_buffer_create_mark (buffer, "last_pos", &iter, FALSE);

        /* loop, moving to the end of each line, before the EOL chars */
        while (gtk_text_iter_forward_to_line_end (&iter)) {

            gunichar c = gtk_text_iter_get_char (&iter);
            gtk_text_buffer_move_mark (buffer, app->last_pos, &iter);

            if (c == '\n') {            /* if end-of-line begins with LF */
                ...
            }
            else if (c == '\r') {       /* if end-of-line begins with CR */
                if (app->eol == LF) {   /* handle change to LF */
                    ...
                }
                else if (app->eol == CRLF) {    /* handle change to CRLF */
                    gtk_text_iter_forward_char (&iter);
                    if (gtk_text_iter_get_char (&iter) != '\n') {   /* if not '\n' */
                        /* just insert '\n' */
                        /* CODE THAT PRODUCES THE BUG */
                        gtk_text_buffer_insert (buffer, &iter, app->eolstr[LF], -1);
                    }
                    ...
                }
                gtk_text_buffer_get_iter_at_mark (buffer, &iter, app->last_pos);
            }
            ...
    }

The resulting file is:

    $ hexdump -C eol_messcr.txt
    00000000  6d 79 0d 0d 0a 64 6f 67  0d 0d 0a 20 20 68 61 73  |my...dog...  has|
    00000010  0d 0d 0a 20 20 66 6c 65  61 73 0d 0d 0a 61 20 6c  |...  fleas...a l|
    00000020  6f 74 0d 0d 0a 20 20 20  20 6f 66 20 66 6c 65 61  |ot...    of flea|
    00000030  73 0d 0d 0a                                       |s...|
    00000034

That's just wrong. A '\r\n' is inserted following the existing '\r' instead of '\n' alone. (It looks like gtk_text_buffer_insert() anticipates that a CRLF should be created out of the '\r' and the inserted '\n', but leaves the original '\r' unchanged and, in fact, inserts a CRLF of its own -- bizarre.)

Here is the same branch of the function with the backspace over '\r' and insert of '\r\n' that works as intended:

                else if (app->eol == CRLF) {    /* handle change to CRLF */
                    gtk_text_iter_forward_char (&iter);
                    if (gtk_text_iter_get_char (&iter) != '\n') {   /* if not '\n' */
                        /* then backspace, reinit iter, and insert '\r\n' */
                        if (gtk_text_buffer_backspace (buffer, &iter, FALSE, TRUE)) {
                            gtk_text_buffer_get_iter_at_mark (buffer, &iter, app->last_pos);
                            gtk_text_buffer_insert (buffer, &iter, app->eolstr[CRLF], -1);
                        }
                    }
                }

The '\r' is removed and replaced by '\r\n', and the resulting file produced is:

    $ hexdump -C eol_messcr2.txt
    00000000  6d 79 0d 0a 64 6f 67 0d  0a 20 20 68 61 73 0d 0a  |my..dog..  has..|
    00000010  20 20 66 6c 65 61 73 0d  0a 61 20 6c 6f 74 0d 0a  |  fleas..a lot..|
    00000020  20 20 20 20 6f 66 20 66  6c 65 61 73 0d 0a        |    of fleas..|
    0000002e

Why does the following fail?

    if (gtk_text_iter_get_char (&iter) != '\n') {
        gtk_text_buffer_insert (buffer, &iter, app->eolstr[LF], -1);
    }

When the following works?

    if (gtk_text_iter_get_char (&iter) != '\n') {
        if (gtk_text_buffer_backspace (buffer, &iter, FALSE, TRUE)) {
            gtk_text_buffer_get_iter_at_mark (buffer, &iter, app->last_pos);
            gtk_text_buffer_insert (buffer, &iter, app->eolstr[CRLF], -1);
        }
    }

Either I'm crazy and have botched something (which is always possible, but unlikely here), or there is a bug in Gtk+2 (latest) gtk_text_buffer_insert() that is triggered by attempting to insert a one-character string "\n" after an existing '\r' in the buffer.

Maybe gtk_text_iter_forward_to_line_end (&iter) correctly moves to the end and gtk_text_iter_forward_char (&iter) correctly moves to the next position following '\r', but gtk_text_buffer_insert (buffer, &iter, "\n", -1) fails to insert a '\n' following the existing '\r' and instead inserts '\r\n' -- which is just wrong.

Is this a bug? If so, I'll report it. If not, then I'm sure there is a deeper explanation for why the obvious won't work.

-- 
David C. Rankin, J.D.,P.E.
_______________________________________________
gtk-app-devel-list mailing list
gtk-app-devel-list@gnome.org
https://mail.gnome.org/mailman/listinfo/gtk-app-devel-list