With this change, xgettext can report common syntactic problems in
strings to be extracted. The current built-in checks are
ellipsis-unicode, space-ellipsis, and quote-unicode. These checks can
be enabled with the --check option of xgettext and disabled with a
special "xgettext:" comment in source files.
Feature suggested by Philip Withnall in:
https://savannah.gnu.org/bugs/?44098

* gettext-tools/src/message.h (enum syntax_check_type): New enum.
(NSYNTAXCHECKS): New constant.
(enum is_syntax_check): New enum.
(struct message_ty): New field 'do_syntax_check'.
(syntax_check_name): New variable declaration.
* gettext-tools/src/message.c (syntax_check_name): New variable.
* gettext-tools/src/msgl-cat.c (catenate_msgdomain_list): Propagate
mp->do_syntax_check.
* gettext-tools/src/msgmerge.c (message_merge): Propagate
ref->do_syntax_check.
* gettext-tools/src/msgl-check.h (syntax_check_message_list): New
declaration.
* gettext-tools/src/msgl-check.c (syntax_check_ellipsis_unicode): New
function.
(syntax_check_space_ellipsis): New function.
(syntax_check_quote_unicode): New function.
(syntax_check_message): New function.
(syntax_check_message_list): New function.
* gettext-tools/src/read-catalog-abstract.h (po_parse_comment_special):
Adjust function declaration.
* gettext-tools/src/read-catalog-abstract.c (po_parse_comment_special):
Add new argument SCP for syntax checking; all callers changed.
* gettext-tools/src/read-catalog.h (DEFAULT_CATALOG_READER_TY): New
field 'do_syntax_check'.
* gettext-tools/src/read-catalog.c (default_constructor): Initialize
this->do_syntax_check.
(default_copy_comment_state): Propagate this->do_syntax_check.
* gettext-tools/src/xgettext.c (long_options): Add --check option.
(main): Handle --check option.
(usage): Document --check option.
(remember_a_message): Propagate do_syntax_check value.
* gettext-tools/tests/xgettext-13: New file. * gettext-tools/tests/Makefile.am (TESTS): Add new test. * gettext-tools/doc/xgettext.texi: Document --check option. --- gettext-tools/doc/ChangeLog | 4 + gettext-tools/doc/xgettext.texi | 36 ++++++++ gettext-tools/src/ChangeLog | 39 ++++++++ gettext-tools/src/message.c | 12 +++ gettext-tools/src/message.h | 26 ++++++ gettext-tools/src/msgl-cat.c | 13 +++ gettext-tools/src/msgl-check.c | 144 ++++++++++++++++++++++++++++++ gettext-tools/src/msgl-check.h | 4 +- gettext-tools/src/msgmerge.c | 3 + gettext-tools/src/read-catalog-abstract.c | 35 +++++++- gettext-tools/src/read-catalog-abstract.h | 3 +- gettext-tools/src/read-catalog.c | 8 +- gettext-tools/src/read-catalog.h | 1 + gettext-tools/src/xgettext.c | 67 +++++++++++++- gettext-tools/tests/ChangeLog | 5 ++ gettext-tools/tests/Makefile.am | 1 + gettext-tools/tests/xgettext-13 | 99 ++++++++++++++++++++ 17 files changed, 492 insertions(+), 8 deletions(-) create mode 100755 gettext-tools/tests/xgettext-13 diff --git a/gettext-tools/doc/ChangeLog b/gettext-tools/doc/ChangeLog index edac431..645c580 100644 --- a/gettext-tools/doc/ChangeLog +++ b/gettext-tools/doc/ChangeLog @@ -1,3 +1,7 @@ +2015-02-04 Daiki Ueno <[email protected]> + + * xgettext.texi: Document --check option. + 2015-02-03 Daiki Ueno <[email protected]> * msgexec.texi, msgfilter.texi: Fix markup error caused by commit diff --git a/gettext-tools/doc/xgettext.texi b/gettext-tools/doc/xgettext.texi index 451e25f..1fb4bc1 100644 --- a/gettext-tools/doc/xgettext.texi +++ b/gettext-tools/doc/xgettext.texi @@ -144,6 +144,42 @@ gettext ( The second comment line will not be extracted, because there is one blank line between the comment line and the keyword. +@item -W[@var{CHECK}] +@itemx --check[=@var{CHECK}] +@opindex -W@r{, @code{xgettext} option} +@opindex --check@r{, @code{xgettext} option} +@cindex supported syntax checks, @code{xgettext} +Perform a syntax check on msgid and msgid_plural. 
The supported checks +are: + +@table @samp +@item ellipsis-unicode +Prefer Unicode ellipsis character over ASCII @code{...} + +@item space-ellipsis +Prohibit whitespace before an ellipsis character + +@item quote-unicode +Prefer Unicode quotation marks over ASCII @code{"'`} + +@end table + +This option applies to all input files. To enable or disable +checks for a particular string, mark it with an @code{xgettext:} comment in the source +file. For example, if you specify the @code{-Wspace-ellipsis} option, but +want to suppress the check on a particular string, add a special comment: + +@example +/* xgettext: no-space-ellipsis-check */ +gettext ("We really really need to output ..."); +@end example + +The special @code{xgettext:} comment can be followed by flags separated +by commas. The possible flags are of the form +@samp{[no-]@var{name}-check}, where @var{name} is the name of one +of the valid syntax checks. If a flag is prefixed by @code{no-}, the +meaning is negated. + @end table @subsection Language specific options diff --git a/gettext-tools/src/ChangeLog b/gettext-tools/src/ChangeLog index 633ec9e..7a542b9 100644 --- a/gettext-tools/src/ChangeLog +++ b/gettext-tools/src/ChangeLog @@ -1,3 +1,42 @@ +2015-02-04 Daiki Ueno <[email protected]> + + xgettext: Support message syntax checks + With this change, xgettext can report common syntactic problems + in strings to be extracted. The current built-in checks are + ellipsis-unicode, space-ellipsis, and quote-unicode. These checks + can be enabled with the --check option of xgettext and disabled with + a special "xgettext:" comment in source files. + Feature suggested by Philip Withnall in: + https://savannah.gnu.org/bugs/?44098 + * message.h (enum syntax_check_type): New enum. + (NSYNTAXCHECKS): New constant. + (enum is_syntax_check): New enum. + (struct message_ty): New field 'do_syntax_check'. + (syntax_check_name): New variable declaration. + * message.c (syntax_check_name): New variable. 
+ * msgl-cat.c (catenate_msgdomain_list): Propagate + mp->do_syntax_check. + * msgmerge.c (message_merge): Propagate ref->do_syntax_check. + * msgl-check.h (syntax_check_message_list): New declaration. + * msgl-check.c (syntax_check_ellipsis_unicode): New function. + (syntax_check_space_ellipsis): New function. + (syntax_check_quote_unicode): New function. + (syntax_check_message): New function. + (syntax_check_message_list): New function. + * read-catalog-abstract.h (po_parse_comment_special): Adjust + function declaration. + * read-catalog-abstract.c (po_parse_comment_special): Add new + argument SCP for syntax checking; all callers changed. + * read-catalog.h (DEFAULT_CATALOG_READER_TY): New field + 'do_syntax_check'. + * read-catalog.c (default_constructor): Initialize + this->do_syntax_check. + (default_copy_comment_state): Propagate this->do_syntax_check. + * xgettext.c (long_options): Add --check option. + (main): Handle --check option. + (usage): Document --check option. + (remember_a_message): Propagate do_syntax_check value. 
+ 2015-02-03 Daiki Ueno <[email protected]> msgfilter: Factor out quoted string handling diff --git a/gettext-tools/src/message.c b/gettext-tools/src/message.c index 586675f..2596887 100644 --- a/gettext-tools/src/message.c +++ b/gettext-tools/src/message.c @@ -104,6 +104,14 @@ possible_format_p (enum is_format is_format) } +const char *const syntax_check_name[NSYNTAXCHECKS] = +{ + /* sc_ellipsis_unicode */ "ellipsis-unicode", + /* sc_space_ellipsis */ "space-ellipsis", + /* sc_quote_unicode */ "quote-unicode" +}; + + message_ty * message_alloc (const char *msgctxt, const char *msgid, const char *msgid_plural, @@ -130,6 +138,8 @@ message_alloc (const char *msgctxt, mp->range.min = -1; mp->range.max = -1; mp->do_wrap = undecided; + for (i = 0; i < NSYNTAXCHECKS; i++) + mp->do_syntax_check[i] = undecided; mp->prev_msgctxt = NULL; mp->prev_msgid = NULL; mp->prev_msgid_plural = NULL; @@ -235,6 +245,8 @@ message_copy (message_ty *mp) result->is_format[i] = mp->is_format[i]; result->range = mp->range; result->do_wrap = mp->do_wrap; + for (i = 0; i < NSYNTAXCHECKS; i++) + result->do_syntax_check[i] = mp->do_syntax_check[i]; for (j = 0; j < mp->filepos_count; ++j) { lex_pos_ty *pp = &mp->filepos[j]; diff --git a/gettext-tools/src/message.h b/gettext-tools/src/message.h index bf2215a..8b9bc3f 100644 --- a/gettext-tools/src/message.h +++ b/gettext-tools/src/message.h @@ -114,6 +114,29 @@ enum is_wrap #endif +/* Kinds of syntax checks which apply to strings. */ +enum syntax_check_type +{ + sc_ellipsis_unicode, + sc_space_ellipsis, + sc_quote_unicode +}; +#define NSYNTAXCHECKS 3 +extern DLL_VARIABLE const char *const syntax_check_name[NSYNTAXCHECKS]; + +/* Is current msgid subject to a syntax check? 
*/ +#if 0 +enum is_syntax_check +{ + undecided, + yes, + no +}; +#else /* HACK - C's enum concept is so stupid */ +#define is_syntax_check is_format +#endif + + struct altstr { const char *msgstr; @@ -175,6 +198,9 @@ struct message_ty /* Do we want the string to be wrapped in the emitted PO file? */ enum is_wrap do_wrap; + /* Do we want to apply extra syntax checks on the string? */ + enum is_syntax_check do_syntax_check[NSYNTAXCHECKS]; + /* The prev_msgctxt, prev_msgid and prev_msgid_plural strings appearing before the message, if present. Generated by msgmerge. */ const char *prev_msgctxt; diff --git a/gettext-tools/src/msgl-cat.c b/gettext-tools/src/msgl-cat.c index 0bd58d4..8502a64 100644 --- a/gettext-tools/src/msgl-cat.c +++ b/gettext-tools/src/msgl-cat.c @@ -308,6 +308,8 @@ domain \"%s\" in input file '%s' doesn't contain a header entry with a charset s tmp->range.min = - INT_MAX; tmp->range.max = - INT_MAX; tmp->do_wrap = yes; /* may be set to no later */ + for (i = 0; i < NSYNTAXCHECKS; i++) + tmp->do_syntax_check[i] = undecided; /* may be set to yes/no later */ tmp->obsolete = true; /* may be set to false later */ tmp->alternative_count = 0; tmp->alternative = NULL; @@ -535,6 +537,8 @@ UTF-8 encoded from the beginning, i.e. already in your source code files.\n"), tmp->is_format[i] = mp->is_format[i]; tmp->range = mp->range; tmp->do_wrap = mp->do_wrap; + for (i = 0; i < NSYNTAXCHECKS; i++) + tmp->do_syntax_check[i] = mp->do_syntax_check[i]; tmp->prev_msgctxt = mp->prev_msgctxt; tmp->prev_msgid = mp->prev_msgid; tmp->prev_msgid_plural = mp->prev_msgid_plural; @@ -583,6 +587,9 @@ UTF-8 encoded from the beginning, i.e. already in your source code files.\n"), } if (tmp->do_wrap == undecided) tmp->do_wrap = mp->do_wrap; + for (i = 0; i < NSYNTAXCHECKS; i++) + if (tmp->do_syntax_check[i] == undecided) + tmp->do_syntax_check[i] = mp->do_syntax_check[i]; tmp->obsolete = false; } else @@ -635,6 +642,12 @@ UTF-8 encoded from the beginning, i.e. 
already in your source code files.\n"), } if (mp->do_wrap == no) tmp->do_wrap = no; + for (i = 0; i < NSYNTAXCHECKS; i++) + if (mp->do_syntax_check[i] == yes) + tmp->do_syntax_check[i] = yes; + else if (mp->do_syntax_check[i] == no + && tmp->do_syntax_check[i] == undecided) + tmp->do_syntax_check[i] = no; /* Don't fill tmp->prev_msgid in this case. */ if (!mp->obsolete) tmp->obsolete = false; diff --git a/gettext-tools/src/msgl-check.c b/gettext-tools/src/msgl-check.c index d6f4a3d..30f178d 100644 --- a/gettext-tools/src/msgl-check.c +++ b/gettext-tools/src/msgl-check.c @@ -40,6 +40,7 @@ #include "plural-table.h" #include "c-strstr.h" #include "message.h" +#include "quote.h" #include "gettext.h" #define _(str) gettext (str) @@ -912,3 +913,146 @@ check_message_list (message_list_ty *mlp, return seen_errors; } + + +static int +syntax_check_ellipsis_unicode (const message_ty *mp, const char *msgid) +{ + const char *cp; + int seen_errors = 0; + + for (cp = msgid; *cp != '\0'; cp++) + { + cp = strchrnul (cp, '\n'); + if (cp > msgid + 3 && memcmp (cp - 3, "...", 3) == 0) + { + po_xerror (PO_SEVERITY_ERROR, mp, NULL, 0, 0, false, + _("ASCII ellipsis ('...') instead of Unicode")); + seen_errors++; + } + } + + return seen_errors; +} + + +static int +syntax_check_space_ellipsis (const message_ty *mp, const char *msgid) +{ + /* Coincidentally the lengths of bytes are same for UTF-8 and ASCII + ellipsis. */ + const char *ellipsis + = mp->do_syntax_check[sc_ellipsis_unicode] == yes ? 
"\xE2\x80\xA6" : "..."; + const char *cp; + int seen_errors = 0; + + for (cp = msgid; *cp != '\0'; cp++) + { + cp = strchrnul (cp, '\n'); + if (cp > msgid + 4 && memcmp (cp - 3, ellipsis, 3) == 0 + && c_isspace (*(cp - 4))) + { + po_xerror (PO_SEVERITY_ERROR, mp, NULL, 0, 0, false, + _("space before ellipsis found in user visible strings")); + seen_errors++; + } + } + + return seen_errors; +} + + +struct callback_arg +{ + const message_ty *mp; + int seen_errors; +}; + +static void +syntax_check_quote_unicode_callback (char quote, const char *quoted, + size_t quoted_length, void *data) +{ + struct callback_arg *arg = data; + + switch (quote) + { + case '"': + po_xerror (PO_SEVERITY_ERROR, arg->mp, NULL, 0, 0, false, + _("ASCII double quote used instead of Unicode")); + arg->seen_errors++; + break; + + case '\'': + po_xerror (PO_SEVERITY_ERROR, arg->mp, NULL, 0, 0, false, + _("ASCII single quote used instead of Unicode")); + arg->seen_errors++; + break; + + default: + break; + } +} + +static int +syntax_check_quote_unicode (const message_ty *mp, const char *msgid) +{ + struct callback_arg arg; + + arg.mp = mp; + arg.seen_errors = 0; + + scan_quoted (msgid, strlen (msgid), + syntax_check_quote_unicode_callback, &arg); + + return arg.seen_errors; +} + + +typedef int (* syntax_check_function) (const message_ty *mp, const char *msgid); +static const syntax_check_function sc_funcs[NSYNTAXCHECKS] = +{ + syntax_check_ellipsis_unicode, + syntax_check_space_ellipsis, + syntax_check_quote_unicode +}; + +/* Perform all syntax checks on a non-obsolete message. + Return the number of errors that were seen. 
*/ +static int +syntax_check_message (const message_ty *mp) +{ + int seen_errors = 0; + int i; + + for (i = 0; i < NSYNTAXCHECKS; i++) + { + if (mp->do_syntax_check[i] == yes) + { + seen_errors += sc_funcs[i] (mp, mp->msgid); + if (mp->msgid_plural) + seen_errors += sc_funcs[i] (mp, mp->msgid_plural); + } + } + + return seen_errors; +} + + +/* Perform all syntax checks on a message list. + Return the number of errors that were seen. */ +int +syntax_check_message_list (message_list_ty *mlp) +{ + int seen_errors = 0; + size_t j; + + for (j = 0; j < mlp->nitems; j++) + { + message_ty *mp = mlp->item[j]; + + if (!is_header (mp)) + seen_errors += syntax_check_message (mp); + } + + return seen_errors; +} diff --git a/gettext-tools/src/msgl-check.h b/gettext-tools/src/msgl-check.h index f03300c..f9d9abd 100644 --- a/gettext-tools/src/msgl-check.h +++ b/gettext-tools/src/msgl-check.h @@ -28,7 +28,6 @@ extern "C" { #endif - /* Check the values returned by plural_eval. Signals the errors through po_xerror. Return the number of errors that were seen. @@ -60,6 +59,9 @@ extern int check_message_list (message_list_ty *mlp, int check_compatibility, int check_accelerators, char accelerator_char); +/* Perform all syntax checks on a message list. + Return the number of errors that were seen. */ +extern int syntax_check_message_list (message_list_ty *mlp); #ifdef __cplusplus } diff --git a/gettext-tools/src/msgmerge.c b/gettext-tools/src/msgmerge.c index 0415b2a..71d8962 100644 --- a/gettext-tools/src/msgmerge.c +++ b/gettext-tools/src/msgmerge.c @@ -1330,6 +1330,9 @@ message_merge (message_ty *def, message_ty *ref, bool force_fuzzy, result->do_wrap = ref->do_wrap; + for (i = 0; i < NSYNTAXCHECKS; i++) + result->do_syntax_check[i] = ref->do_syntax_check[i]; + /* Insert previous msgid, commented out with "#|". Do so only when --previous is specified, for backward compatibility. 
Since the "previous msgid" represents the original msgid that led to diff --git a/gettext-tools/src/read-catalog-abstract.c b/gettext-tools/src/read-catalog-abstract.c index d4e98ee..0817cd7 100644 --- a/gettext-tools/src/read-catalog-abstract.c +++ b/gettext-tools/src/read-catalog-abstract.c @@ -262,7 +262,8 @@ po_callback_comment_special (const char *s) void po_parse_comment_special (const char *s, bool *fuzzyp, enum is_format formatp[NFORMATS], - struct argument_range *rangep, enum is_wrap *wrapp) + struct argument_range *rangep, enum is_wrap *wrapp, + enum is_syntax_check scp[NSYNTAXCHECKS]) { size_t i; @@ -272,6 +273,8 @@ po_parse_comment_special (const char *s, rangep->min = -1; rangep->max = -1; *wrapp = undecided; + for (i = 0; i < NSYNTAXCHECKS; i++) + scp[i] = undecided; while (*s != '\0') { @@ -405,6 +408,36 @@ po_parse_comment_special (const char *s, continue; } + /* Accept syntax check description. */ + if (len >= 6 && memcmp (t + len - 6, "-check", 6) == 0) + { + const char *p; + size_t n; + enum is_syntax_check value; + + p = t; + n = len - 6; + + if (n >= 3 && memcmp (p, "no-", 3) == 0) + { + p += 3; + n -= 3; + value = no; + } + else + value = yes; + + for (i = 0; i < NSYNTAXCHECKS; i++) + if (strlen (syntax_check_name[i]) == n + && memcmp (syntax_check_name[i], p, n) == 0) + { + scp[i] = value; + break; + } + if (i < NSYNTAXCHECKS) + continue; + } + /* Unknown special comment marker. It may have been generated from a future xgettext version. Ignore it. 
*/ } diff --git a/gettext-tools/src/read-catalog-abstract.h b/gettext-tools/src/read-catalog-abstract.h index c3fc84f..367584b 100644 --- a/gettext-tools/src/read-catalog-abstract.h +++ b/gettext-tools/src/read-catalog-abstract.h @@ -184,7 +184,8 @@ extern void po_callback_comment_dispatcher (const char *s); extern void po_parse_comment_special (const char *s, bool *fuzzyp, enum is_format formatp[NFORMATS], struct argument_range *rangep, - enum is_wrap *wrapp); + enum is_wrap *wrapp, + enum is_syntax_check scp[NSYNTAXCHECKS]); #ifdef __cplusplus diff --git a/gettext-tools/src/read-catalog.c b/gettext-tools/src/read-catalog.c index 4642249..8c77df1 100644 --- a/gettext-tools/src/read-catalog.c +++ b/gettext-tools/src/read-catalog.c @@ -105,6 +105,8 @@ default_constructor (abstract_catalog_reader_ty *that) this->range.min = -1; this->range.max = -1; this->do_wrap = undecided; + for (i = 0; i < NSYNTAXCHECKS; i++) + this->do_syntax_check[i] = undecided; } @@ -172,6 +174,8 @@ default_copy_comment_state (default_catalog_reader_ty *this, message_ty *mp) mp->is_format[i] = this->is_format[i]; mp->range = this->range; mp->do_wrap = this->do_wrap; + for (i = 0; i < NSYNTAXCHECKS; i++) + mp->do_syntax_check[i] = this->do_syntax_check[i]; } @@ -205,6 +209,8 @@ default_reset_comment_state (default_catalog_reader_ty *this) this->range.min = -1; this->range.max = -1; this->do_wrap = undecided; + for (i = 0; i < NSYNTAXCHECKS; i++) + this->do_syntax_check[i] = undecided; } @@ -299,7 +305,7 @@ default_comment_special (abstract_catalog_reader_ty *that, const char *s) default_catalog_reader_ty *this = (default_catalog_reader_ty *) that; po_parse_comment_special (s, &this->is_fuzzy, this->is_format, &this->range, - &this->do_wrap); + &this->do_wrap, this->do_syntax_check); } diff --git a/gettext-tools/src/read-catalog.h b/gettext-tools/src/read-catalog.h index f567d78..74e0fd7 100644 --- a/gettext-tools/src/read-catalog.h +++ b/gettext-tools/src/read-catalog.h @@ -113,6 +113,7 @@ 
struct default_catalog_reader_class_ty enum is_format is_format[NFORMATS]; \ struct argument_range range; \ enum is_wrap do_wrap; \ + enum is_syntax_check do_syntax_check[NSYNTAXCHECKS]; \ typedef struct default_catalog_reader_ty default_catalog_reader_ty; struct default_catalog_reader_ty diff --git a/gettext-tools/src/xgettext.c b/gettext-tools/src/xgettext.c index f9156eb..12b3f54 100644 --- a/gettext-tools/src/xgettext.c +++ b/gettext-tools/src/xgettext.c @@ -58,6 +58,8 @@ #include "po-charset.h" #include "msgl-iconv.h" #include "msgl-ascii.h" +#include "msgl-check.h" +#include "po-xerror.h" #include "po-time.h" #include "write-catalog.h" #include "write-po.h" @@ -179,6 +181,9 @@ static bool recognize_format_kde; /* If true, recognize Boost format strings. */ static bool recognize_format_boost; +/* Syntax checks enabled by default. */ +static enum is_syntax_check default_syntax_check[NSYNTAXCHECKS]; + /* Canonicalized encoding name for all input files. */ const char *xgettext_global_source_encoding; @@ -204,6 +209,7 @@ static const struct option long_options[] = { "add-location", optional_argument, NULL, 'n' }, { "boost", no_argument, NULL, CHAR_MAX + 11 }, { "c++", no_argument, NULL, 'C' }, + { "check", required_argument, NULL, 'W' }, { "color", optional_argument, NULL, CHAR_MAX + 14 }, { "copyright-holder", required_argument, NULL, CHAR_MAX + 1 }, { "debug", no_argument, &do_debug, 1 }, @@ -346,7 +352,7 @@ main (int argc, char *argv[]) init_flag_table_vala (); while ((optchar = getopt_long (argc, argv, - "ac::Cd:D:eEf:Fhijk::l:L:m::M::no:p:sTVw:x:", + "ac::Cd:D:eEf:Fhijk::l:L:m::M::no:p:sTVw:W:x:", long_options, NULL)) != EOF) switch (optchar) { @@ -525,6 +531,17 @@ main (int argc, char *argv[]) } break; + case 'W': + if (strcmp (optarg, "ellipsis-unicode") == 0) + default_syntax_check[sc_ellipsis_unicode] = yes; + else if (strcmp (optarg, "space-ellipsis") == 0) + default_syntax_check[sc_space_ellipsis] = yes; + else if (strcmp (optarg, "quote-unicode") == 0) 
+ default_syntax_check[sc_quote_unicode] = yes; + else + error (EXIT_FAILURE, 0, _("syntax check '%s' unknown"), optarg); + break; + case 'x': read_exclusion_file (optarg); break; @@ -836,6 +853,24 @@ warning: file '%s' extension '%s' is unknown; will try C"), filename, extension) else if (sort_by_msgid) msgdomain_list_sort_by_msgid (mdlp); + /* Check syntax of messages. */ + { + int nerrors = 0; + + for (i = 0; i < mdlp->nitems; i++) + { + message_list_ty *mlp = mdlp->item[i]->messages; + nerrors += syntax_check_message_list (mlp); + } + + /* Exit with status 1 on any error. */ + if (nerrors > 0) + error (EXIT_FAILURE, 0, + ngettext ("found %d fatal error", "found %d fatal errors", + nerrors), + nerrors); + } + /* Write the PO file. */ msgdomain_list_print (mdlp, file_name, output_syntax, force_po, do_debug); @@ -921,6 +956,10 @@ Operation mode:\n")); preceding keyword lines in output file\n\ -c, --add-comments place all comment blocks preceding keyword lines\n\ in output file\n")); + printf (_("\ + -W, --check=NAME perform syntax check on messages\n\ + (ellipsis-unicode, space-ellipsis,\n\ + quote-unicode)\n")); printf ("\n"); printf (_("\ Language specific options:\n")); @@ -1644,8 +1683,8 @@ xgettext_record_flag (const char *optionstring) flag += 5; } - /* Unlike po_parse_comment_special(), we don't accept "fuzzy" or "wrap" - here - it has no sense. */ + /* Unlike po_parse_comment_special(), we don't accept "fuzzy", + "wrap", or "check" here - it makes no sense. 
*/ if (strlen (flag) >= 7 && memcmp (flag + strlen (flag) - 7, "-format", 7) == 0) { @@ -2238,6 +2277,7 @@ remember_a_message (message_list_ty *mlp, char *msgctxt, char *msgid, enum is_format is_format[NFORMATS]; struct argument_range range; enum is_wrap do_wrap; + enum is_syntax_check do_syntax_check[NSYNTAXCHECKS]; message_ty *mp; char *msgstr; size_t i; @@ -2264,6 +2304,8 @@ remember_a_message (message_list_ty *mlp, char *msgctxt, char *msgid, range.min = -1; range.max = -1; do_wrap = undecided; + for (i = 0; i < NSYNTAXCHECKS; i++) + do_syntax_check[i] = undecided; if (msgctxt != NULL) CONVERT_STRING (msgctxt, lc_string); @@ -2297,6 +2339,8 @@ meta information, not the empty string.\n"))); for (i = 0; i < NFORMATS; i++) is_format[i] = mp->is_format[i]; do_wrap = mp->do_wrap; + for (i = 0; i < NSYNTAXCHECKS; i++) + do_syntax_check[i] = mp->do_syntax_check[i]; } else { @@ -2376,12 +2420,13 @@ meta information, not the empty string.\n"))); enum is_format tmp_format[NFORMATS]; struct argument_range tmp_range; enum is_wrap tmp_wrap; + enum is_syntax_check tmp_syntax_check[NSYNTAXCHECKS]; bool interesting; t += strlen ("xgettext:"); po_parse_comment_special (t, &tmp_fuzzy, tmp_format, &tmp_range, - &tmp_wrap); + &tmp_wrap, tmp_syntax_check); interesting = false; for (i = 0; i < NFORMATS; i++) @@ -2400,6 +2445,12 @@ meta information, not the empty string.\n"))); do_wrap = tmp_wrap; interesting = true; } + for (i = 0; i < NSYNTAXCHECKS; i++) + if (tmp_syntax_check[i] != undecided) + { + do_syntax_check[i] = tmp_syntax_check[i]; + interesting = true; + } /* If the "xgettext:" marker was followed by an interesting keyword, and we updated our is_format/do_wrap variables, @@ -2525,6 +2576,14 @@ meta information, not the empty string.\n"))); mp->do_wrap = do_wrap == no ? no : yes; /* By default we wrap. */ + for (i = 0; i < NSYNTAXCHECKS; i++) + { + if (do_syntax_check[i] == undecided) + do_syntax_check[i] = default_syntax_check[i] == yes ? 
yes : no; + + mp->do_syntax_check[i] = do_syntax_check[i]; + } + /* Warn about the use of non-reorderable format strings when the programming language also provides reorderable format strings. */ warn_format_string (is_format, mp->msgid, pos, "msgid"); diff --git a/gettext-tools/tests/ChangeLog b/gettext-tools/tests/ChangeLog index eec1586..9223edd 100644 --- a/gettext-tools/tests/ChangeLog +++ b/gettext-tools/tests/ChangeLog @@ -1,3 +1,8 @@ +2015-02-04 Daiki Ueno <[email protected]> + + * xgettext-13: New file. + * Makefile.am (TESTS): Add new test. + 2015-01-29 Daiki Ueno <[email protected]> * msgexec-6: New file. diff --git a/gettext-tools/tests/Makefile.am b/gettext-tools/tests/Makefile.am index ee34655..32bc192 100644 --- a/gettext-tools/tests/Makefile.am +++ b/gettext-tools/tests/Makefile.am @@ -72,6 +72,7 @@ TESTS = gettext-1 gettext-2 gettext-3 gettext-4 gettext-5 gettext-6 gettext-7 \ recode-sr-latin-1 recode-sr-latin-2 \ xgettext-2 xgettext-3 xgettext-4 xgettext-5 xgettext-6 \ xgettext-7 xgettext-8 xgettext-9 xgettext-10 xgettext-11 xgettext-12 \ + xgettext-13 \ xgettext-awk-1 xgettext-awk-2 \ xgettext-c-2 xgettext-c-3 xgettext-c-4 xgettext-c-5 \ xgettext-c-6 xgettext-c-7 xgettext-c-8 xgettext-c-9 xgettext-c-10 \ diff --git a/gettext-tools/tests/xgettext-13 b/gettext-tools/tests/xgettext-13 new file mode 100755 index 0000000..32107f2 --- /dev/null +++ b/gettext-tools/tests/xgettext-13 @@ -0,0 +1,99 @@ +#!/bin/sh +. "${srcdir=.}/init.sh"; path_prepend_ . ../src + +# Test for --check option. 
+ +# --check=ellipsis-unicode +cat <<\EOF > xg-ellipsis-u.c +gettext ("this is a sentence..."); + +ngettext ("this is a sentence", "these are sentences...", 2); + +/* xgettext: no-ellipsis-unicode-check */ +gettext ("this is another sentence..."); + +gettext ("this is a multiline sentence\n" + "and the second line...\n" + "ends with an ellipsis\n"); +EOF + +: ${XGETTEXT=xgettext} +LANGUAGE= LC_ALL=C ${XGETTEXT} --omit-header --add-comments --check=ellipsis-unicode -d xg-ellipsis-u.tmp xg-ellipsis-u.c 2>xg-ellipsis-u.err + +test `grep -c 'ASCII ellipsis' xg-ellipsis-u.err` = 3 || exit 1 + +# --check=space-ellipsis +cat <<\EOF > xg-space-e.c +gettext ("this is a sentence ..."); + +/* xgettext: no-space-ellipsis-check, no-ellipsis-unicode-check */ +gettext ("this is another sentence ..."); + +gettext ("this is a multiline sentence\n" + "and the second line ...\n" + "ends with an ellipsis\n"); +EOF + +LANGUAGE= LC_ALL=C ${XGETTEXT} --omit-header --add-comments --check=space-ellipsis -d xg-space-e.tmp xg-space-e.c 2>xg-space-e.err + +test `grep -c 'space before ellipsis' xg-space-e.err` = 2 || exit 1 + +# Combination of --check=space-ellipsis and --check=ellipsis-unicode. 
+LANGUAGE= LC_ALL=C ${XGETTEXT} --omit-header --add-comments --check=ellipsis-unicode --check=space-ellipsis -d xg-space-eu.tmp xg-space-e.c 2>xg-space-eu.err + +test `grep -c 'ASCII ellipsis' xg-space-eu.err` = 2 || exit 1 + +# --check=quote-unicode +cat <<\EOF > xg-quote-u.c +gettext ("\"double quoted\""); + +/* xgettext: no-quote-unicode-check */ +gettext ("\"double quoted but ignored\""); + +gettext ("double quoted but empty \"\""); + +gettext ("\"\" double quoted but empty"); + +gettext ("\"foo\" \"bar\" \"baz\""); + +gettext ("'single quoted'"); + +/* xgettext: no-quote-unicode-check */ +gettext ("'single quoted but ignored'"); + +gettext ("'foo' 'bar' 'baz'"); + +gettext ("prefix'single quoted without surrounding spaces'suffix"); + +gettext ("prefix 'single quoted with surrounding spaces' suffix"); + +gettext ("single quoted with apostrophe, empty '' "); + +gettext ("'single quoted at the beginning of string' "); + +gettext (" 'single quoted at the end of string'"); + +gettext ("line 1\n" +"'single quoted at the beginning of line' \n" +"line 3"); + +gettext ("line 1\n" +" 'single quoted at the end of line'\n" +"line 3"); + +gettext ("`single quoted with grave'"); + +/* xgettext: no-quote-unicode-check */ +gettext ("`single quoted with grave but ignored'"); + +gettext ("single quoted with grave, empty `'"); + +gettext ("`' single quoted with grave, empty"); + +gettext ("`double grave`"); +EOF + +LANGUAGE= LC_ALL=C ${XGETTEXT} --omit-header --add-comments --check=quote-unicode -d xg-quote-u.tmp xg-quote-u.c 2>xg-quote-u.err + +test `grep -c 'ASCII double quote' xg-quote-u.err` = 4 || exit 1 +test `grep -c 'ASCII single quote' xg-quote-u.err` = 12 || exit 1 -- 2.1.0
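
For readers who want to experiment with the flag syntax outside the gettext tree, here is a minimal standalone sketch of how the `[no-]NAME-check` flags following an `xgettext:` marker could be interpreted. It loosely follows the logic added to po_parse_comment_special in the patch above, but it is not the patch's code: the names `check_state`, `check_names`, `apply_check_flag` and `parse_check_flags` are hypothetical, and the separator handling (commas and whitespace) is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's is_syntax_check values.  */
enum check_state { CHECK_UNDECIDED, CHECK_YES, CHECK_NO };

#define NCHECKS 3
/* Same names and order as syntax_check_name[] in the patch.  */
static const char *const check_names[NCHECKS] =
{
  "ellipsis-unicode",
  "space-ellipsis",
  "quote-unicode"
};

/* Interpret one flag of the form "[no-]NAME-check" of length LEN.
   On a match, record yes/no in STATES and return 1; otherwise return 0.  */
static int
apply_check_flag (const char *flag, size_t len,
                  enum check_state states[NCHECKS])
{
  enum check_state value = CHECK_YES;
  size_t i;

  assert (flag != NULL);

  /* The flag must end in "-check".  */
  if (len < 6 || memcmp (flag + len - 6, "-check", 6) != 0)
    return 0;
  len -= 6;

  /* A leading "no-" negates the flag.  */
  if (len >= 3 && memcmp (flag, "no-", 3) == 0)
    {
      flag += 3;
      len -= 3;
      value = CHECK_NO;
    }

  /* What remains must be exactly one of the known check names.  */
  for (i = 0; i < NCHECKS; i++)
    if (strlen (check_names[i]) == len
        && memcmp (check_names[i], flag, len) == 0)
      {
        states[i] = value;
        return 1;
      }
  return 0;
}

/* Parse the text after an "xgettext:" marker.  Flags are assumed to be
   separated by commas and/or whitespace.  Every entry of STATES starts
   as undecided; unknown flags are skipped.  Return the number of
   syntax check flags that were recognized.  */
static int
parse_check_flags (const char *s, enum check_state states[NCHECKS])
{
  int recognized = 0;
  size_t i;

  for (i = 0; i < NCHECKS; i++)
    states[i] = CHECK_UNDECIDED;

  while (*s != '\0')
    {
      size_t len;

      s += strspn (s, ", \t\n");    /* skip separators */
      len = strcspn (s, ", \t\n");  /* length of the next flag */
      if (len == 0)
        break;
      recognized += apply_check_flag (s, len, states);
      s += len;
    }
  return recognized;
}
```

Calling `parse_check_flags` on the text of a comment such as `no-space-ellipsis-check, ellipsis-unicode-check` leaves the quote-unicode entry undecided, which corresponds to the case where xgettext falls back to the defaults selected with --check.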
