Re: let g_warn_if_fail replace g_assert
On Wed, 17/10/2007 at 11:56 +0200, Tim Janik wrote:
> - add g_warn_if_fail (condition); which produces a critical warning about failing assertions but, contrary to g_assert, returns.

If it's called g_warn_if_fail() I would expect a g_warning(), not a g_critical().

--
Marco Barisione
http://www.barisione.org/
_______________________________________________
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list
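To make the difference with an aborting assertion concrete, here is a minimal sketch in plain C of a check that reports the failure and then lets the caller continue. SKETCH_WARN_IF_FAIL, sketch_warning_count and sketch_safe_divide are hypothetical names invented for this illustration; this is not the GLib implementation, and the log level question raised above is left open.

```c
#include <stdio.h>

/* Hypothetical counter so that callers can observe a failed check. */
static int sketch_warning_count = 0;

/* Hypothetical stand-in for the proposed macro: report the failed
 * condition on stderr and carry on, instead of aborting like assert(). */
#define SKETCH_WARN_IF_FAIL(cond)                            \
  do {                                                       \
    if (!(cond)) {                                           \
      sketch_warning_count++;                                \
      fprintf(stderr, "%s:%d: check failed: %s\n",           \
              __FILE__, __LINE__, #cond);                    \
    }                                                        \
  } while (0)

/* Returns a / b, or 0 (after warning) when b is zero. */
static int sketch_safe_divide(int a, int b)
{
  SKETCH_WARN_IF_FAIL(b != 0);
  if (b == 0)
    return 0;   /* execution reaches this point: the check returned */
  return a / b;
}
```

Calling sketch_safe_divide(1, 0) prints one message and returns 0; with assert() in the same place the process would be aborted instead.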
Re: turning g_assert* into warnings
On Fri, 12/10/2007 at 15:16 +0200, Tim Janik wrote:
> please reread my reasoning about G_DISABLE_ASSERT, there already is no behavior of g_assert() you could rely on. (and some distributions do build their binaries with G_DISABLE_ASSERT and/or G_DISABLE_CHECKS defined).

Which distributions? I mean excluding Gentoo and other distros that let the user choose how to build everything.

--
Marco Barisione
http://www.barisione.org/
Re: Performance implications of GRegex structure
On Thu, 15/03/2007 at 10:18 -0400, Owen Taylor wrote:
> But looking over the header file, there is something that puzzles me about the way that it's set up: there is no distinction between a pattern/regular expression object and a match/matcher object.

The internal code in GRegex was deeply modified, but the API is quite similar to the original one written by Scott Wimer and then modified by Matthias Clasen, so I kept a single GRegex object, although with lots of doubts.

In the end I decided to keep a single object because I prefer this approach when using languages without a garbage collector, and because QRegExp (the equivalent class in Qt) is a single object.

This matter was brought up on the mailing list and in Bugzilla, but only Havoc Pennington and Yevgen Muntyan expressed an opinion, saying that they prefer a single object.

BTW, if you want I can split GRegex into two separate objects.

--
Marco Barisione
http://www.barisione.org/
Re: GRegex(win32) : 500 tests passed, 3 failed
On Thu, 15/03/2007 at 18:41 +0100, Hans Breuer wrote:
> with only small modifications I was able to compile GRegex with msvc, thanks for providing an almost working makefile.msc ;-)
> [...]
> But now for the question: are these 3 failed specific to my build so I should investigate them further?

It's my fault, I wrote makefile.msc (without testing it) before the release of PCRE 7.

PCRE 6.x recognizes exactly one newline convention, chosen from \n, \r or \r\n. PCRE 7.x added the ability to match any newline sequence, so I changed the default value from 10 (\n) to -1 (PCRE_NEWLINE_ANY) in Makefile.am but not in makefile.msc.

Sorry :)

--
Marco Barisione
http://www.barisione.org/
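A minimal sketch in plain C of why the two defaults give different test results: the same text splits into a different number of lines depending on which newline convention is in effect. The names (sketch_newline_len, sketch_count_lines, SKETCH_NEWLINE_LF, SKETCH_NEWLINE_ANY) are hypothetical and this is not PCRE code; the "any" mode here only knows \n, \r and \r\n, which is a simplification of what PCRE_NEWLINE_ANY accepts.

```c
#include <stddef.h>

enum sketch_newline_mode {
  SKETCH_NEWLINE_LF,   /* only "\n" ends a line */
  SKETCH_NEWLINE_ANY   /* "\n", "\r" or "\r\n" ends a line (simplified) */
};

/* Returns the length in bytes of the newline sequence starting at s,
 * or 0 if s does not start with a newline under the given mode. */
static size_t sketch_newline_len(const char *s, enum sketch_newline_mode mode)
{
  if (s[0] == '\n')
    return 1;
  if (mode == SKETCH_NEWLINE_ANY && s[0] == '\r')
    return s[1] == '\n' ? 2 : 1;
  return 0;
}

/* Counts the lines in text; a trailing fragment without a newline
 * counts as a line too. */
static size_t sketch_count_lines(const char *text, enum sketch_newline_mode mode)
{
  size_t lines = 0;
  int in_line = 0;

  while (*text != '\0') {
    size_t n = sketch_newline_len(text, mode);
    if (n > 0) {
      lines++;
      in_line = 0;
      text += n;
    } else {
      in_line = 1;
      text++;
    }
  }
  return lines + (in_line ? 1 : 0);
}
```

With the text "a\r\nb\rc" the LF-only mode sees two lines, while the "any" mode sees three, so a pattern using ^ or $ in multiline mode can match in one build and fail in the other.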
Re: GRegex
On Sat, 28/10/2006 at 19:35 +0200, Murray Cumming wrote:
> If it's possible, it would be nice to avoid making it a GObject just to add easy reference counting. That tends to restrict how it can be wrapped by language bindings for whom automatic memory management is not the default.

It can't be a GObject because GRegex will be in libglib.

> I don't know exactly how it might be done in C (it's easy in C++), but I would hope that there's some way to reference-count anything without forcing the object itself to do the reference counting.

What do you mean? GRegex handles reference counting in the same way as other structures in GLib.

--
Marco Barisione
http://www.barisione.org/
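For readers who have not seen the pattern, this is roughly what reference counting on a plain C structure looks like, without any object system. It is only a sketch: SketchRegex, sketch_regex_new, sketch_regex_ref, sketch_regex_unref and sketch_live_objects are hypothetical names, the structure just stores the pattern string, and the counter is not atomic, so unlike a real library implementation this version is not thread-safe.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
  int   ref_count;   /* number of owners; the struct is freed at 0 */
  char *pattern;     /* private copy of the pattern text */
} SketchRegex;

/* Hypothetical counter, only here so the lifetime can be observed. */
static int sketch_live_objects = 0;

/* Creates a new object with a reference count of 1. */
static SketchRegex *sketch_regex_new(const char *pattern)
{
  SketchRegex *re = malloc(sizeof *re);
  if (re == NULL)
    return NULL;
  re->pattern = malloc(strlen(pattern) + 1);
  if (re->pattern == NULL) {
    free(re);
    return NULL;
  }
  strcpy(re->pattern, pattern);
  re->ref_count = 1;
  sketch_live_objects++;
  return re;
}

/* Adds an owner and returns the same pointer for convenience. */
static SketchRegex *sketch_regex_ref(SketchRegex *re)
{
  re->ref_count++;
  return re;
}

/* Drops an owner; the last one to leave frees the object. */
static void sketch_regex_unref(SketchRegex *re)
{
  if (--re->ref_count == 0) {
    free(re->pattern);
    free(re);
    sketch_live_objects--;
  }
}
```

Two components that both keep the regex call sketch_regex_ref() once each and sketch_regex_unref() when done; neither needs to know whether the other still holds it.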
Re: GRegex
On 10/24/06, Behdad Esfahbod wrote:
> On Tue, 2006-10-24 at 16:05 -0400, Marco Barisione wrote:
> This is broken. It should err at configure time, not run time. The user shouldn't need to check the output of g_regex_new for failures, just like any other thing we do with glib.

I have just uploaded a new patch that corrects this and some other problems. I kept the run-time check as well: it's useful when cross-compiling or if the installed PCRE library is updated later.

--
Marco Barisione
http://www.barisione.org/
Re: GRegex
On 10/24/06, Marco Barisione wrote:
> As discussed some time ago [1] I propose to add a PCRE wrapper to GLib. Bug #50075 [2] contains a patch that adds it as a separate libgregex. The documentation of the new API is at [3] (yes, there are some unresolved problems with gtk-doc).
>
> Owen Taylor would prefer to have GRegex directly in the main GLib library:
> [...]

To give you an idea of the size of libgregex and libpcre, these are the sizes of the stripped .so files on my computer:

    libgregex with internal PCRE      138 KB
    libgregex with system PCRE         24 KB
    libpcre with Unicode support      125 KB
    libpcre without Unicode support    96 KB

--
Marco Barisione
http://www.barisione.org/
GRegex
As discussed some time ago [1] I propose to add a PCRE wrapper to GLib. Bug #50075 [2] contains a patch that adds it as a separate libgregex. The documentation of the new API is at [3] (yes, there are some unresolved problems with gtk-doc).

Owen Taylor would prefer to have GRegex directly in the main GLib library:

(17:38:55) owen: is the latest plan for gregex really a separate library?
(17:39:45) mclasen: owen: you would prefer it folded in ?
(17:40:16) owen: mclasen: I think it makes tons more sense folded in. A regular expression facility is most useful if you can just use it when you need it
(17:40:36) owen: mclasen: And on the desktop, having it folded in is purely a performance win
(17:41:36) owen: if there is an embedded problem (how big is it anyways?) then a --without-regex configure option would be better
(17:43:19) mclasen: owen: you are probably right

What are your ideas?

I would like to add to the documentation a short and simple tutorial on regular expressions and the GRegex API. Does anyone know of something good (and with a compatible license) to copy?

[1] http://mail.gnome.org/archives/gtk-devel-list/2006-July/msg00099.html
[2] http://bugzilla.gnome.org/show_bug.cgi?id=50075
[3] http://www.barisione.org/gregex/

--
Marco Barisione
http://www.barisione.org/
Re: GRegex
On Tue, 24/10/2006 at 13:17 -0400, Dominic Lachowicz wrote:
> 1) Please don't name variables 'string', as there may be a conflict with C++'s std::string

I think they were called string in the original version of GRegex written by Scott Wimer in 1999. PCRE calls the string "subject". However, it's not a problem with C++; this program is valid:

    #include <string>
    #include <iostream>
    using namespace std;

    int main ()
    {
      string string = "hello";
      cout << string << endl;
    }

> 2) I noticed that there are g_regex_ref/unref() methods. Why did you choose to do this, rather than subclass GObject? You would also then have easy GObject-style accessors for the regex's pattern and match_options.

The original plan was to include GRegex directly in GLib, so it cannot depend on GObject. This could be changed if we decide to put GRegex in a separate library. However, is it really necessary to have a real object?

I added _ref and _unref because the only two programs currently using my modified version of EggRegex are GtkSourceView and MooEdit, and both need reference counting for regular expressions. In GLib there are other structures that are reference counted without being objects, such as GHashTable, GAsyncQueue, GIOChannel and others.

> 3) Should there be a GRegexMatch object too? For instance, at least Python and Java have a notion of a read-only Pattern and a Match Set. Your design combines the two into a single GRegex object. Having the pattern be read-only gets around your thread-safety gotcha comment in the docs.

I know this, but using separate objects is easier in a language with a garbage collector. The regex class in Qt uses the same approach as GRegex.

> 4) Python's search() and match() methods have a start position and an end position argument, while your match_full() has only a start position argument. Is there a reason for this? Could it be implemented?

It has a length argument.

> 5) I didn't fully investigate, but Java and Python have a concept of search vs. match with slightly different semantics. Is this semantic distinction easily expressible in your API?
>
> http://docs.python.org/lib/re-objects.html

In Python, match matches only at the start of the string, while search matches at any position. You can get the match behavior by adding a ^ at the beginning of the pattern, or by passing the compile option G_REGEX_ANCHORED or the match option G_REGEX_MATCH_ANCHORED. I prefer to have only one function, as I have always found this distinction in Python a bit confusing.

> 6) GRegex requires that PCRE be built with UTF-8 support, which some existing installations aren't. For reference, Gnumeric and Goffice get around this by including a copy of PCRE in their distribution and statically link it in. How do you ensure that GRegex finds a version of PCRE compiled with UTF-8 support?

The default for GRegex is to use its internal copy of PCRE. This copy is automatically patched to use GLib for Unicode and memory management. If you prefer, you can pass --enable-system-pcre to use the system-supplied library but, if that library is compiled without UTF-8 support, g_regex_new fails.

--
Marco Barisione
http://www.barisione.org/
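The "one function plus an anchored flag" idea from point 5 can be sketched in plain C. To stay self-contained this uses a literal string instead of a real regular expression, and every name (sketch_find, SKETCH_MATCH_ANCHORED) is hypothetical; it only illustrates the semantics, not the GRegex API.

```c
#include <string.h>

/* Hypothetical match flag, playing the role of an "anchored" option. */
enum { SKETCH_MATCH_ANCHORED = 1 << 0 };

/* Returns the offset of the first occurrence of literal in subject,
 * or -1 if there is none. With SKETCH_MATCH_ANCHORED only offset 0 is
 * tried, which is the behavior of Python's match(); without the flag
 * every offset is tried, which is the behavior of Python's search(). */
static long sketch_find(const char *literal, const char *subject,
                        unsigned flags)
{
  const char *hit;

  if (flags & SKETCH_MATCH_ANCHORED)
    return strncmp(subject, literal, strlen(literal)) == 0 ? 0 : -1;

  hit = strstr(subject, literal);
  return hit != NULL ? (long) (hit - subject) : -1;
}
```

Searching for "bar" in "foobar" gives offset 3 without the flag and -1 with it, so a single entry point covers both use cases.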
Re: EggRegex
Marco Barisione wrote:
> My version of EggRegex is at http://techn.ocracy.org/eggregex/ and a copy of the documentation is at http://www.barisione.org/eggregex/

And a tar.gz generated by make dist is at http://www.barisione.org/eggregex/eggregex-0.1.tar.gz

In the last few days I made some changes; IMHO EggRegex is now decent and usable. Is there someone who can review the code (and the docs, as I'm not a native English speaker)?

--
Marco Barisione
http://www.barisione.org/
Re: EggRegex
Marco Barisione wrote:
> Can someone take a look at pcre/ucptable.c, pcre/ucp.h and pcre/pcre_ucp_searchfuncs.c?

Now the internal PCRE uses glib for Unicode properties.

There is a problem: PCRE allows script names in \p{}, so you can match an Arabic character using \p{Arabic}. But AFAIK glib does not know about scripts. gucharmap handles this internally, but I can't copy the code because, as far as I know, it's under GPL and not LGPL.

However, I think that the best solution is to add this directly to glib:

    typedef enum
    {
      G_UNICODE_SCRIPT_ARABIC,
      G_UNICODE_SCRIPT_ARMENIAN,
      ...
      G_UNICODE_SCRIPT_UGARITIC
    } GUnicodeScript;

    /* returns the script of c */
    GUnicodeScript g_unichar_get_script (gunichar c);

    /* returns the (translated?) name of the script */
    const gchar *g_unichar_get_script_name (GUnicodeScript script);

--
Marco Barisione
http://www.barisione.org/
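Just to show the shape such a lookup could take, here is a sketch in plain C using a sorted range table and a binary search. Everything here is hypothetical (the Sketch names, the three-entry table), and the ranges are whole Unicode blocks used as an approximation: real script data is finer grained than blocks, so this is not how a correct implementation would be populated.

```c
#include <stddef.h>

typedef enum {
  SKETCH_SCRIPT_UNKNOWN,
  SKETCH_SCRIPT_ARABIC,
  SKETCH_SCRIPT_ARMENIAN,
  SKETCH_SCRIPT_UGARITIC
} SketchScript;

typedef struct {
  unsigned long first;   /* first code point of the range */
  unsigned long last;    /* last code point of the range (inclusive) */
  SketchScript  script;
} SketchRange;

/* Sorted by code point. Block ranges are an approximation of scripts. */
static const SketchRange sketch_ranges[] = {
  { 0x0530,  0x058F,  SKETCH_SCRIPT_ARMENIAN },
  { 0x0600,  0x06FF,  SKETCH_SCRIPT_ARABIC   },
  { 0x10380, 0x1039F, SKETCH_SCRIPT_UGARITIC }
};

/* Returns the script of c, found by binary search over the table. */
static SketchScript sketch_get_script(unsigned long c)
{
  size_t lo = 0;
  size_t hi = sizeof sketch_ranges / sizeof sketch_ranges[0];

  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (c < sketch_ranges[mid].first)
      hi = mid;
    else if (c > sketch_ranges[mid].last)
      lo = mid + 1;
    else
      return sketch_ranges[mid].script;
  }
  return SKETCH_SCRIPT_UNKNOWN;
}

/* Returns the (untranslated) name of the script. */
static const char *sketch_get_script_name(SketchScript script)
{
  switch (script) {
    case SKETCH_SCRIPT_ARABIC:   return "Arabic";
    case SKETCH_SCRIPT_ARMENIAN: return "Armenian";
    case SKETCH_SCRIPT_UGARITIC: return "Ugaritic";
    default:                     return "Unknown";
  }
}
```

A \p{Arabic} test in the regex engine would then reduce to comparing sketch_get_script(c) with SKETCH_SCRIPT_ARABIC.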
Re: EggRegex
Marco Barisione wrote:
> gucharmap handles this internally but I can't copy the code because, as far as I know, it's under GPL and not LGPL.

I was wrong, it's in the library and not in the app, so it's LGPLed.

What should I do with the scripts? Obviously eggregex cannot depend on libgucharmap.

--
Marco Barisione
http://www.barisione.org/
Re: EggRegex
Owen Taylor wrote:
> Considering that a large amount of the size of GLib is Unicode tables, it's almost certainly better that a few apps have two copies of the PCRE code than all processes have two copies of the Unicode tables.

With the internal copy, if there is a security bug in PCRE, distros have to update two libraries instead of just libpcre.

Can someone take a look at pcre/ucptable.c, pcre/ucp.h and pcre/pcre_ucp_searchfuncs.c? The files are here:

http://techn.ocracy.org/eggregex/?f=03897669abe3;file=pcre/ucp.h;style=raw
http://techn.ocracy.org/eggregex/?f=3ad939693cb3;file=pcre/ucptable.c;style=raw
http://techn.ocracy.org/eggregex/?f=b567294355b0;file=pcre/pcre_ucp_searchfuncs.c;style=raw

I need some advice to do this.

--
Marco Barisione
http://www.barisione.org/
Re: EggRegex
Matthias Clasen wrote:
> When I was last looking at regular expressions for GLib (which resulted in the current eggregex code), the first decision was to go for Perl regular expression, rather than posix. That naturally leads to PCRE. The main gripe with PCRE was (and is) that it had (and probably still has) relatively limited Unicode support.

The version of eggregex in libegg uses the three-year-old PCRE 4.5. PCRE 6.7 has better support for Unicode. Now PCRE:
- handles UTF-8
- knows that, in a caseless match, à matches À
- has generic character types for non-ASCII characters, so \p{Lt} matches a title case letter, \p{Sc} matches a currency symbol, and so on

Extended properties such as Greek or InMusicalSymbols are not supported.

> And it brings its own implementation of the necessary Unicode data, instead of using the GLib one.

Yes, but it shouldn't be too difficult to port PCRE to use glib for Unicode. I can't do it myself because my knowledge of Unicode is very limited. However, this would mean that we should always use the internal PCRE instead of the system-supplied one.

--
Marco Barisione
http://www.barisione.org/
EggRegex
Hi,

GtkSourceView 2 will have a new syntax highlighting engine that requires a more powerful and faster regular expression library. This is why I worked on EggRegex (a wrapper library around PCRE) to fix bugs and add new features.

My version of EggRegex is at http://techn.ocracy.org/eggregex/ and a copy of the documentation is at http://www.barisione.org/eggregex/

EggRegex was originally written by Scott Wimer to be included in glib, renamed to GRegex. However, including it in glib would mean either adding a dependency on libpcre or linking it statically, which increases the size of glib (a stripped libeggregex is 144 KB on my computer).

So my question is: what should be the future of EggRegex? If it will not be included in glib, what do you think about having a separate libgregex? If it becomes a separate library, can I use the name GRegex or should I choose another name outside the G namespace?

--
Marco Barisione
http://www.barisione.org/
Re: EggRegex
Behdad Esfahbod wrote:
> Last time I checked, PCRE's didn't use Unicode Character Database to classify characters and so is a poor choice for a highlighting engine and definitely suboptimal in GNOME.

It supports UTF-8 and Unicode properties. Don't ask me more about this because I know very little about Unicode :)

> I believe GNOME should use the GNU regexp engine.

It's slower and doesn't support some patterns supported by PCRE.

PCRE benefits:
- it's faster
- has more advanced regular expressions
- supports partial matching (matching the pattern ^ab against the string "a" fails, but PCRE knows that there is a partial match, so adding more characters may lead to a match), see http://www.barisione.org/eggregex/eggregex-eggregex.html#egg-regex-is-partial-match
- DFA matching (matching .* against "abc" you get "a", "ab" and "abc")

--
Marco Barisione
http://www.barisione.org/
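The partial matching idea can be sketched in plain C for the simplest case, an anchored literal such as ^ab. This is not PCRE code: SketchResult and sketch_match_prefix are hypothetical names, and treating an empty subject as "no match" rather than "partial" is a choice made for this sketch.

```c
#include <string.h>

typedef enum {
  SKETCH_NO_MATCH,       /* more input can never produce a match */
  SKETCH_PARTIAL_MATCH,  /* input so far agrees, but it ended too early */
  SKETCH_FULL_MATCH      /* the whole literal was found at offset 0 */
} SketchResult;

/* Matches literal at the start of subject, like the pattern ^ab does
 * for the literal "ab", and additionally reports a partial match. */
static SketchResult sketch_match_prefix(const char *literal,
                                        const char *subject)
{
  size_t lit_len = strlen(literal);
  size_t sub_len = strlen(subject);

  if (sub_len >= lit_len)
    return strncmp(subject, literal, lit_len) == 0
             ? SKETCH_FULL_MATCH : SKETCH_NO_MATCH;

  /* The subject is shorter than the literal: it is a partial match
   * only if it is non-empty and every character agrees so far. */
  if (sub_len > 0 && strncmp(subject, literal, sub_len) == 0)
    return SKETCH_PARTIAL_MATCH;

  return SKETCH_NO_MATCH;
}
```

This is what makes the feature useful for an editor: with "a" typed so far the caller can keep waiting for input, while with "b" it can reject immediately.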
Re: EggRegex
Simos Xenitellis wrote:
> Per http://www.pcre.org/pcre.txt
>
> The current implementation of PCRE (release 6.x) corresponds approximately with Perl 5.8, including support for UTF-8 encoded strings and Unicode general category properties. However, this support has to be explicitly enabled; it is not the default.

Today most distributions ship a copy of PCRE that supports UTF-8 and Unicode properties. However, you can pass --enable-internal-pcre to configure to statically link an internal copy of PCRE.

If eggregex links to a PCRE version without Unicode support, egg_regex_new() prints an error message and fails.

--
Marco Barisione
http://www.barisione.org/
Re: EggRegex
Tristan Van Berkom wrote:
>> If eggregex links to a pcre version without unicode egg_regex_new() prints an error message and fails.
>
> Are you suggesting that highlighting be a site-dependent feature ? i.e. g_regexp_supported() ... similar to g_thread_supported() ?

No, I'm saying that if you link against a PCRE that does not support Unicode you will see immediately that something is not working, so you can switch to the internal copy of PCRE.

--
Marco Barisione
http://www.barisione.org/
Re: Gtk+ print support - request for feedback
Alexander Larsson wrote:
> locale_data = localeconv ();
> decimal_point = locale_data->decimal_point;
> ...
> val = strtod (nptr, &fail_pos);

What happens if another thread calls setlocale() after localeconv() but before strtod()?

--
Marco Barisione
Re: Announcing: Project Ridley
Jonathan Blandford wrote:
> The primary goal of Project Ridley is to cut down on the number of problem libraries that are part of the GNOME platform. We propose to do this by moving functionality into GTK+, wherever it makes sense.

What about EggRegex?

--
Marco Barisione