Paul J. Lucas has proposed merging lp:~zorba-coders/zorba/feature-ft_module into lp:zorba.

Requested reviews:
  Paul J. Lucas (paul-lucas)
Related bugs:
  Bug #944795 in Zorba: "XQDoc doesn't handle & in URLs"
  https://bugs.launchpad.net/zorba/+bug/944795

For more details, see:
https://code.launchpad.net/~zorba-coders/zorba/feature-ft_module/+merge/105421

Documentation tweaks.
-- 
https://code.launchpad.net/~zorba-coders/zorba/feature-ft_module/+merge/105421
Your team Zorba Coders is subscribed to branch lp:zorba.
=== modified file 'doc/zorba/ft_tokenizer.dox'
--- doc/zorba/ft_tokenizer.dox	2012-05-03 12:31:51 +0000
+++ doc/zorba/ft_tokenizer.dox	2012-05-10 23:53:19 +0000
@@ -10,9 +10,12 @@
 
 \section ft_tokenizer_tokization Tokenization
 
-Using the
-<a href="http://site.icu-project.org/">ICU library</a>,
-Zorba's implementation of tokenization
+By default,
+Zorba uses the
+<a href="http://site.icu-project.org/">ICU library</a>
+for tokenization.
+For Roman alphabets,
+Zorba (ICU)
 considers only alpha-numeric sequences of characters to be part of a token;
 whitespace and punctuation characters are not and separate tokens.
 
@@ -117,7 +120,9 @@
   <tr>
     <td>\c lang</td>
     <td>
-      The language of the string.
+      The
+      <a href="http://www.w3.org/TR/xmlschema-2/#language">language</a>
+      of the string.
     </td>
   </tr>
   <tr>
@@ -132,7 +137,7 @@
   <tr>
     <td>\c callback</td>
     <td>
-      The Callback to call once per token.
+      The \c Callback to call once per token.
     </td>
   </tr>
   <tr>
@@ -232,7 +237,9 @@
   <tr>
     <td>\c languages</td>
     <td>
-      The list of languages supported by the tokenizer.
+      The list of
+      <a href="http://www.w3.org/TR/xmlschema-2/#language">languages</a>
+      supported by the tokenizer.
     </td>
   </tr>
   <tr>
@@ -247,7 +254,9 @@
 
 In addition to a \c Tokenizer,
 you must also implement a \c TokenizerProvider
-that, given a language, provides a \c Tokenizer for that language:
+that,
+given a <a href="http://www.w3.org/TR/xmlschema-2/#language">language</a>,
+provides a \c Tokenizer for that language:
 
 \code
 class TokenizerProvider {
@@ -262,7 +271,11 @@
 <table>
   <tr>
     <td>\c lang</td>
-    <td>The language to tokenize.</td>
+    <td>
+      The
+      <a href="http://www.w3.org/TR/xmlschema-2/#language">language</a>
+      to tokenize.
+    </td>
   </tr>
   <tr>
     <td>\c num</td>
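
For reviewers who want a concrete picture of the token rule the revised paragraph describes (alphanumeric runs are tokens; whitespace and punctuation only separate them), here is a minimal sketch. It is an illustration under stated assumptions, not Zorba's or ICU's implementation: it handles ASCII only, and the names `tokenize_ascii` and `collect_tokens` are hypothetical and do not appear in the Zorba API. The callback parameter loosely mirrors the "call once per token" behaviour documented for \c callback.

```cpp
#include <cctype>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper (not part of Zorba or ICU): invokes on_token once for
// each maximal run of ASCII alphanumeric characters in s. Every other byte
// (whitespace, punctuation, non-ASCII) only separates tokens.
void tokenize_ascii(const std::string& s,
                    const std::function<void(const std::string&)>& on_token) {
  std::string current;
  for (unsigned char ch : s) {
    if (std::isalnum(ch)) {
      current.push_back(static_cast<char>(ch));  // still inside a token
    } else if (!current.empty()) {
      on_token(current);                         // separator ends the token
      current.clear();
    }
  }
  if (!current.empty()) {
    on_token(current);                           // flush a trailing token
  }
}

// Convenience wrapper (also hypothetical) that gathers tokens into a vector.
std::vector<std::string> collect_tokens(const std::string& s) {
  std::vector<std::string> out;
  tokenize_ascii(s, [&out](const std::string& t) { out.push_back(t); });
  return out;
}
```

With this sketch, `collect_tokens("Hello, world! 42")` yields the three tokens `Hello`, `world`, and `42`; the comma, exclamation mark, and spaces produce no tokens of their own. A real ICU-based tokenizer applies Unicode word-break rules and is considerably more involved.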
-- 
Mailing list: https://launchpad.net/~zorba-coders
Post to     : zorba-coders@lists.launchpad.net
Unsubscribe : https://launchpad.net/~zorba-coders
More help   : https://help.launchpad.net/ListHelp