Re: translation of searchtmpl/search.wml
On Tue, Jan 21, 2003 at 08:34:18AM +0900, Tomohiro KUBOTA wrote:
> Not yet. However, I believe that --with-extra-charsets is a necessary
> condition, though I am not sure that it is a necessary and sufficient
> condition (but I expect so). Please read:

I have also received your bug reports about the extra charsets, the next
version will take that into account. I've got some more problematic bugs
with mnogosearch that need fixing first.

> I thought about testing it but I don't have enough time to study database,
> because I am entirely new on database. (Also, I could not test another
> etc/index..htm problem because search.cgi in klecker:~/public_html
> didn't work well and I don't know why. It may be because of apache's
> configuration.)

UID problems? Postgresql complains you're not user www-data?
If something is in ~/public_html then it runs as that user, not www-data

 - Craig
--
Craig Small VK2XLZ GnuPG:1C1B D893 1418 2AF4 45EE 95CB C76C E5AC 12CA DFA5
Eye-Net Consulting http://www.enc.com.au/ <[EMAIL PROTECTED]>
MIEEE <[EMAIL PROTECTED]> Debian developer <[EMAIL PROTECTED]>
Re: translation of searchtmpl/search.wml
Hi,

From: [EMAIL PROTECTED] (Craig Small)
Subject: Re: translation of searchtmpl/search.wml
Date: Tue, 21 Jan 2003 09:13:57 +1100

> It was before you replied to that mail (or maybe another?). Essentially
> I was incorrect. mnogosearch *stores* its indexing in UTF-8 but needs
> all those flags to do the indexing.
>
> Have you tried a little bit of indexing of Japanese pages to see if it
> does seem to behave itself?

Not yet. However, I believe that --with-extra-charsets is a necessary
condition, though I am not sure that it is a necessary and sufficient
condition (but I expect so). Please read:

http://www.mnogosearch.org/board/message.php?id=6350

Note that Japanese (and Chinese) have "The Problem 2" (no spaces
between words) and need the newer version of mnoGoSearch with ChaSen,
as you wrote. However, I expect Korean will be fully fixed by
--with-extra-charsets. I also expect that Japanese and Chinese
words which occasionally appear independently (i.e., separated by
spaces or HTML tags) will be searchable.

I thought about testing it but I don't have enough time to study database,
because I am entirely new on database. (Also, I could not test another
etc/index..htm problem because search.cgi in klecker:~/public_html
didn't work well and I don't know why. It may be because of apache's
configuration.)

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
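To illustrate the "no spaces between words" problem described above: a naive indexer that splits on whitespace sees an unsegmented Japanese sentence as one token, while a Korean sentence splits at its spaces. This is only a Python sketch of the idea. It is not how mnoGoSearch tokenizes, and the function name and sample sentences are made up for the example.

```python
def whitespace_tokens(text: str) -> list:
    """Split text on runs of whitespace, as a naive indexer might."""
    return text.split()


# Hypothetical sample sentences, chosen only for illustration.
korean = "데비안 검색 페이지"        # words separated by spaces
japanese = "これは検索のテストです"    # no spaces between words

print(len(whitespace_tokens(korean)))    # several tokens
print(len(whitespace_tokens(japanese)))  # a single token
```

A morphological analyser such as ChaSen is what would supply the missing word boundaries in the Japanese case.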
Re: translation of searchtmpl/search.wml
On Sun, Jan 19, 2003 at 11:45:17PM +0900, Tomohiro KUBOTA wrote:
>
> PS. The content negotiation
>   http://search.debian.org/new/index.en.cgi
>   http://search.debian.org/new/index.fr.cgi
> seems not work well. Though I am afraid I am wrong, how about
> renaming /org/search.debian.org/etc/search..htm into
> renaming /org/search.debian.org/etc/index..htm ? The source
> code of mnoGoSearch seems to substitute ".cgi" with ".htm" to
> search the configuration file (src/search.c).

That's right, the /new/ stuff was to test other things. The files
eventually become index..htm

The problem is you need to, currently, manually edit them before they
are placed in the etc directory.

 - Craig
--
Craig Small VK2XLZ GnuPG:1C1B D893 1418 2AF4 45EE 95CB C76C E5AC 12CA DFA5
Eye-Net Consulting http://www.enc.com.au/ <[EMAIL PROTECTED]>
MIEEE <[EMAIL PROTECTED]> Debian developer <[EMAIL PROTECTED]>
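The lookup described in the quoted text, where the template name is derived by replacing ".cgi" with ".htm", might look roughly like the Python sketch below. It shows the idea only: the function name `template_for` and its arguments are made up, and the real logic lives in mnoGoSearch's src/search.c, which is not reproduced here.

```python
import posixpath


def template_for(cgi_path: str, etc_dir: str) -> str:
    """Map a CGI script name to a template file by swapping .cgi for .htm."""
    name = posixpath.basename(cgi_path)
    stem, ext = posixpath.splitext(name)
    if ext != ".cgi":
        raise ValueError(f"not a .cgi script: {name}")
    return posixpath.join(etc_dir, stem + ".htm")


# The script name and etc directory are the ones mentioned in the thread.
print(template_for("/new/index.en.cgi", "/org/search.debian.org/etc"))
```

Under that assumption, a script named index.en.cgi would never find a template whose name starts with "search", which is why the rename was suggested.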
Re: translation of searchtmpl/search.wml
On Mon, Jan 20, 2003 at 06:22:08PM +0900, Tomohiro KUBOTA wrote:
>
> BTW, do you have any idea why Craig thinks like the following mail?
> Maybe I am missing something
> http://lists.debian.org/debian-www/2003/debian-www-200301/msg00271.html

It was before you replied to that mail (or maybe another?). Essentially
I was incorrect. mnogosearch *stores* its indexing in UTF-8 but needs
all those flags to do the indexing.

Have you tried a little bit of indexing of Japanese pages to see if it
does seem to behave itself?

 - Craig
--
Craig Small VK2XLZ GnuPG:1C1B D893 1418 2AF4 45EE 95CB C76C E5AC 12CA DFA5
Eye-Net Consulting http://www.enc.com.au/ <[EMAIL PROTECTED]>
MIEEE <[EMAIL PROTECTED]> Debian developer <[EMAIL PROTECTED]>
Re: translation of searchtmpl/search.wml
On Mon, Jan 20, 2003 at 06:22:08PM +0900, Tomohiro KUBOTA wrote:
> Hi,
>
> From: [EMAIL PROTECTED] (Denis Barbier)
> Subject: Re: translation of searchtmpl/search.wml
> Date: Sun, 19 Jan 2003 21:46:34 +0100
>
> > At first glance it sounds very good, but I am not sure this is the way
> > to go, because some strings are not handled by gettext, e.g. see
> > Catalan strings in webwml/english/template/debian/ctime.wml
> > There are also several Perl variables in date.pot, which will be
> > displayed according to current locale, and not UTF-8.
> > We could certainly play with CUR_LOCALE, but a simpler solution is
> > to post-process HTML files with iconv and change their charset
> > field in tags, see attached patch.
>
> I think your idea is better than mine. I also checked your patch
> works well, i.e., builds translated pages in UTF-8.

All right, it is committed.

> BTW, do you have any idea why Craig thinks like the following mail?
> Maybe I am missing something
> http://lists.debian.org/debian-www/2003/debian-www-200301/msg00271.html

No idea, it seems quite clear.
You could compile and install it in a private area, then index some files
to check that it works as expected. When you are sure it works, explain
again why it has to be recompiled, by providing examples with your own
version.

Denis
Re: translation of searchtmpl/search.wml
Hi,

From: [EMAIL PROTECTED] (Denis Barbier)
Subject: Re: translation of searchtmpl/search.wml
Date: Sun, 19 Jan 2003 21:46:34 +0100

> At first glance it sounds very good, but I am not sure this is the way
> to go, because some strings are not handled by gettext, e.g. see
> Catalan strings in webwml/english/template/debian/ctime.wml
> There are also several Perl variables in date.pot, which will be
> displayed according to current locale, and not UTF-8.
> We could certainly play with CUR_LOCALE, but a simpler solution is
> to post-process HTML files with iconv and change their charset
> field in tags, see attached patch.

I think your idea is better than mine. I also checked your patch
works well, i.e., builds translated pages in UTF-8.

BTW, do you have any idea why Craig thinks like the following mail?
Maybe I am missing something
http://lists.debian.org/debian-www/2003/debian-www-200301/msg00271.html

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
Re: translation of searchtmpl/search.wml
On Sun, Jan 19, 2003 at 11:45:17PM +0900, Tomohiro KUBOTA wrote:
[..]
> I found that target encoding of gettext is defined in
> webwml/english/template/debian/common_tags.wml . Thus, the best way
> is to redefine CHARSET_WML and CHARSET variables before it. (These
> variables are defined in webwml//.wmlrc files.) The patch
> attached to this mail includes this modification.
> It is also needed that translated search.wml files to be written in
> UTF-8. The following patch includes a note on this point.

At first glance it sounds very good, but I am not sure this is the way
to go, because some strings are not handled by gettext, e.g. see
Catalan strings in webwml/english/template/debian/ctime.wml
There are also several Perl variables in date.pot, which will be
displayed according to current locale, and not UTF-8.
We could certainly play with CUR_LOCALE, but a simpler solution is
to post-process HTML files with iconv and change their charset
field in tags, see attached patch.

> PS. The content negotiation
>   http://search.debian.org/new/index.en.cgi
>   http://search.debian.org/new/index.fr.cgi
> seems not work well. Though I am afraid I am wrong, how about
> renaming /org/search.debian.org/etc/search..htm into
> renaming /org/search.debian.org/etc/index..htm ? The source
> code of mnoGoSearch seems to substitute ".cgi" with ".htm" to
> search the configuration file (src/search.c).

No idea about this one.
Denis

Index: english/searchtmpl/Makefile
===
RCS file: /cvs/webwml/webwml/english/searchtmpl/Makefile,v
retrieving revision 1.5
diff -u -r1.5 Makefile
--- english/searchtmpl/Makefile 2 Nov 2002 23:36:01 - 1.5
+++ english/searchtmpl/Makefile 19 Jan 2003 20:45:33 -
@@ -10,8 +10,13 @@

 include $(WMLBASE)/Make.lang

+all:: search-convert
 search.$(LANGUAGE).html: search.wml $(ENGLISHSRCDIR)/searchtmpl/search.data \
   $(ENGLISHSRCDIR)/searchtmpl/search.def $(TEMPLDIR)/common_translation.wml \
   $(TEMPLDIR)/basic.wml $(TEMPLDIR)/languages.wml $(TEMPLDIR)/footer.wml \
   $(GETTEXTDEP)
+
+search-convert: search.$(LANGUAGE).html
+	@c=`grep '^' $? | sed -e 's/^/\1/'`; \
+	iconv -f $$c -t UTF-8 $? | sed -e 's///' > $?.tmp && mv $?.tmp $?
Index: english/searchtmpl/search.data
===
RCS file: /cvs/webwml/webwml/english/searchtmpl/search.data,v
retrieving revision 1.30
diff -u -r1.30 search.data
--- english/searchtmpl/search.data 30 Dec 2002 03:26:24 - 1.30
+++ english/searchtmpl/search.data 19 Jan 2003 20:45:33 -
@@ -62,7 +62,6 @@

 -->

-$(CHARSET=UTF-8) $(CHARSET_WML=UTF-8)
 #use wml::debian::common_translation
 HOME="http://www.debian.org"; $(title=)
 #use wml::debian::languages
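For readers who want to see the post-processing idea outside of make, the Python sketch below does roughly what the search-convert rule is meant to do: read the charset declared in the page, re-encode the page as UTF-8, and update the declaration. The regular expression, the function name `to_utf8`, and the sample page are assumptions made for this illustration and are not taken from the patch. It also assumes the source charset is ASCII-compatible, so that the declaration can be found in the raw bytes.

```python
import re

# Assumed shape of the declaration: charset=NAME inside a <meta> tag.
CHARSET_RE = re.compile(rb'(<meta[^>]*charset=)([A-Za-z0-9_.:-]+)', re.IGNORECASE)


def to_utf8(page: bytes) -> bytes:
    """Re-encode an HTML page as UTF-8 and update its declared charset."""
    match = CHARSET_RE.search(page)
    if match is None:
        raise ValueError("no charset declaration found")
    source_charset = match.group(2).decode("ascii")
    # Equivalent in spirit to: iconv -f <source_charset> -t UTF-8
    converted = page.decode(source_charset).encode("utf-8")
    # Rewrite only the first declaration so it matches the new encoding.
    return CHARSET_RE.sub(rb'\g<1>UTF-8', converted, count=1)


# Hypothetical sample page encoded in ISO-8859-1.
sample = (
    '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">\n'
    '<p>caf\u00e9</p>\n'
).encode("iso-8859-1")

print(to_utf8(sample).decode("utf-8"))
```

Converting after the page is built sidesteps the strings that gettext does not handle, which is the reason given above for preferring this approach over redefining CHARSET.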