Re: Ned Batchelder's hyphenate
On Jul 13, 10:54 am, Christopher Lenz <[EMAIL PROTECTED]> wrote:
> Um, you do realize that you don't normally need any kind of
> hyphenation in web applications?

All depends on what you mean by 'web applications'. This is definitely a good feature for mobile devices. Also, many of the current default filters have no use whatsoever for web services: wordwrap, center, and rjust have no purpose there either. I would argue that hyphenation is more useful than those, but again I am biased.

At this point it looks like I will start it as a separate app, and then if enough people find it useful, revisit the matter. The problem is, as I have stated before, that due to NDA/NC I cannot work on the natural language or translation aspects of it, only integration. The pay job wins.

-Doug

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---
Re: Ned Batchelder's hyphenate
On 11.07.2007 at 07:49, [EMAIL PROTECTED] wrote:
> I welcome any and all help!!! I don't see this as anything crucial or
> time sensitive.
> For my part I want this feature for a project, and at worst can
> include the code as an app there. I just believe it should be part of
> the django distribution instead of some third party addon.

Um, you do realize that you don't normally need any kind of hyphenation in web applications?

Personally, while I love that Ned's coded this thing because I happen to need it for one of my current projects, I don't see it as a need common enough for inclusion in Django, or even a Django contrib thing.

If you want to build on Ned's work and extend it so that it's properly internationalized and all that, just make it a separate project. Most of the time, the projects where you may need hyphenation aren't even web applications, so including the functionality in Django would be rather inconvenient for those. And doing this stuff as a separate project does not in any way preclude using it in a Django app.

Just my 2 cents.

Cheers,
Chris
--
Christopher Lenz
cmlenz at gmx.de
http://www.cmlenz.net/
Re: Ned Batchelder's hyphenate
On Jul 10, 11:12 pm, Malcolm Tredinnick <[EMAIL PROTECTED]> wrote:
> Don't decide that this hinges on "fully internationalize humanize or it
> shouldn't go there". Incremental changes are good.

Agreed.

> > There are four reasons why I feel it is better to have this as part of
> > the core:
> > 1. Hyphenation is a media standard and crucial for non-html templates.
> > Sites which want to generate printable PDFs of say conference
> > programs, or in a standard news media style will want this as much as
> > they want pluralize, widthratio, rjust, and center. This is more than
> > a template filter, but is a text utility.
>
> Not seeing why that does or doesn't support your argument. It's not
> something you need all the time (more appropriate to print layout than
> HTML, as a rule), so including it by default, given that HTML output is
> the common case, isn't a requirement (and saves on memory usage when it's
> not included, for example). Having it in contrib/ puts it exactly one
> import away.

wordwrap, center, rjust, and widthratio (for most uses) are more appropriate for print layout than HTML. The proper way to implement these in HTML is with CSS, yet they are all part of the existing default filters. When it comes to the templates, this is just a specialized form of wordwrap.

If the argument is that this is more for printed forms, is not of real general use for the most common HTML generation, takes up memory, and adds bloat, then I question the inclusion of these other filters, django.utils.text.wrap, and other utilities as well. At least that was my point (and admittedly a weak one :-)

The astute will notice that I left off my fourth argument (it was just too weak).

> > I will need help with the internationalization parts. I do not
> > have enough experience with the i18n system to make the proper
> > architectural decisions.
>
> I was thinking about this a bit yesterday. It shouldn't be too hard. I'm
> a few days away from implementing anything, since I'm not going to
> instantly bump this to the top of my list, but it's a solvable problem.

I welcome any and all help!!! I don't see this as anything crucial or time sensitive. For my part I want this feature for a project, and at worst can include the code as an app there. I just believe it should be part of the django distribution instead of some third party addon.

-Doug
Re: Ned Batchelder's hyphenate
On Tue, 2007-07-10 at 18:26 +, [EMAIL PROTECTED] wrote:
> On Jul 9, 10:48 pm, "Jacob Kaplan-Moss" <[EMAIL PROTECTED]>
> wrote:
> > Maybe an addition to django.contrib.humanize?
>
> If we decide to only support English, then I am fine with including
> this as part of django.contrib.humanize.
> If we decide to properly internationalize humanize, then I am fine
> with that as well. (you don't use commas in German, you use periods
> for instance).

Don't decide that this hinges on "fully internationalize humanize or it shouldn't go there". Incremental changes are good.

> There are four reasons why I feel it is better to have this as part of
> the core:
> 1. Hyphenation is a media standard and crucial for non-html templates.
> Sites which want to generate printable PDFs of say conference
> programs, or in a standard news media style will want this as much as
> they want pluralize, widthratio, rjust, and center. This is more than
> a template filter, but is a text utility.

Not seeing why that does or doesn't support your argument. It's not something you need all the time (more appropriate to print layout than HTML, as a rule), so including it by default, given that HTML output is the common case, isn't a requirement (and saves on memory usage when it's not included, for example). Having it in contrib/ puts it exactly one import away.

> 2. reduce duplication of code and confusion
> The actual code being duplicated is extremely minimal, but having two
> text wrappers in very different locations is confusing to both
> developers and users. For template filters, it would be better to have
> them documented together.
>
> 3. Internationalization
> To properly implement this we need to integrate with the
> internationalization code and have the core language developers help
> with maintaining the hyphenation rules. It does not feel DRY to have a
> separate internationalization system in humanize, and it does not seem
> right to have sections of the core only used by a contrib module
> (though this is done for admin).

This is based on a mistaken assumption, it looks like. Everything in contrib is already supported by translators. However, there's another consideration here, too: it's highly unlikely that a normal translator will be able to maintain the hyphenation databases. They are very technical data structures.

> In the end if those wiser than I decide it should be in humanize I
> have no problem changing the patch and writing up the doc and unit
> tests. I will need help with the internationalization parts. I do not
> have enough experience with the i18n system to make the proper
> architectural decisions.

I was thinking about this a bit yesterday. It shouldn't be too hard. I'm a few days away from implementing anything, since I'm not going to instantly bump this to the top of my list, but it's a solvable problem.

For my money, if we include this, putting it in contrib somewhere feels better. It will also make maintenance easier, since we can give Ned (or designated sock puppet) commit access to that part of the tree for ongoing bug fixes.

Regards,
Malcolm

--
The early bird may get the worm, but the second mouse gets the cheese.
http://www.pointy-stick.com/blog/
Re: Ned Batchelder's hyphenate
On Jul 9, 10:48 pm, "Jacob Kaplan-Moss" <[EMAIL PROTECTED]> wrote:
> Maybe an addition to django.contrib.humanize?

If we decide to only support English, then I am fine with including this as part of django.contrib.humanize. If we decide to properly internationalize humanize, then I am fine with that as well. (you don't use commas in German, you use periods for instance).

There are four reasons why I feel it is better to have this as part of the core:

1. Hyphenation is a media standard and crucial for non-html templates.
Sites which want to generate printable PDFs of say conference programs, or in a standard news media style will want this as much as they want pluralize, widthratio, rjust, and center. This is more than a template filter, but is a text utility.

2. reduce duplication of code and confusion
The actual code being duplicated is extremely minimal, but having two text wrappers in very different locations is confusing to both developers and users. For template filters, it would be better to have them documented together.

3. Internationalization
To properly implement this we need to integrate with the internationalization code and have the core language developers help with maintaining the hyphenation rules. It does not feel DRY to have a separate internationalization system in humanize, and it does not seem right to have sections of the core only used by a contrib module (though this is done for admin).

In the end if those wiser than I decide it should be in humanize I have no problem changing the patch and writing up the doc and unit tests. I will need help with the internationalization parts. I do not have enough experience with the i18n system to make the proper architectural decisions.

For translated text, the wrapping should use the hyphenation rules of the language selected by the locale middleware. For text which has not been translated, it should use the rules for the native LANGUAGE_CODE. Not sure how to get that working.

-Doug
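One way to picture the language selection Doug describes is a plain lookup with a fallback. This is only a sketch under assumptions: `pick_patterns` and the `patterns_by_language` mapping are hypothetical names, not part of Django or of the patch, and the caller is assumed to pass the active language for translated text and the project's LANGUAGE_CODE as the default.

```python
def pick_patterns(language_code, default_code, patterns_by_language):
    """Return hyphenation patterns for the best matching language, or None.

    language_code: language of the text being wrapped (assumed to come from
        whatever the locale middleware selected); may be None.
    default_code: the project's default language (assumed to be LANGUAGE_CODE).
    patterns_by_language: hypothetical mapping of language code -> pattern data.
    """
    for code in (language_code, default_code):
        if not code:
            continue
        code = code.lower()
        # Try the full code first ("de-at"), then the base language ("de").
        for candidate in (code, code.split("-")[0]):
            if candidate in patterns_by_language:
                return patterns_by_language[candidate]
    # No data for either language: the caller should wrap without hyphenating.
    return None


# Placeholder pattern data, for illustration only.
available = {"en": "<english patterns>", "de": "<german patterns>"}
print(pick_patterns("de-at", "en-us", available))
print(pick_patterns("fr", "en-us", available))
```

Returning None rather than falling back to English rules keeps a missing data set from producing wrong breaks in another language.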
Re: Ned Batchelder's hyphenate
Since the algorithm is identical to the one used by TeX, the hyphenation data can be taken from there as well. I used a TeX distribution to get the latest patterns for English to include in the module. I installed MiKTeX, and dug around in the tex/generic/hyphen directory to find them. There are also French and German patterns in that distro, and there may be other hyphenation data sets in other repositories on the web; I haven't looked.

--Ned.

[EMAIL PROTECTED] wrote:
> On further reflection, there is a huge internationalization issue
> here. The hyphenation rules and data driven exceptions are English
> specific. Some will work (minimally) for other languages, but are not
> good enough. Proper integration will be required, and language
> developers will need to have more knowledge about this corner domain.
> Due to my NDA/NC I cannot work on that part of it, but I do have a
> patch almost ready for django.utils.text.wordwrap to take an optional
> boolean argument to do word hyphenation.
>
> Thankfully it is data driven and getting the data from the .po should
> not be too difficult. The problem will be getting the initial data.
>
> On Jul 9, 10:30 pm, "[EMAIL PROTECTED]"
> <[EMAIL PROTECTED]> wrote:
>
>> Ned just posted the code for the tabblo hyphenate filter in the public
>> domain. This should be added as a builtin django filter with proper
>> attribution. I don't think wordwrap should use it by default, and
>> optional arguments don't work. I was thinking of just calling it
>> 'hyphenate' or 'hyphenatedwordwrap'.
>>
>> http://www.nedbatchelder.com/code/modules/hyphenate.html
>>
>> Thoughts?

--
Ned Batchelder, http://nedbatchelder.com
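For readers unfamiliar with the TeX approach mentioned above, here is a minimal sketch of pattern-based hyphenation. It is not Ned's module: the function names are hypothetical, the exception list is left out, and the nine patterns are a tiny illustrative set written in TeX pattern notation rather than the real English data. The `left_min` and `right_min` defaults are assumed to mirror TeX's usual minimums for English.

```python
import re


def parse_pattern(pattern):
    """Split a TeX-style pattern such as 'hy3ph' into letters and gap weights.

    weights[i] is the weight of the gap just before letters[i];
    the last entry is the gap after the final letter.
    """
    letters = re.sub(r"[0-9]", "", pattern)
    weights = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            weights[pos] = int(ch)
        else:
            pos += 1
    return letters, weights


def hyphenate_word(word, patterns, left_min=2, right_min=3):
    """Return the word cut into pieces at the permitted break points."""
    padded = "." + word.lower() + "."  # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)
    for letters, weights in patterns:
        start = padded.find(letters)
        while start != -1:
            # Keep the highest weight seen for every gap the pattern covers.
            for offset, weight in enumerate(weights):
                scores[start + offset] = max(scores[start + offset], weight)
            start = padded.find(letters, start + 1)
    pieces, last = [], 0
    for i in range(left_min, len(word) - right_min + 1):
        # scores[i + 1] is the gap before word[i] (shifted by the leading dot).
        # An odd weight allows a break, an even weight forbids it.
        if scores[i + 1] % 2 == 1:
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return pieces


# Illustrative patterns only; a real data set has thousands of entries.
PATTERNS = [parse_pattern(p) for p in
            "hy3ph he2n hena4 hen5at 1na n2at 1tio 2io o2n".split()]

print("-".join(hyphenate_word("hyphenation", PATTERNS)))  # hy-phen-ation
```

Because the algorithm itself is language-neutral, supporting another language in this scheme means swapping the pattern list, which is why the question of where the data comes from matters so much in this thread.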
Re: Ned Batchelder's hyphenate
Todd, good to meet a fellow nerd: I also have the five-volume hardcover set. My code is implemented from appendix H of volume 1 (or is it volume A?).

--Ned.

Todd O'Bryan wrote:
> On Tue, 2007-07-10 at 02:54 +, [EMAIL PROTECTED] wrote:
>
>> On further reflection, there is a huge internationalization issue
>> here. The hyphenation rules and data driven exceptions are English
>> specific. Some will work (minimally) for other languages, but are not
>> good enough. Proper integration will be required, and language
>> developers will need to have more knowledge about this corner domain.
>> Due to my NDA/NC I cannot work on that part of it, but I do have a
>> patch almost ready for django.utils.text.wordwrap to take an optional
>> boolean argument to do word hyphenation.
>
> I seem to remember that Knuth did a pretty amazing job with hyphenation
> in TeX, most of it was algorithmic, and there were hyphenation engines
> for at least a few languages.
>
> I'll have to dig out my copy of The TeXbook to look (yes, I'm one of
> those nerds who has the five-volume box set of TeX and MetaFont books),
> but this may be something that somebody's already done a really good job
> on.
>
> Todd

--
Ned Batchelder, http://nedbatchelder.com
Re: Ned Batchelder's hyphenate
On 7/10/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> Ticket with initial patch made: http://code.djangoproject.com/ticket/4821
>
> It still needs documentation and unit testing (and
> internationalization) but it is a start. Will try to get to the doc
> and test this weekend.

+1 from me. However, like Jacob said, I think it should be in contrib.humanize, not utils.text.

Yours,
Russ %-)
Re: Ned Batchelder's hyphenate
Ticket with initial patch made: http://code.djangoproject.com/ticket/4821

It still needs documentation and unit testing (and internationalization) but it is a start. Will try to get to the doc and test this weekend.

-Doug

On Jul 9, 10:30 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> Ned just posted the code for the tabblo hyphenate filter in the public
> domain. This should be added as a builtin django filter with proper
> attribution. I don't think wordwrap should use it by default, and
> optional arguments don't work. I was thinking of just calling it
> 'hyphenate' or 'hyphenatedwordwrap'.
>
> http://www.nedbatchelder.com/code/modules/hyphenate.html
>
> Thoughts?
Re: Ned Batchelder's hyphenate
On Jul 9, 11:59 pm, "Tom Tobin" <[EMAIL PROTECTED]> wrote:
> I'm not sure what you mean by this; "public domain" means anyone can
> do pretty much whatever they want with it, without restriction.

I mean he wanted his code in the public domain with working data, so that restricted him to data which was also in the public domain, and not LaTeX data under a BSD or other license.

NOTE: this is just me making wild assumptions at this point.
Re: Ned Batchelder's hyphenate
On 7/9/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> The code Ned put up contains data from the public domain and
> was most likely restricted due to that.

I'm not sure what you mean by this; "public domain" means anyone can do pretty much whatever they want with it, without restriction.
Re: Ned Batchelder's hyphenate
On Jul 9, 11:07 pm, "Todd O'Bryan" <[EMAIL PROTECTED]> wrote:
> I seem to remember that Knuth did a pretty amazing job with hyphenation
> in TeX, most of it was algorithmic, and there were hyphenation engines
> for at least a few languages.

Ned's implementation is taken directly from this, and I hope you are correct. The code Ned put up contains data from the public domain and was most likely restricted due to that.
Re: Ned Batchelder's hyphenate
On Tue, 2007-07-10 at 02:54 +, [EMAIL PROTECTED] wrote:
> On further reflection, there is a huge internationalization issue
> here. The hyphenation rules and data driven exceptions are English
> specific. Some will work (minimally) for other languages, but are not
> good enough. Proper integration will be required, and language
> developers will need to have more knowledge about this corner domain.
> Due to my NDA/NC I cannot work on that part of it, but I do have a
> patch almost ready for django.utils.text.wordwrap to take an optional
> boolean argument to do word hyphenation.

I seem to remember that Knuth did a pretty amazing job with hyphenation in TeX, most of it was algorithmic, and there were hyphenation engines for at least a few languages.

I'll have to dig out my copy of The TeXbook to look (yes, I'm one of those nerds who has the five-volume box set of TeX and MetaFont books), but this may be something that somebody's already done a really good job on.

Todd
Re: Ned Batchelder's hyphenate
On further reflection, there is a huge internationalization issue here. The hyphenation rules and data driven exceptions are English specific. Some will work (minimally) for other languages, but are not good enough. Proper integration will be required, and language developers will need to have more knowledge about this corner domain. Due to my NDA/NC I cannot work on that part of it, but I do have a patch almost ready for django.utils.text.wordwrap to take an optional boolean argument to do word hyphenation.

Thankfully it is data driven and getting the data from the .po should not be too difficult. The problem will be getting the initial data.

On Jul 9, 10:30 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> Ned just posted the code for the tabblo hyphenate filter in the public
> domain. This should be added as a builtin django filter with proper
> attribution. I don't think wordwrap should use it by default, and
> optional arguments don't work. I was thinking of just calling it
> 'hyphenate' or 'hyphenatedwordwrap'.
>
> http://www.nedbatchelder.com/code/modules/hyphenate.html
>
> Thoughts?
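To make the idea of a word wrapper with an optional hyphenation flag concrete, here is a rough sketch. It is not the patch from the ticket and not Django's own wrapping code: `wrap`, its `hyphenate` and `split_word` parameters, and the `toy_split` helper are hypothetical names chosen for illustration. The sketch splits each word at most once and lets over-long words overflow, as a simple greedy wrapper would.

```python
def wrap(text, width, hyphenate=False, split_word=None):
    """Greedy word wrap; optionally break a word to fill the current line.

    split_word is assumed to be a callable that returns the word cut into
    pieces at its permitted break points, e.g. ["hy", "phen", "ation"].
    """
    lines = []
    current = ""
    for word in text.split():
        sep = " " if current else ""
        if len(current) + len(sep) + len(word) <= width:
            current += sep + word
            continue
        if hyphenate and split_word is not None:
            # Room left on this line, keeping one column for the hyphen.
            room = width - len(current) - len(sep) - 1
            pieces = split_word(word)
            # Take the longest leading run of pieces that still fits.
            for n in range(len(pieces) - 1, 0, -1):
                head = "".join(pieces[:n])
                if len(head) <= room:
                    current += sep + head + "-"
                    word = word[len(head):]
                    break
        if current:
            lines.append(current)
        # Whatever is left of the word starts the next line.
        current = word
    if current:
        lines.append(current)
    return "\n".join(lines)


def toy_split(word):
    # Hypothetical stand-in for a real hyphenator: it knows one word only.
    return ["hy", "phen", "ation"] if word == "hyphenation" else [word]


print(wrap("simple hyphenation demo", 12))
print(wrap("simple hyphenation demo", 12, hyphenate=True, split_word=toy_split))
```

Keeping the flag off by default leaves existing callers unaffected, and passing the splitter in as a callable keeps the language-specific data out of the wrapping code.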
Re: Ned Batchelder's hyphenate
On 7/9/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Ned just posted the code for the tabblo hyphenate filter in the public
> domain.
[snip]
> Thoughts?

Maybe an addition to django.contrib.humanize?

Jacob
Ned Batchelder's hyphenate
Ned just posted the code for the tabblo hyphenate filter in the public domain. This should be added as a builtin django filter with proper attribution. I don't think wordwrap should use it by default, and optional arguments don't work. I was thinking of just calling it 'hyphenate' or 'hyphenatedwordwrap'.

http://www.nedbatchelder.com/code/modules/hyphenate.html

Thoughts?