Re: Ned Batchelder's hyphenate

2007-07-13 Thread [EMAIL PROTECTED]



On Jul 13, 10:54 am, Christopher Lenz <[EMAIL PROTECTED]> wrote:
> Um, you do realize that you don't normally need any kind of
> hyphenation in web applications?
All depends on what you mean by 'web applications'.
This is definitely a good feature for mobile devices. Also, there are
many filters currently among the default filters that have no use
whatsoever for web services; wordwrap, center, and rjust have no purpose
there either.
I would argue that hyphenation is more useful than those, but again I
am biased.

At this point it looks like I will start it as a separate app, and
then if enough people find it useful, revisit the matter.
The problem is, as I have stated before, due to NDA/NC I can not work
on the natural language or translation aspects of it, only
integration. The pay job wins.

-Doug



--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---



Re: Ned Batchelder's hyphenate

2007-07-13 Thread Christopher Lenz

On 11.07.2007 at 07:49, [EMAIL PROTECTED] wrote:
> I welcome any and all help!!! I don't see this as anything crucial or
> time sensitive.
> For my part I want this feature for a project, and at worst can
> include the code as an app there. I just believe it should be part of
> the django distribution instead of some third party addon.

Um, you do realize that you don't normally need any kind of  
hyphenation in web applications?

Personally, while I love that Ned's coded this thing because I happen  
to need it for one of my current projects, I don't see it as a need  
common enough for inclusion in Django, or even a Django contrib thing.

If you want to build on Ned's work and extend it so that it's  
properly internationalized and all that, just make it a separate  
project. Most of the time, the projects where you may need  
hyphenation aren't even web applications, so including the  
functionality in Django would be rather inconvenient for those. And  
doing this stuff as a separate project does not in any way preclude  
using it in a Django app.

Just my 2 cents.

Cheers,
Chris
--
Christopher Lenz
   cmlenz at gmx.de
   http://www.cmlenz.net/





Re: Ned Batchelder's hyphenate

2007-07-10 Thread [EMAIL PROTECTED]



On Jul 10, 11:12 pm, Malcolm Tredinnick <[EMAIL PROTECTED]>
wrote:
> Don't decide that this hinges on "fully internationalize humanize or it
> shouldn't go there". Incremental changes are good.
agreed.

>
> > There are four reasons why I feel it is better to have this as part of
> > the core:
> > 1. Hyphenation is a media standard and crucial for non-html templates.
> > Sites which want to generate printable PDF's of say conference
> > programs, or in a standard news media style will want this as much as
> > they want pluralize, widthratio, rjust, and center. This is more than
> > a template filter, but is a text utility.
>
> Not seeing why that does or doesn't support your argument. It's not
> something you need all the time (more appropriate to print layout than
> HTML, as a rule), so including it by default, given that HTML output is
> the common case, isn't a requirement (and saves on memory usage when it's
> not included, for example). Having it in contrib/ puts it exactly one
> import away.
wordwrap, center, rjust, and widthratio (for most uses) are more
appropriate for print layout than HTML. The proper way to implement
these in HTML is with CSS, yet they are all part of the existing
default filters. When it comes to the templates, this is just a
specialized form of wordwrap. If the argument is that this is more for
printed forms, is not of real general use for the most common HTML
generation, and takes up memory and adds bloat, then I question the
inclusion of these other filters, django.utils.text.wrap, and other
utilities as well. At least that was my point (and admittedly a weak
one :-) The astute will notice that I left off my fourth argument (it
was just too weak).

> > I will need help with the internationalization parts. I do not
> > have enough experience with the i18n system to make the proper
> > architectural decisions.
>
> I was thinking about this a bit yesterday. It shouldn't be too hard. I'm
> a few days away from implementing anything, since I'm not going to
> instantly bump this to the top of my list, but it's a solvable problem.
I welcome any and all help!!! I don't see this as anything crucial or
time sensitive.
For my part I want this feature for a project, and at worst can
include the code as an app there. I just believe it should be part of
the django distribution instead of some third party addon.

-Doug





Re: Ned Batchelder's hyphenate

2007-07-10 Thread Malcolm Tredinnick

On Tue, 2007-07-10 at 18:26 +, [EMAIL PROTECTED] wrote:
> 
> On Jul 9, 10:48 pm, "Jacob Kaplan-Moss" <[EMAIL PROTECTED]>
> wrote:
> > Maybe an addition to django.contrib.humanize?
> 
> If we decide to only support English, then I am fine with including
> this as part of django.contrib.humanize.
> If we decide to properly internationalize humanize, then I am fine
> with that as well. (you don't use commas in German, you use periods
> for instance).

Don't decide that this hinges on "fully internationalize humanize or it
shouldn't go there". Incremental changes are good.

> There are four reasons why I feel it is better to have this as part of
> the core:
> 1. Hyphenation is a media standard and crucial for non-html templates.
> Sites which want to generate printable PDF's of say conference
> programs, or in a standard news media style will want this as much as
> they want pluralize, widthratio, rjust, and center. This is more than
> a template filter, but is a text utility.

Not seeing why that does or doesn't support your argument. It's not
something you need all the time (more appropriate to print layout than
HTML, as a rule), so including it by default, given that HTML output is
the common case, isn't a requirement (and saves on memory usage when it's
not included, for example). Having it in contrib/ puts it exactly one
import away.

> 2. reduce duplication of code and confusion
> The actual code being duplicated is extremely minimal, but having two
> text wrappers in very different locations is confusing to both
> developers and users. For template filters, it would be better to have
> them documented together.
> 
> 3. Internationalization
> To properly implement this we need to integrate with the
> internationalization code and have the core language developers help
> with maintaining the hyphenation rules. It does not feel DRY to have a
> separate internationalization system in humanize, and it does not seem
> right to have sections of the core only used by a contrib module
> (though this is done for admin).

This is based on a mistaken assumption, it looks like. Everything in
contrib is already supported by translators.

However, there's another consideration here, too: it's highly unlikely
that a normal translator will be able to maintain the hyphenation
databases. They are very technical data structures.

> 
> In the end if those wiser than I decide it should be in humanize I
> have no problem changing the patch and writing up the doc and unit
> tests. I will need help with the internationalization parts. I do not
> have enough experience with the i18n system to make the proper
> architectural decisions.

I was thinking about this a bit yesterday. It shouldn't be too hard. I'm
a few days away from implementing anything, since I'm not going to
instantly bump this to the top of my list, but it's a solvable problem.

For my money, if we include this, putting it in contrib somewhere feels
better. It will also make maintenance easier, since we can give Ned (or
designated sock puppet) commit access to that part of the tree for
ongoing bug fixes.

Regards,
Malcolm

-- 
The early bird may get the worm, but the second mouse gets the cheese. 
http://www.pointy-stick.com/blog/





Re: Ned Batchelder's hyphenate

2007-07-10 Thread [EMAIL PROTECTED]


On Jul 9, 10:48 pm, "Jacob Kaplan-Moss" <[EMAIL PROTECTED]>
wrote:
> Maybe an addition to django.contrib.humanize?

If we decide to only support English, then I am fine with including
this as part of django.contrib.humanize.
If we decide to properly internationalize humanize, then I am fine
with that as well. (you don't use commas in German, you use periods
for instance).

There are four reasons why I feel it is better to have this as part of
the core:
1. Hyphenation is a media standard and crucial for non-html templates.
Sites which want to generate printable PDF's of say conference
programs, or in a standard news media style will want this as much as
they want pluralize, widthratio, rjust, and center. This is more than
a template filter, but is a text utility.

2. reduce duplication of code and confusion
The actual code being duplicated is extremely minimal, but having two
text wrappers in very different locations is confusing to both
developers and users. For template filters, it would be better to have
them documented together.

3. Internationalization
To properly implement this we need to integrate with the
internationalization code and have the core language developers help
with maintaining the hyphenation rules. It does not feel DRY to have a
separate internationalization system in humanize, and it does not seem
right to have sections of the core only used by a contrib module
(though this is done for admin).

In the end if those wiser than I decide it should be in humanize I
have no problem changing the patch and writing up the doc and unit
tests. I will need help with the internationalization parts. I do not
have enough experience with the i18n system to make the proper
architectural decisions. For the translated text, the wrapping should
use the locale middleware specified hyphenation rules. For text which
has not been translated, it should use the native LANGUAGE_CODE rules.
Not sure how to get that working.
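
One way the language choice described above could be sketched in Python is
shown below. This is only an illustration, not part of the patch: the
function name `pick_pattern_language`, its parameters, and the idea of a set
of "available" pattern languages are all assumptions. In a Django project the
active language would typically come from
`django.utils.translation.get_language()` and the default from
`settings.LANGUAGE_CODE`; here they are passed in as plain strings so the
sketch runs on its own.

```python
def pick_pattern_language(translated, active_language, default_language, available):
    """Pick which hyphenation rules to use for a piece of text.

    translated       -- True if the text went through the translation system
    active_language  -- language activated for the request, e.g. "de-at"
    default_language -- the project's LANGUAGE_CODE, e.g. "en-us"
    available        -- set of language codes that have hyphenation data
                        (hypothetical; no such registry exists in Django)
    Returns a language code from `available`, or None if nothing matches.
    """
    # Translated text prefers the active locale and falls back to the default;
    # untranslated text only ever uses the project's default language.
    preferred = [active_language, default_language] if translated else [default_language]
    for code in preferred:
        if not code:
            continue
        code = code.lower()
        # Try the full code first ("de-at"), then the base language ("de").
        for candidate in (code, code.split("-")[0]):
            if candidate in available:
                return candidate
    return None
```

Returning None leaves it to the caller to decide what to do when no rules
exist for a language; falling back to plain word wrapping would be one
reasonable choice.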

-Doug






Re: Ned Batchelder's hyphenate

2007-07-10 Thread Ned Batchelder
Since the algorithm is identical to the one used by TeX, the hyphenation 
data can be taken from there as well.  I used a TeX distribution to get 
the latest patterns for English to include in the module.  I installed 
MiKTeX, and dug around in the tex/generic/hyphen directory to find 
them.  There are also French and German patterns in that distro, and 
there may be other hyphenation data sets in other repositories on the 
web, I haven't looked.
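
For readers unfamiliar with the TeX approach, here is a minimal Python sketch
of pattern-based hyphenation in that style. It is not the code from the
module: the names `parse_pattern` and `hyphenate`, the `min_left` and
`min_right` limits, and the tiny pattern list are assumptions chosen for
illustration, and a real TeX pattern file contains thousands of entries. Each
pattern puts digit weights on the gaps between letters; for every gap in the
word the highest weight from any matching pattern wins, and an odd result
marks a permitted break.

```python
import re


def parse_pattern(pattern):
    """Split a TeX-style pattern such as 'hen5at' into letters and gap weights."""
    letters = re.sub(r"[0-9]", "", pattern)
    # Splitting on the letters leaves one (possibly empty) digit string per gap:
    # before each letter and after the last one.
    weights = [int(digit or 0) for digit in re.split(r"[.a-z]", pattern)]
    return letters, weights


def hyphenate(word, patterns, min_left=2, min_right=2):
    """Return the word split into pieces at the permitted break points."""
    work = "." + word.lower() + "."  # dots mark the word boundaries
    points = [0] * (len(work) + 1)   # points[i] is the gap before work[i]
    for start in range(len(work)):
        for end in range(start + 1, len(work) + 1):
            weights = patterns.get(work[start:end])
            if weights:
                for offset, weight in enumerate(weights):
                    points[start + offset] = max(points[start + offset], weight)
    pieces = []
    last = 0
    # The gap before word[k] is points[k + 1] because of the leading dot.
    for k in range(min_left, len(word) - min_right + 1):
        if points[k + 1] % 2 == 1:   # odd weight: a break is allowed here
            pieces.append(word[last:k])
            last = k
    pieces.append(word[last:])
    return pieces


# A handful of illustrative patterns, just enough for one word.
PATTERNS = dict(
    parse_pattern(p)
    for p in "hy3ph he2n hena4 hen5at 1na n2at 1tio 2io o2n".split()
)
```

With these patterns, `"-".join(hyphenate("hyphenation", PATTERNS))` gives
`hy-phen-ation`. Supporting another language in this scheme means loading a
different pattern set; the matching code stays the same.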

--Ned.

[EMAIL PROTECTED] wrote:
> On further reflection, there is a huge internationalization issue
> here. The hyphenation rules and data driven exceptions are English
> specific. Some will work (minimally) for other languages, but are not
> good enough. Proper integration will be required, and language
> developers will need to have more knowledge about this corner domain.
> Due to my NDA/NC I cannot work on that part of it, but I do have a
> patch almost ready for django.utils.text.wordwrap to take an optional
> boolean argument to do word hyphenation.
>
> Thankfully it is data driven and getting the data from the .po should
> not be too difficult. The problem will be getting the initial data.
>
> On Jul 9, 10:30 pm, "[EMAIL PROTECTED]"
> <[EMAIL PROTECTED]> wrote:
>   
>> Ned just posted the code for the tabblo hyphenate filter in the public
>> domain. This should be added as a builtin django filter with proper
>> attribution. I don't think wordwrap should use it by default, and
>> optional arguments don't work. I was thinking of just calling it
>> 'hyphenate' or 'hyphenatedwordwrap'.
>>
>> http://www.nedbatchelder.com/code/modules/hyphenate.html
>>
>> Thoughts?

-- 
Ned Batchelder, http://nedbatchelder.com





Re: Ned Batchelder's hyphenate

2007-07-10 Thread Ned Batchelder
Todd, good to meet a fellow nerd: I also have the five-volume hardcover 
set. My code is implemented from appendix H of volume 1 (or is it volume 
A?).

--Ned.

Todd O'Bryan wrote:
> On Tue, 2007-07-10 at 02:54 +, [EMAIL PROTECTED] wrote:
>   
>> On further reflection, there is a huge internationalization issue
>> here. The hyphenation rules and data driven exceptions are English
>> specific. Some will work (minimally) for other languages, but are not
>> good enough. Proper integration will be required, and language
>> developers will need to have more knowledge about this corner domain.
>> Due to my NDA/NC I cannot work on that part of it, but I do have a
>> patch almost ready for django.utils.text.wordwrap to take an optional
>> boolean argument to do word hyphenation.
>> 
>
> I seem to remember that Knuth did a pretty amazing job with hyphenation
> in TeX, most of it was algorithmic, and there were hyphenation engines
> for at least a few languages.
>
> I'll have to dig out my copy of The TeXbook to look (yes, I'm one of
> those nerds who has the five-volume box set of TeX and MetaFont books),
> but this may be something that somebody's already done a really good job
> on.
>
> Todd
>

-- 
Ned Batchelder, http://nedbatchelder.com





Re: Ned Batchelder's hyphenate

2007-07-09 Thread Russell Keith-Magee

On 7/10/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> Ticket with initial patch made: http://code.djangoproject.com/ticket/4821
>
> It still needs documentation and unit testing (and
> internationalization) but it is a start. Will try to get to the doc
> and test this weekend.

+1 from me.

However, like Jacob said, I think it should be in contrib.humanize,
not utils.text.

Yours,
Russ %-)




Re: Ned Batchelder's hyphenate

2007-07-09 Thread [EMAIL PROTECTED]

Ticket with initial patch made: http://code.djangoproject.com/ticket/4821

It still needs documentation and unit testing (and
internationalization) but it is a start. Will try to get to the doc
and test this weekend.

-Doug

On Jul 9, 10:30 pm, "[EMAIL PROTECTED]"
<[EMAIL PROTECTED]> wrote:
> Ned just posted the code for the tabblo hyphenate filter in the public
> domain. This should be added as a builtin django filter with proper
> attribution. I don't think wordwrap should use it by default, and
> optional arguments don't work. I was thinking of just calling it
> 'hyphenate' or 'hyphenatedwordwrap'.
>
> http://www.nedbatchelder.com/code/modules/hyphenate.html
>
> Thoughts?





Re: Ned Batchelder's hyphenate

2007-07-09 Thread [EMAIL PROTECTED]



On Jul 9, 11:59 pm, "Tom Tobin" <[EMAIL PROTECTED]> wrote:
> I'm not sure what you mean by this; "public domain" means anyone can
> do pretty much whatever they want with it, without restriction.
I mean he wanted his code in the public domain with working data, so
that restricted him to data which was also in the public domain, and
not LaTeX data under a BSD or other license. NOTE: this is just me
making wild assumptions at this point.






Re: Ned Batchelder's hyphenate

2007-07-09 Thread Tom Tobin

On 7/9/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> The code Ned put up contains data from the public domain and
> was most likely restricted due to that.

I'm not sure what you mean by this; "public domain" means anyone can
do pretty much whatever they want with it, without restriction.




Re: Ned Batchelder's hyphenate

2007-07-09 Thread [EMAIL PROTECTED]



On Jul 9, 11:07 pm, "Todd O'Bryan" <[EMAIL PROTECTED]> wrote:
> I seem to remember that Knuth did a pretty amazing job with hyphenation
> in TeX, most of it was algorithmic, and there were hyphenation engines
> for at least a few languages.
Ned's implementation is taken directly from this, and I hope you are
correct. The code Ned put up contains data from the public domain and
was most likely restricted due to that.





Re: Ned Batchelder's hyphenate

2007-07-09 Thread Todd O'Bryan

On Tue, 2007-07-10 at 02:54 +, [EMAIL PROTECTED] wrote:
> On further reflection, there is a huge internationalization issue
> here. The hyphenation rules and data driven exceptions are English
> specific. Some will work (minimally) for other languages, but are not
> good enough. Proper integration will be required, and language
> developers will need to have more knowledge about this corner domain.
> Due to my NDA/NC I cannot work on that part of it, but I do have a
> patch almost ready for django.utils.text.wordwrap to take an optional
> boolean argument to do word hyphenation.

I seem to remember that Knuth did a pretty amazing job with hyphenation
in TeX, most of it was algorithmic, and there were hyphenation engines
for at least a few languages.

I'll have to dig out my copy of The TeXbook to look (yes, I'm one of
those nerds who has the five-volume box set of TeX and MetaFont books),
but this may be something that somebody's already done a really good job
on.

Todd





Re: Ned Batchelder's hyphenate

2007-07-09 Thread [EMAIL PROTECTED]

On further reflection, there is a huge internationalization issue
here. The hyphenation rules and data driven exceptions are English
specific. Some will work (minimally) for other languages, but are not
good enough. Proper integration will be required, and language
developers will need to have more knowledge about this corner domain.
Due to my NDA/NC I cannot work on that part of it, but I do have a
patch almost ready for django.utils.text.wordwrap to take an optional
boolean argument to do word hyphenation.
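
To make the idea concrete, here is a rough Python sketch of a word wrapper
that can optionally break words at hyphenation points. It is not the patch
itself and not Django's implementation: the names `wrap_with_hyphens` and
`split_to_fit` are made up for illustration, and instead of a boolean flag
the sketch takes an optional `hyphenate` callable that is assumed to return
a word as a list of pieces.

```python
def split_to_fit(current, pieces, width):
    """Find the longest hyphenated head of `pieces` that fits on the line.

    Returns (line, remainder), or (None, None) if no break point fits.
    """
    prefix = current + " " if current else ""
    # Try the longest prefix of the word first, then shorter ones.
    for cut in range(len(pieces) - 1, 0, -1):
        head = prefix + "".join(pieces[:cut]) + "-"
        if len(head) <= width:
            return head, "".join(pieces[cut:])
    return None, None


def wrap_with_hyphens(text, width, hyphenate=None):
    """Greedy word wrap; if `hyphenate` is given, split words that overflow."""
    lines = []
    current = ""
    for word in text.split():
        candidate = current + " " + word if current else word
        if len(candidate) <= width:
            current = candidate
            continue
        # The word does not fit on the current line.
        if hyphenate is not None:
            head, tail = split_to_fit(current, hyphenate(word), width)
            if head is not None:
                lines.append(head)
                current = tail
                continue
        if current:
            lines.append(current)
        current = word  # may still exceed width; plain wrapping has the same limit
    if current:
        lines.append(current)
    return "\n".join(lines)
```

Without the `hyphenate` argument this behaves like an ordinary greedy
wrapper, so existing callers would see no change in output.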

Thankfully it is data driven and getting the data from the .po should
not be too difficult. The problem will be getting the initial data.

On Jul 9, 10:30 pm, "[EMAIL PROTECTED]"
<[EMAIL PROTECTED]> wrote:
> Ned just posted the code for the tabblo hyphenate filter in the public
> domain. This should be added as a builtin django filter with proper
> attribution. I don't think wordwrap should use it by default, and
> optional arguments don't work. I was thinking of just calling it
> 'hyphenate' or 'hyphenatedwordwrap'.
>
> http://www.nedbatchelder.com/code/modules/hyphenate.html
>
> Thoughts?





Re: Ned Batchelder's hyphenate

2007-07-09 Thread Jacob Kaplan-Moss

On 7/9/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Ned just posted the code for the tabblo hyphenate filter in the public
> domain.
[snip]
> Thoughts?

Maybe an addition to django.contrib.humanize?

Jacob




Ned Batchelder's hyphenate

2007-07-09 Thread [EMAIL PROTECTED]

Ned just posted the code for the tabblo hyphenate filter in the public
domain. This should be added as a builtin django filter with proper
attribution. I don't think wordwrap should use it by default, and
optional arguments don't work. I was thinking of just calling it
'hyphenate' or 'hyphenatedwordwrap'.

http://www.nedbatchelder.com/code/modules/hyphenate.html

Thoughts?

