On Thu, 19 Nov 2009 12:23:17 +0100
Alexander Prinsier wrote:
> On 11/19/2009 12:11 PM, Paul Cockings wrote:
> > This sounds like a topic for 4.0 something or later.
>
> Yeah it's not for the current release :) No worries ;)
>
Paul is hot getting 3.9.0 out. :)
btw: When do we consider 3.9.0 to
On Thu, 19 Nov 2009 11:11:13 +
Paul Cockings wrote:
> Alexander Prinsier wrote:
> >
> > Well Chinese people just list all western languages as spam... Not many
> > people speak both categories. Anyway, at least that's how I do it now,
> > and how many people could probably do it :)
> >
> >
On Thu, 19 Nov 2009 12:05:50 +0100
Alexander Prinsier wrote:
> Well Chinese people just list all western languages as spam... Not many
> people speak both categories. Anyway, at least that's how I do it now,
> and how many people could probably do it :)
>
> So ok, dspam is used in Asia :) When
On 11/19/2009 12:11 PM, Paul Cockings wrote:
> This sounds like a topic for 4.0 something or later.
Yeah it's not for the current release :) No worries ;)
Alexander
Alexander Prinsier wrote:
Well Chinese people just list all western languages as spam... Not many
people speak both categories. Anyway, at least that's how I do it now,
and how many people could probably do it :)
So ok, dspam is used in Asia :) When I find time I'll take a look at ICU.
Alex
On 11/19/2009 11:58 AM, Stevan Bajić wrote:
>>> Alexander. What would you say about adding ICU and that character handling
>>> into DSPAM? You seem to be capable to do it. Would be a nice thing to do. I
>>> would not mind if you would take that task :)
>>
>> I think the performance penalty would
On Thu, 19 Nov 2009 11:12:33 +0100
Alexander Prinsier wrote:
> On 11/19/2009 01:45 AM, Stevan Bajić wrote:
> There's also IBM's ICU (open source library) if you need something
> heavier.
>
> > Alexander. What would you say about adding ICU and that character handling
> > into DSPAM?
On 11/19/2009 01:45 AM, Stevan Bajić wrote:
There's also IBM's ICU (open source library) if you need something heavier.
> Alexander. What would you say about adding ICU and that character handling
> into DSPAM? You seem to be capable to do it. Would be a nice thing to do. I
> would not mind
On Thu, 19 Nov 2009 01:08:38 +0100
Alexander Prinsier wrote:
> >> There's also IBM's ICU (open source library) if you need something heavier.
> >>
Alexander. What would you say about adding ICU and that character handling into
DSPAM? You seem to be capable to do it. Would be a nice thing to do.
On Thu, 19 Nov 2009 01:08:38 +0100
Alexander Prinsier wrote:
> On 11/18/2009 10:30 PM, Stevan Bajić wrote:
> >> So you mean, you can break cyrillic/slavic at spaces too like Western
> >> languages? So then it'll work? You just break everything you know at
> >> spaces, and what you don't know, lik
On 11/18/2009 10:30 PM, Stevan Bajić wrote:
>> So you mean, you can break cyrillic/slavic at spaces too like Western
>> languages? So then it'll work? You just break everything you know at
>> spaces, and what you don't know, like Chinese, at UTF32 code points.
>>
> No. I did not say that. You said:
On Wed, 18 Nov 2009 22:18:40 +0100
Alexander Prinsier wrote:
> On 11/18/2009 09:53 PM, Stevan Bajić wrote:
> >> Then do what you used to do for Western languages: tokenize using spaces
> >> as separators. For other languages split every 4 bytes.
> >>
> > Not going to work. You see my name? It's s
On Wed, 18 Nov 2009 15:16:44 -0600
Kenneth Marshall wrote:
> On Wed, Nov 18, 2009 at 10:09:16PM +0100, Stevan Bajić wrote:
> > On Wed, 18 Nov 2009 14:29:53 -0600
> > Kenneth Marshall wrote:
> >
> > > > Alexander
> > > >
> > > I thought that UTF8, UTF-16 and UTF-32 can represent all the charac
On 11/18/2009 09:53 PM, Stevan Bajić wrote:
>> Then do what you used to do for Western languages: tokenize using spaces
>> as separators. For other languages split every 4 bytes.
>>
> Not going to work. You see my name? It's slavic. And I am able to write in
> other languages (9 of them) and in ot
On Wed, Nov 18, 2009 at 10:09:16PM +0100, Stevan Bajić wrote:
> On Wed, 18 Nov 2009 14:29:53 -0600
> Kenneth Marshall wrote:
>
> > > Alexander
> > >
> > I thought that UTF8, UTF-16 and UTF-32 can represent all the characters.
> > In that case, why wouldn't you use the UTF8 equivalent? At the le
On Wed, 18 Nov 2009 14:29:53 -0600
Kenneth Marshall wrote:
> > Alexander
> >
> I thought that UTF8, UTF-16 and UTF-32 can represent all the characters.
> In that case, why wouldn't you use the UTF8 equivalent? At the least it
> would save space.
>
* UTF-16 and UTF-32 are not widely used.
* Wron
On Wed, 18 Nov 2009 21:20:58 +0100
Alexander Prinsier wrote:
> Hello,
>
Hallo Alexander,
> I'm separating the discussion about handling non-Western languages here.
>
> One solution, which is what is used by for example xml parsers, and
> other kinds of software which want to do the right thi
On Wed, Nov 18, 2009 at 09:20:58PM +0100, Alexander Prinsier wrote:
> Hello,
>
> I'm separating the discussion about handling non-Western languages here.
>
> One solution, which is what is used by for example xml parsers, and
> other kinds of software which want to do the right thing (tm) at all
Hello,
I'm separating the discussion about handling non-Western languages here.
One solution, which is what is used by for example xml parsers, and
other kinds of software which want to do the right thing (tm) at all
costs, is:
Read in the message, using its encoding-type. Html, Xml, but also