On Thu, 19 Nov 2009 12:23:17 +0100
Alexander Prinsier wrote:
> On 11/19/2009 12:11 PM, Paul Cockings wrote:
> > This sounds like a topic for 4.0 something or later.
>
> Yeah it's not for the current release :) No worries ;)
>
Paul is hot on getting 3.9.0 out. :)
btw: When do we consider 3.9.0 to
On Thu, 19 Nov 2009 11:11:13 +
Paul Cockings wrote:
> Alexander Prinsier wrote:
> >
> > Well Chinese people just list all western languages as spam... Not many
> > people speak both categories. Anyway, at least that's how I do it now,
> > and how many people could probably do it :)
> >
> >
On Thu, 19 Nov 2009 12:05:50 +0100
Alexander Prinsier wrote:
> Well Chinese people just list all western languages as spam... Not many
> people speak both categories. Anyway, at least that's how I do it now,
> and how many people could probably do it :)
>
> So ok, dspam is used in Asia :) When
On 11/19/2009 12:11 PM, Paul Cockings wrote:
> This sounds like a topic for 4.0 something or later.
Yeah it's not for the current release :) No worries ;)
Alexander
Alexander Prinsier wrote:
Well Chinese people just list all western languages as spam... Not many
people speak both categories. Anyway, at least that's how I do it now,
and how many people could probably do it :)
So ok, dspam is used in Asia :) When I find time I'll take a look at ICU.
Alex
On 11/19/2009 11:58 AM, Stevan Bajić wrote:
>>> Alexander. What would you say about adding ICU and that character handling
>>> into DSPAM? You seem to be capable of doing it. Would be a nice thing to do. I
>>> would not mind if you would take that task :)
>>
>> I think the performance penalty would
On Thu, 19 Nov 2009 11:12:33 +0100
Alexander Prinsier wrote:
> On 11/19/2009 01:45 AM, Stevan Bajić wrote:
> There's also IBM's ICU (open source library) if you need something
> heavier.
>
> > Alexander. What would you say about adding ICU and that character handling
> > into DSPAM?
On 11/19/2009 01:45 AM, Stevan Bajić wrote:
There's also IBM's ICU (open source library) if you need something heavier.
> Alexander. What would you say about adding ICU and that character handling
> into DSPAM? You seem to be capable of doing it. Would be a nice thing to do. I
> would not mind
On Thu, 19 Nov 2009 01:08:38 +0100
Alexander Prinsier wrote:
> >> There's also IBM's ICU (open source library) if you need something heavier.
> >>
Alexander. What would you say about adding ICU and that character handling into
DSPAM? You seem to be capable of doing it. Would be a nice thing to do.
On Thu, 19 Nov 2009 01:08:38 +0100
Alexander Prinsier wrote:
> On 11/18/2009 10:30 PM, Stevan Bajić wrote:
> >> So you mean, you can break cyrillic/slavic at spaces too like Western
> >> languages? So then it'll work? You just break everything you know at
> >> spaces, and what you don't know, lik
On 11/18/2009 10:30 PM, Stevan Bajić wrote:
>> So you mean, you can break cyrillic/slavic at spaces too like Western
>> languages? So then it'll work? You just break everything you know at
>> spaces, and what you don't know, like Chinese, at UTF32 code points.
>>
> No. I did not say that. You said:
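
For readers following the thread, here is a minimal sketch of the idea quoted above: split at whitespace for scripts that use spaces (Latin, Cyrillic and so on), and emit one token per code point for scripts that do not. This is not DSPAM code. The function names `is_unsegmented` and `tokenize` are hypothetical, and the block ranges are a rough assumption covering only a few CJK blocks. A library such as ICU would do proper word-boundary analysis.

```python
from typing import List


def is_unsegmented(ch: str) -> bool:
    """Rough assumption: a few CJK blocks stand in for 'scripts without spaces'."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3400 <= cp <= 0x4DBF   # CJK Unified Ideographs Extension A
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7AF   # Hangul Syllables
    )


def tokenize(text: str) -> List[str]:
    """Split on whitespace, but emit each unsegmented-script code point alone."""
    tokens: List[str] = []
    word: List[str] = []

    def flush() -> None:
        # Emit the pending space-delimited word, if any.
        if word:
            tokens.append("".join(word))
            word.clear()

    # A Python str iterates by Unicode code point, i.e. the same unit
    # one would get from walking a UTF-32 buffer 4 bytes at a time.
    for ch in text:
        if is_unsegmented(ch):
            flush()
            tokens.append(ch)
        elif ch.isspace():
            flush()
        else:
            word.append(ch)
    flush()
    return tokens


if __name__ == "__main__":
    print(tokenize("Stevan Bajić 你好 spam"))
```

Slavic text is split at spaces just like Western text here, so names such as "Bajić" stay intact as one token. Only the code points in the listed blocks are split individually.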
On Wed, 18 Nov 2009 22:18:40 +0100
Alexander Prinsier wrote:
> On 11/18/2009 09:53 PM, Stevan Bajić wrote:
> >> Then do what you used to do for Western languages: tokenize using spaces
> >> as separators. For other languages split every 4 bytes.
> >>
> > Not going to work. You see my name? It's s
On Wed, 18 Nov 2009 15:16:44 -0600
Kenneth Marshall wrote:
> On Wed, Nov 18, 2009 at 10:09:16PM +0100, Stevan Bajić wrote:
> > On Wed, 18 Nov 2009 14:29:53 -0600
> > Kenneth Marshall wrote:
> >
> > > > Alexander
> > > >
> > > I thought that UTF8, UTF-16 and UTF-32 can represent all the charac
On 11/18/2009 09:53 PM, Stevan Bajić wrote:
>> Then do what you used to do for Western languages: tokenize using spaces
>> as separators. For other languages split every 4 bytes.
>>
> Not going to work. You see my name? It's slavic. And I am able to write in
> other languages (9 of them) and in ot
On Wed, Nov 18, 2009 at 10:09:16PM +0100, Stevan Bajić wrote:
> On Wed, 18 Nov 2009 14:29:53 -0600
> Kenneth Marshall wrote:
>
> > > Alexander
> > >
> > I thought that UTF8, UTF-16 and UTF-32 can represent all the characters.
> > In that case, why wouldn't you use the UTF8 equivalent? At the le
On Wed, 18 Nov 2009 14:29:53 -0600
Kenneth Marshall wrote:
> > Alexander
> >
> I thought that UTF8, UTF-16 and UTF-32 can represent all the characters.
> In that case, why wouldn't you use the UTF8 equivalent? At the least it
> would save space.
>
* UTF-16 and UTF-32 are not widely used.
* Wron
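
As a side note on the "would save space" question, the small sketch below compares how many bytes the same text takes in each encoding. It is only an illustration, not anything from DSPAM. The helper name `encoded_sizes` is hypothetical, and the little-endian codecs are used so that no byte order mark is counted.

```python
from typing import Dict


def encoded_sizes(text: str) -> Dict[str, int]:
    """Return the byte length of text in UTF-8, UTF-16 and UTF-32 (no BOM)."""
    codecs = (("UTF-8", "utf-8"), ("UTF-16", "utf-16-le"), ("UTF-32", "utf-32-le"))
    return {name: len(text.encode(codec)) for name, codec in codecs}


if __name__ == "__main__":
    for sample in ("spam", "Bajić", "你好"):
        print(sample, encoded_sizes(sample))
```

For Latin-script text UTF-8 is the smallest of the three. For the Chinese sample UTF-16 is smaller than UTF-8, so the space argument depends on what kind of mail is being processed.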
On Wed, 18 Nov 2009 21:20:58 +0100
Alexander Prinsier wrote:
> Hello,
>
Hallo Alexander,
> I'm separating the discussion about handling non-Western languages here.
>
> One solution, which is what is used by for example xml parsers, and
> other kinds of software which want to do the right thi
On Wed, Nov 18, 2009 at 09:20:58PM +0100, Alexander Prinsier wrote:
> Hello,
>
> I'm separating the discussion about handling non-Western languages here.
>
> One solution, which is what is used by for example xml parsers, and
> other kinds of software which want to do the right thing (tm) at all