Re: [Pharo-users] NeoNumberParser and localization

2016-07-05 Thread Sven Van Caekenberghe

> On 05 Jul 2016, at 16:40, Peter Uhnák  wrote:
> 
> I know that only NeoCSV uses it; that's how I ran into this problem. I was 
> processing some (Czech) CSV files which used the decimal comma separator. 
> However, the numbers were silently truncated, which wasn't nice, to say the 
> least. I really don't understand why the default behavior is to silently 
> change the value rather than produce an error. This also applies to Pharo's 
> number parser.
> 
> > BTW, you not only need to set the thousands separator, but the decimal 
> > separator too, I guess.
> 
> Depending on the default values, but that's really not the main point.
> 
> > Now, I can understand where/how your suggestions would make sense. Maybe you 
> > can try subclassing and make your own variant (first) ?
> 
> Well, I would need a way to configure the CSV parser, because I am certainly 
> not interested in manually transforming every float field. I just want to 
> configure it in one place and use the regular addFloatField; after all, the 
> file is going to be consistent in its format.
> 
> Btw, there are other options for improvement, like configuring the default 
> date field and then having addDateField, etc. But maybe that's just 
> overloading the NeoCSV parser. In any case, it's food for thought.

Indeed, I do not want to overload the CSV parser, it is pretty simple right now.

The conversions are all in the convenience protocol for a reason: they just 
save you some typing. You really ought to do your own conversions, when you 
need to.

  parser addFieldConverter: [ :string | MyNumberParser parse: string ]

There are too many formats out there (especially for dates/times).
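
As a rough sketch of doing your own conversion for decimal-comma data: a single 
converter block can be defined once and reused for every float field. The 
sample input and the variable name commaFloat are made up for illustration, 
and the sketch assumes a semicolon-separated file in which swapping the comma 
for a dot is good enough.

```smalltalk
| commaFloat reader |

"Hypothetical helper block: swap the decimal comma for a dot,
 then hand the result to the stock NeoNumberParser."
commaFloat := [ :string |
	NeoNumberParser parse: (string copyReplaceAll: ',' with: '.') ].

"Made-up sample data: semicolon-separated, decimal comma."
reader := NeoCSVReader on: 'widget;12,5;3,25' readStream.
reader
	separator: $;;
	addField;
	addFieldConverter: commaFloat;
	addFieldConverter: commaFloat.

reader upToEnd
```

This keeps the format decision in one place without adding anything to the 
reader itself.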

You are right about truncation and error handling. But parsing and enforcing a 
syntax are two different things. That is why I think the thousands separator 
option is not that simple. Consider:

  1,000.00
  10,00.00
  1,0,0,0.00
  1,000.00E1000,000

You see? One quick and dirty solution would be to just remove $, or replace 
one character by another before parsing.
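
A minimal sketch of that quick and dirty approach, using plain String methods 
before handing the text to the parser. The sample strings are only 
illustrations. Nothing here validates the input, so the oddly grouped 
examples above would be accepted without complaint.

```smalltalk
"Drop every comma (US-style thousands separator), then parse."
NeoNumberParser parse: ('1,000.00' copyWithout: $,).

"Replace one character by another (decimal comma to dot), then parse."
NeoNumberParser parse: ('12,230' copyReplaceAll: ',' with: '.').

"No syntax is enforced: this odd grouping is stripped to '1000.00' as well."
NeoNumberParser parse: ('1,0,0,0.00' copyWithout: $,).
```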

> Peter
> 
> 
> On Tue, Jul 5, 2016 at 2:34 PM, Sven Van Caekenberghe  wrote:
> Peter,
> 
> NeoNumberParser is a simple number (integer/float) parser that is part of 
> NeoCSV (it was based on the JSON number parsing code). It was added because I 
> wanted a number parser that makes few demands on the stream it parses from 
> (just 1 character peek ahead, no arbitrary backtracking, limited API). It was 
> not meant to be very powerful.
> 
> If you check the references, you see that where it is used in NeoCSVReader, 
> you could easily substitute another parser.
> 
> Now, I can understand where/how your suggestions would make sense. Maybe you 
> can try subclassing and make your own variant (first) ? BTW, you not only 
> need to set the thousands separator, but the decimal separator too, I guess.
> 
> Sven
> 
> > On 05 Jul 2016, at 14:17, Peter Uhnák  wrote:
> >
> > Hi,
> >
> > Is there any plan for NeoNumberParser to add localization support?
> >
> > e.g.
> >
> > NeoNumberParser new
> >     thousandsSeparator: $,; "common in US data"
> >     parse: '12,230'
> >
> > =>
> >
> > 12230
> >
> > NeoNumberParser new
> >     decimalSeparator: $,; "common in EU data"
> >     parse: '12,230'
> >
> > =>
> >
> > 12.230
> >
> > Thanks,
> > Peter




Re: [Pharo-users] NeoNumberParser and localization

2016-07-05 Thread Peter Uhnák
I know that only NeoCSV uses it; that's how I ran into this problem. I was
processing some (Czech) CSV files which used the decimal comma separator.
However, the numbers were silently truncated, which wasn't nice, to say the
least. I really don't understand why the default behavior is to silently
change the value rather than produce an error. This also applies to Pharo's
number parser.

> BTW, you not only need to set the thousands separator, but the decimal
> separator too, I guess.

Depending on the default values, but that's really not the main point.

> Now, I can understand where/how your suggestions would make sense. Maybe
> you can try subclassing and make your own variant (first) ?

Well, I would need a way to configure the CSV parser, because I am certainly
not interested in manually transforming every float field. I just want to
configure it in one place and use the regular addFloatField; after all, the
file is going to be consistent in its format.

Btw, there are other options for improvement, like configuring the default
date field and then having addDateField, etc. But maybe that's just
overloading the NeoCSV parser. In any case, it's food for thought.

Peter


On Tue, Jul 5, 2016 at 2:34 PM, Sven Van Caekenberghe  wrote:

> Peter,
>
> NeoNumberParser is a simple number (integer/float) parser that is part of
> NeoCSV (it was based on the JSON number parsing code). It was added because
> I wanted a number parser that makes few demands on the stream it parses
> from (just 1 character peek ahead, no arbitrary backtracking, limited API).
> It was not meant to be very powerful.
>
> If you check the references, you see that where it is used in
> NeoCSVReader, you could easily substitute another parser.
>
> Now, I can understand where/how your suggestions would make sense. Maybe
> you can try subclassing and make your own variant (first) ? BTW, you not
> only need to set the thousands separator, but the decimal separator too, I
> guess.
>
> Sven
>
> > On 05 Jul 2016, at 14:17, Peter Uhnák  wrote:
> >
> > Hi,
> >
> > Is there any plan for NeoNumberParser to add localization support?
> >
> > e.g.
> >
> > NeoNumberParser new
> >     thousandsSeparator: $,; "common in US data"
> >     parse: '12,230'
> >
> > =>
> >
> > 12230
> >
> > NeoNumberParser new
> >     decimalSeparator: $,; "common in EU data"
> >     parse: '12,230'
> >
> > =>
> >
> > 12.230
> >
> > Thanks,
> > Peter


Re: [Pharo-users] NeoNumberParser and localization

2016-07-05 Thread Sven Van Caekenberghe
Peter,

NeoNumberParser is a simple number (integer/float) parser that is part of 
NeoCSV (it was based on the JSON number parsing code). It was added because I 
wanted a number parser that makes few demands on the stream it parses from 
(just 1 character peek ahead, no arbitrary backtracking, limited API). It was 
not meant to be very powerful.

If you check the references, you see that where it is used in NeoCSVReader, you 
could easily substitute another parser.

Now, I can understand where/how your suggestions would make sense. Maybe you 
can try subclassing and make your own variant (first) ? BTW, you not only need 
to set the thousands separator, but the decimal separator too, I guess.

Sven

> On 05 Jul 2016, at 14:17, Peter Uhnák  wrote:
> 
> Hi,
> 
> Is there any plan for NeoNumberParser to add localization support?
> 
> e.g.
> 
> NeoNumberParser new
>     thousandsSeparator: $,; "common in US data"
>     parse: '12,230'
> 
> => 
> 
> 12230
> 
> NeoNumberParser new
>     decimalSeparator: $,; "common in EU data"
>     parse: '12,230'
> 
> => 
> 
> 12.230
> 
> Thanks,
> Peter




[Pharo-users] NeoNumberParser and localization

2016-07-05 Thread Peter Uhnák
Hi,

Is there any plan for NeoNumberParser to add localization support?

e.g.

NeoNumberParser new
    thousandsSeparator: $,; "common in US data"
    parse: '12,230'

=>

12230

NeoNumberParser new
    decimalSeparator: $,; "common in EU data"
    parse: '12,230'

=>

12.230

Thanks,
Peter