Re: [HACKERS] Add Roman numeral conversion to to_number

2017-09-19 Thread Doug Doole
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Applied clean and runs fine. Previous comments were addressed.

There's been some discussion in the thread about whether this is worth doing, 
but given that it is done it seems a bit of a moot point. So passing this off 
to the committers to make a final call.

The new status of this patch is: Ready for Committer

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-09-17 Thread Christoph Berg
Re: Peter Eisentraut 2017-08-14 

> There are probably a bunch of Perl or Python modules that can be
> employed for this.

https://github.com/ChristophBerg/postgresql-numeral

Christoph


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-09-17 Thread Chris Travers
On Sun, Sep 17, 2017 at 6:43 PM, David Fetter  wrote:

> On Sat, Sep 16, 2017 at 10:42:49PM +, Douglas Doole wrote:
> > Oliver, I took a look at your tests and they look thorough to me.
> >
> > One recommendation, instead of having 3999 separate selects to test every
> > legal roman numeral, why not just do something like this:
> >
> > do $$
> > declare
> > i int;
> > rn text;
> > rn_val int;
> > begin
> > for i in 1..3999 loop
> > rn := trim(to_char(i, 'rn'));
> > rn_val := to_number(rn, 'rn');
> > if (i <> rn_val) then
> > raise notice 'Mismatch: i=% rn=% rn_val=%', i, rn, rn_val;
> > end if;
> > end loop;
> > raise notice 'Tested roman numerals 1..3999';
> > end;
> > $$;
> >
> > It's a lot easier to maintain than separate selects.
>
> Why not just one SELECT, as in:
>
> SELECT i, to_char(i, 'rn'), to_number(to_char(i, 'rn'), 'rn');
> FROM generate_series(1,3999) i
>

Question:  What is our definition of a legal Roman numeral?

For example sometimes IXX appears in the corpus to refer to 19 even though
our standardised notation would be XIX.

>
> Best,
> David.
> --
> David Fetter  http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-09-17 Thread David Fetter
On Sat, Sep 16, 2017 at 10:42:49PM +, Douglas Doole wrote:
> Oliver, I took a look at your tests and they look thorough to me.
> 
> One recommendation, instead of having 3999 separate selects to test every
> legal roman numeral, why not just do something like this:
> 
> do $$
> declare
> i int;
> rn text;
> rn_val int;
> begin
> for i in 1..3999 loop
> rn := trim(to_char(i, 'rn'));
> rn_val := to_number(rn, 'rn');
> if (i <> rn_val) then
> raise notice 'Mismatch: i=% rn=% rn_val=%', i, rn, rn_val;
> end if;
> end loop;
> raise notice 'Tested roman numerals 1..3999';
> end;
> $$;
> 
> It's a lot easier to maintain than separate selects.

Why not just one SELECT, as in:

SELECT i, to_char(i, 'rn'), to_number(to_char(i, 'rn'), 'rn');
FROM generate_series(1,3999) i

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-09-17 Thread Oleg Bartunov
On 3 Aug 2017 16:29, "Oliver Ford"  wrote:

Adds to the to_number() function the ability to convert Roman numerals
to a number. This feature is on the formatting.c TODO list. It is not
currently implemented in either Oracle, MSSQL or MySQL so gives
PostgreSQL an edge :-)


I see use of this in full text search as a dictionary. It's useful for
indexing and searching historical documents. Probably, better to have as
contrib.


==Usage==

Call: to_number(numerals, 'RN') or to_number(numerals, 'rn').

Example: to_number('MMMXIII', 'RN') returns 3013. to_number('xiv',
'rn') returns 14.

The function is case insensitive for the numerals. It accepts a mix of
cases and treats them the same. So  to_number ('MCI, 'rn') and
to_number ('McI', 'RN') both return 1101. The format mask must however
be either 'RN' or 'rn'. If there are other elements in the mask, those
other elements will be ignored. So to_number('MMM', 'FMRN') returns
3000.

Whitespace before the numerals is ignored.

==Validation==

The new function roman_to_int() in formatting.c performs the
conversion. It strictly validates the numerals based on the following
Roman-to-Arabic conversion rules:

1. The power-of-ten numerals (I, X, C, M) can be repeated up to three
times in a row. The beginning-with-5 numerals (V, L, D) can each
appear only once.

2. Subtraction from a power-of-ten numeral cannot occur if a
beginning-with-5 numeral appears later.

3. Subtraction cannot occur if the smaller numeral is less than a
tenth of the greater numeral (so IX is valid, but IC is invalid).

4. There cannot be two subtractions in a row.

5. A beginning-with-5 numeral cannot subtract.

If any of these rules are violated, an error is raised.

==Testing==

This has been tested on a Windows build of the master branch with
MinGW. The included regression tests positively test every value from
1 to 3999 (the Roman numeral max value) by calling the existing
to_char() function to get the Roman value, then converting it back to
an Arabic value. There are also negative tests for each invalid code
path and some positive mixed-case tests.

Documentation is updated to include this new feature.

==References==

http://sierra.nmsu.edu/morandi/coursematerials/RomanNumerals.html
Documents the strict Roman numeral standard.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-09-16 Thread Douglas Doole
I finally figured out why docs weren't building on my machine so I was able
to take a look at your doc patch too. I think it's fine.

Reading it did suggest a couple extra negative tests to confirm that when
'rn' is specified, all other elements are ignored:

select to_number('vii7', 'rn9');
select to_number('7vii', '9rn');


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-09-16 Thread Douglas Doole
Oliver, I took a look at your tests and they look thorough to me.

One recommendation, instead of having 3999 separate selects to test every
legal roman numeral, why not just do something like this:

do $$
declare
i int;
rn text;
rn_val int;
begin
for i in 1..3999 loop
rn := trim(to_char(i, 'rn'));
rn_val := to_number(rn, 'rn');
if (i <> rn_val) then
raise notice 'Mismatch: i=% rn=% rn_val=%', i, rn, rn_val;
end if;
end loop;
raise notice 'Tested roman numerals 1..3999';
end;
$$;

It's a lot easier to maintain than separate selects.


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-09-15 Thread Douglas Doole
Ah. Sorry I missed them - I'll give them a look. (Won't be able to get to
it until Saturday though.)
On Thu, Sep 14, 2017 at 10:06 PM Oliver Ford  wrote:

> I'll fix the brace, but there are two other patches in the first email for
> tests and docs. For some reason the commitfest app didn't pick them up.
>
> On Friday, 15 September 2017, Doug Doole  wrote:
>
>> The following review has been posted through the commitfest application:
>> make installcheck-world:  tested, passed
>> Implements feature:   tested, passed
>> Spec compliant:   not tested
>> Documentation:not tested
>>
>> Code looks fine, but one niggly complaint at line 146 of the patch file
>> ("while (*cp) {"). A K style brace slipped in, which doesn't match the
>> formatting of the file.
>>
>> Given that this is providing new formatting options, there should be new
>> tests added that validate the options and error handling.
>>
>> There's also the "do we want this?" debate from the discussion thread
>> that still needs to be resolved. (I don't have an opinion either way.)
>>
>> I'm sending this back to the author to address the first two issues.
>>
>> The new status of this patch is: Waiting on Author
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-09-14 Thread Oliver Ford
I'll fix the brace, but there are two other patches in the first email for
tests and docs. For some reason the commitfest app didn't pick them up.

On Friday, 15 September 2017, Doug Doole  wrote:

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   not tested
> Documentation:not tested
>
> Code looks fine, but one niggly complaint at line 146 of the patch file
> ("while (*cp) {"). A K style brace slipped in, which doesn't match the
> formatting of the file.
>
> Given that this is providing new formatting options, there should be new
> tests added that validate the options and error handling.
>
> There's also the "do we want this?" debate from the discussion thread that
> still needs to be resolved. (I don't have an opinion either way.)
>
> I'm sending this back to the author to address the first two issues.
>
> The new status of this patch is: Waiting on Author
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org
> )
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-09-14 Thread Doug Doole
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

Code looks fine, but one niggly complaint at line 146 of the patch file ("while 
(*cp) {"). A K style brace slipped in, which doesn't match the formatting of 
the file.

Given that this is providing new formatting options, there should be new tests 
added that validate the options and error handling.

There's also the "do we want this?" debate from the discussion thread that 
still needs to be resolved. (I don't have an opinion either way.)

I'm sending this back to the author to address the first two issues.

The new status of this patch is: Waiting on Author

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-08-14 Thread Merlin Moncure
On Mon, Aug 14, 2017 at 2:48 PM, Peter Eisentraut
 wrote:
> On 8/3/17 13:45, Robert Haas wrote:
>> On Thu, Aug 3, 2017 at 9:25 AM, Oliver Ford  wrote:
>>> Adds to the to_number() function the ability to convert Roman numerals
>>> to a number. This feature is on the formatting.c TODO list. It is not
>>> currently implemented in either Oracle, MSSQL or MySQL so gives
>>> PostgreSQL an edge :-)
>> I kind of put my head in my hands when I saw this.  I'm not really
>> sure it's worth complicating the code for something that has so little
>> practical utility, but maybe other people will feel differently.
>
> I can't get excited about it.  to_number() and such usually mirror the
> Oracle implementation, so having something that is explicitly not in
> Oracle goes a bit against its mission.
>
> One of the more interesting features of to_number/to_char is that it has
> a bunch of facilities for formatting decimal points, leading/trailing
> zeros, filling in spaces and signs, and so on.  None of that applies
> naturally to Roman numerals, so there isn't a strong case for including
> that into these functions, when a separate function or module could do.

Well, doesn't that also apply to scientific notation ()?

'RN' is documented as an accepted formatting string, and nowhere does
it mention that it only works for input.  So we ought to allow for it
to be fixed or at least document that it does not work.  It's nothing
but a curio obviously, but it's kind of cool IMO.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-08-14 Thread Peter Eisentraut
On 8/3/17 13:45, Robert Haas wrote:
> On Thu, Aug 3, 2017 at 9:25 AM, Oliver Ford  wrote:
>> Adds to the to_number() function the ability to convert Roman numerals
>> to a number. This feature is on the formatting.c TODO list. It is not
>> currently implemented in either Oracle, MSSQL or MySQL so gives
>> PostgreSQL an edge :-)
> I kind of put my head in my hands when I saw this.  I'm not really
> sure it's worth complicating the code for something that has so little
> practical utility, but maybe other people will feel differently.

I can't get excited about it.  to_number() and such usually mirror the
Oracle implementation, so having something that is explicitly not in
Oracle goes a bit against its mission.

One of the more interesting features of to_number/to_char is that it has
a bunch of facilities for formatting decimal points, leading/trailing
zeros, filling in spaces and signs, and so on.  None of that applies
naturally to Roman numerals, so there isn't a strong case for including
that into these functions, when a separate function or module could do.

There are probably a bunch of Perl or Python modules that can be
employed for this.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 2:54 PM, Oliver Ford  wrote:
> The formatting.c file specifies it as a TODO, so I thought implementing it
> would be worthwhile. As there is a to_roman conversion having it the other
> way is good for completeness.

Huh, didn't know about that.  Well, I'm not direly opposed to this or
anything, just not sure that it's worth spending a lot of time on.

> The existing int_to_roman code goes up to 3999 so this patch is consistent
> with that. I could extend both to handle Unicode values for large numbers?

Well, following what the existing code does is usually a good place to
start -- whether you want to try to extend it is up to you.  I'm not
very interested in Roman numeral handling personally, so you might
want to wait for some opinions from others.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-08-03 Thread Oliver Ford
On Thursday, 3 August 2017, Robert Haas  wrote:

> On Thu, Aug 3, 2017 at 9:25 AM, Oliver Ford  > wrote:
> > Adds to the to_number() function the ability to convert Roman numerals
> > to a number. This feature is on the formatting.c TODO list. It is not
> > currently implemented in either Oracle, MSSQL or MySQL so gives
> > PostgreSQL an edge :-)
>
> I kind of put my head in my hands when I saw this.  I'm not really
> sure it's worth complicating the code for something that has so little
> practical utility, but maybe other people will feel differently.  I
> can't deny the profound advantages associated with having a leg up on
> Oracle.


The formatting.c file specifies it as a TODO, so I thought implementing it
would be worthwhile. As there is a to_roman conversion having it the other
way is good for completeness.


>
> The error reporting is a little wonky, although maybe no wonkier than
> anything else about these conversion routines.
>
> rhaas=# select to_number('q', 'rn');
> ERROR:  invalid character "q"
>
> (hmm, no position)
>
> rhaas=# select to_number('dd', 'rn');
> ERROR:  invalid character "D" at position 1
>
> (now i get a position, but it's not really the right position; and the
> problem isn't really that the character is invalid but that you don't
> like me including it twice, and I said 'd' not 'D')
>
> rhaas=# select to_number('à', 'rn');
> ERROR:  invalid character "?"
>
> (eh?)
>
> How much call is there for a format that can only represent values up to
> 3999?
>
>
The existing int_to_roman code goes up to 3999 so this patch is consistent
with that. I could extend both to handle Unicode values for large numbers?


> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-08-03 Thread David Fetter
On Thu, Aug 03, 2017 at 01:45:02PM -0400, Robert Haas wrote:
> On Thu, Aug 3, 2017 at 9:25 AM, Oliver Ford  wrote:
> > Adds to the to_number() function the ability to convert Roman numerals
> > to a number. This feature is on the formatting.c TODO list. It is not
> > currently implemented in either Oracle, MSSQL or MySQL so gives
> > PostgreSQL an edge :-)
> 
> I kind of put my head in my hands when I saw this.  I'm not really
> sure it's worth complicating the code for something that has so little
> practical utility, but maybe other people will feel differently.  I
> can't deny the profound advantages associated with having a leg up on
> Oracle.
> 
> The error reporting is a little wonky, although maybe no wonkier than
> anything else about these conversion routines.
> 
> rhaas=# select to_number('q', 'rn');
> ERROR:  invalid character "q"
> 
> (hmm, no position)
> 
> rhaas=# select to_number('dd', 'rn');
> ERROR:  invalid character "D" at position 1
> 
> (now i get a position, but it's not really the right position; and the
> problem isn't really that the character is invalid but that you don't
> like me including it twice, and I said 'd' not 'D')
> 
> rhaas=# select to_number('à', 'rn');
> ERROR:  invalid character "?"
> 
> (eh?)
> 
> How much call is there for a format that can only represent values up to 3999?

There are ways to represent much larger numbers, possibly bigger than
INT_MAX.  https://en.wikipedia.org/wiki/Roman_numerals#Large_numbers
https://en.wikipedia.org/wiki/Numerals_in_Unicode#Roman_numerals

As nearly as I can tell, this patch is late by 124 days.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 9:25 AM, Oliver Ford  wrote:
> Adds to the to_number() function the ability to convert Roman numerals
> to a number. This feature is on the formatting.c TODO list. It is not
> currently implemented in either Oracle, MSSQL or MySQL so gives
> PostgreSQL an edge :-)

I kind of put my head in my hands when I saw this.  I'm not really
sure it's worth complicating the code for something that has so little
practical utility, but maybe other people will feel differently.  I
can't deny the profound advantages associated with having a leg up on
Oracle.

The error reporting is a little wonky, although maybe no wonkier than
anything else about these conversion routines.

rhaas=# select to_number('q', 'rn');
ERROR:  invalid character "q"

(hmm, no position)

rhaas=# select to_number('dd', 'rn');
ERROR:  invalid character "D" at position 1

(now i get a position, but it's not really the right position; and the
problem isn't really that the character is invalid but that you don't
like me including it twice, and I said 'd' not 'D')

rhaas=# select to_number('à', 'rn');
ERROR:  invalid character "?"

(eh?)

How much call is there for a format that can only represent values up to 3999?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers