Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-05 Thread Ian Zimmerman
On 2019-04-03 11:44, Warren Young wrote:

> As for the non-ASCII characters, they're UTF-8, which is the de facto
> standard character set on the Internet since around the time of The
> Bubble.  Ignoring the embedded world, I can't think of an in-support
> OS that doesn't have built-in support for UTF-8.  The only place I'd
> caution against using such characters is in printf() output and such,
> and then only because the Windows Console defaults to UTF-16LE.

The LANG and LC_* environment variables exist for a reason.

-- 
Please don't Cc: me privately on mailing lists and Usenet,
if you also post the followup to the list or newsgroup.
To reply privately _only_ on Usenet and on broken lists
which rewrite From, fetch the TXT record for no-use.mooo.com.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-05 Thread Richard Damon
On 4/5/19 9:50 AM, Joshua Wise wrote:
> Julian dates are definitely floating point numbers, not integers.

Julian dates, if being used to represent a time on a given date would be
a floating point number. A Julian date, if only needing to express a
time to the precision of a whole Day, could be represented as an integer
(with some rule to define which of the likely integers you would chose,
due to the natural fuzziness of plain dates).

Thus Apr 5, 2019 could be expressed as an integer (and in fact to
express it as a float requires adding a lot of detail about the date,
verse maybe 1 bit of information)

-- 
Richard Damon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-05 Thread Joshua Wise
Julian dates are definitely floating point numbers, not integers.

On Thu, Apr 4, 2019, 3:37 PM James K. Lowden 
wrote:

> On Thu, 4 Apr 2019 11:21:41 -0400
> Joshua Wise  wrote:
>
> > > On the other hand, what table has a floating point number in its
> > > key?
> > >
> > > How do you even express the value of such a key for an exact
> > > match?
> >
> > Well I imagine it can be very useful for range queries. Imagine
> > Julian dates, coordinate points, rankings, etc.
>
> Julian dates are integers.  The tm structure is all integers, too.
>
> I suppose you could store lat/lon as floating point.  It's exactly the
> kind of data that calls out of a tm-like structure, though, because
> officially there are 60 minutes in a degree, and 60 seconds in a minute.
> Just as with time, the governing authorities use a non-decimal
> notation; decimal fractions of a degree are mere computational
> convienience.  And, again, it's not part of the key.
>
> In financial analysis, range queries over large datasets are common.  If
> it's not a range of dates, it's a range of
> returns/price/earning/capitalization over time.  Yet Microsoft SQL
> Server never suggested we use anything other than IEEE to store the
> data.  Perhaps that's because, more often than not, floating point data
> are manipulated as part of the query.
>
> If you're joining the table to itself to select price change over time
> to compute, say, variance, the absolute magnitude of the data are
> uninteresting.  You find the stocks by date, subtract the prices and
> compute the variance, in IEEE format, of course, because that's what
> the CPU supports.  Then you sort and filter the top quintile, or
> whatever.  In such a case, the overhead of floating-point conversion
> will be significant: twice for every row, overhead that is nonexistent
> today.
>
> I'm skeptical of the claimed advantage.  The downside is clear.  If the
> advantage can be shown, its use would be specialized.  OTOH, a
> compiete BCD implementation would be ... interesting.
>
> --jkl
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-05 Thread Lifepillar
Minor correction:

> On 5 Apr 2019, at 09:52, Lifepillar  wrote:
> 
> select decStr(decAdd(a,60)), case dec(b) when dec(c) then 1 else 0 end from 
> t1;
> select decStr(decAdd(a,70)), case dec(c) when dec(b) then 1 else 0 end from 
> t1;
> […]
> select count(*), count(dec(b)), decStr(decSum(b)), decStr(decAvg(b)), 
> decStr(decMin(b)), decStr(decMax(b)) from t1;

should be:

select decStr(decAdd(a,60)), case b when c then 1 else 0 end from t1;
select decStr(decAdd(a,70)), case c when b then 1 else 0 end from t1;
[…]
select count(*), count(b), decStr(decSum(b)), decStr(decAvg(b)), 
decStr(decMin(b)), decStr(decMax(b)) from t1;

Life.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-05 Thread Lifepillar
On 5 Apr 2019, at 00:18, Simon Slavin  wrote:
> 
> On 4 Apr 2019, at 10:12pm, Lifepillar  wrote:
> 
>> This is essentially a pragmatic choice, as the semantics of NULLs is 
>> unspecified and ambiguous.
> 
> The way SQL handles NULLs may sometimes appear inconsistent, but is the 
> result of SQL handling rows as sets.  Some of the behaviour is, carefully 
> designed, with the knowledge that the NULL that results from one step of an 
> expression is going to be fed into the next step of the expression.

I beg to disagree. NULL's handling in SQL *is* inconsistent. The SQL committee 
has taken some actions to make things... less worse, in particular by adding 
F571 ("is true", "is false", "is unknown"), but the castle still remains (and 
will likely forever remain) a castle of sand.

> For those purposes, NULL means one of two things: "no value" or "value 
> unknown”.

There is a third option: “no information” (value may exist and be unknown, or 
it may not exist). And before someone asks: no, there is no need for going meta 
(“don’t know whether there is no information…”).

>  For instance, the sum of a result which includes a NULL value is NULL.

Note that this is incompatible with the “value unknown” semantics (noticed in 
*1977* by Grant!). Trivial example:

select * from T where x = x;

If x contains NULLs, under the “value unknown” semantics the above query should 
still output all the rows, because an existing value, albeit unknown, is 
always, surely, equal to itself. Under the “inapplicable” semantics SQL's 
result is correct, though, and under the “no information” semantics it may be 
considered correct assuming that you want certain answers (i.e., only tuples 
that satisfy the condition with certainty). You see that NULL conflate 
different things and leaves the user the burden to deal with it.

>  But the sum of a result with zero rows is not NULL, it's zero.

So, SQL is wrong there, because it returns NULL (my library returns 0). Note 
that an empty table has no NULLs, so the result of any query or function on an 
empty set has nothing to do with NULLs.

I am fond of NULLs (as much as I try to avoid them in my databases), so if you 
feel like it is interesting discussing this any further (and deemed appropriate 
for the list), we may do so in a separate thread.

> For the purposes of a decimal extension to SQLite, I would imitate what 
> SQLite3 does with REAL values.  If you have a question about how SQLite sees 
> NULL it might be answered here:
> 
> 
> 
> It's especially important that the test script at the end of that page, 
> behaves the same if you run it as is, and if you substitute the 'int' type 
> with your decimal type.  If you have any questions, please don't hesitate to 
> ask.§

I have tried the script, adapted for decimals, and it works the same as for 
ints. The script is reported at the end of this message. Btw, talking about 
consistency:

create table t(n int);
insert into t values (1), (null);
select distinct n from t; -- Result has two rows
select count(distinct n) from t; -- 1

:)
Thanks for the feedback!
Life.

-- Create a test table with data
create table t1(a blob, b blob, c blob);
insert into t1 values (dec(1), dec(0), dec(0));
insert into t1 values (dec(2), dec(0), dec(1));
insert into t1 values (dec(3), dec(1), dec(0));
insert into t1 values (dec(4), dec(1), dec(1));
insert into t1 values (dec(5), null, dec(0));
insert into t1 values (dec(6), null, dec(1));
insert into t1 values (dec(7), null, null);
select decStr(a), decStr(b), decstr(c) from t1;

-- Check to see what CASE does with NULLs in its test expressions
select decStr(a), case when b<>dec(0) then 1 else 0 end from t1;
select decStr(decAdd(a,10)), case when not b<>dec(0) then 1 else 0 end from t1;
select decStr(decAdd(a,20)), case when b<>dec(0) and c<>dec(0) then 1 else 0 
end from t1;
select decStr(decAdd(a,30)), case when not (b<>dec(0) and c<>dec(0)) then 1 
else 0 end from t1;
select decStr(decAdd(a,40)), case when b<>dec(0) or c<>dec(0) then 1 else 0 end 
from t1;
select decStr(decAdd(a,50)), case when not (b<>dec(0) or c<>dec(0)) then 1 else 
0 end from t1;
select decStr(decAdd(a,60)), case dec(b) when dec(c) then 1 else 0 end from t1;
select decStr(decAdd(a,70)), case dec(c) when dec(b) then 1 else 0 end from t1;

-- What happens when you multiple a NULL by zero?
select decStr(decAdd(a,80)), decStr(decMul(b,0)) from t1;
select decStr(decAdd(a,90)), decStr(decMul(b,c)) from t1;

-- What happens to NULL for other operators?
select decStr(decAdd(a,100)), decStr(decAdd(b,c)) from t1;

-- Test the treatment of aggregate operators
select count(*), count(dec(b)), decStr(decSum(b)), decStr(decAvg(b)), 
decStr(decMin(b)), decStr(decMax(b)) from t1;

-- Check the behavior of NULLs in WHERE clauses
select decStr(decAdd(a,110)) from t1 where bdec(10);
select decStr(decAdd(a,130)) from t1 where bhttp://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-04 Thread Simon Slavin
On 4 Apr 2019, at 10:12pm, Lifepillar  wrote:

> This is essentially a pragmatic choice, as the semantics of NULLs is 
> unspecified and ambiguous.

The way SQL handles NULLs may sometimes appear inconsistent, but is the result 
of SQL handling rows as sets.  Some of the behaviour is, carefully designed, 
with the knowledge that the NULL that results from one step of an expression is 
going to be fed into the next step of the expression.

For those purposes, NULL means one of two things: "no value" or "value 
unknown".  For instance, the sum of a result which includes a NULL value is 
NULL.  But the sum of a result with zero rows is not NULL, it's zero.

For the purposes of a decimal extension to SQLite, I would imitate what SQLite3 
does with REAL values.  If you have a question about how SQLite sees NULL it 
might be answered here:



It's especially important that the test script at the end of that page, behaves 
the same if you run it as is, and if you substitute the 'int' type with your 
decimal type.  If you have any questions, please don't hesitate to ask.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-04 Thread Lifepillar
On 4 Apr 2019, at 21:36, James K. Lowden  wrote:
> 
> On Thu, 4 Apr 2019 17:30:29 +0200
> Lifepillar  wrote:
> 
>> On 4 Apr 2019, at 17:15, James K. Lowden 
>> wrote:
>>> On Wed, 3 Apr 2019 14:30:52 +0200
>>> Lifepillar  wrote:
 SQLite3 Decimal is an extension implementing exact decimal
 arithmetic for SQLite3. 
>>> 
>>> What does divide-by-zero yield?  
>> 
>> By default:
>> 
>> sqlite> select decDiv(1,0);
>> Error: Division by zero
>> 
>> You have the option to ignore the error, though, in which case you
>> get +Inf:
>> 
>> sqlite> delete from decTraps where flag = 'Division by zero';
>> sqlite> select decStr(decDiv(1,0));
>> Infinity
>> 
>> decTraps is a virtual table containing a list of flags which, when
>> set by some function, raise an error. The flags are from IEEE 754.
> 
> Nice.  Division by zero can be a problem in other DBMSs because, if it
> occurs, the query aborts, and you then have to search for the cases and
> decide what to do. By flagging and marking them, you make that task
> easier.  Especially when the query runs for a long time.

Thanks for the positive feedback!

>> NULLs are avoided where other results make sense. For instance:
>> 
>> sqlite> create table T (n blob);
>> sqlite> select decStr(decSum(n)) from T; -- Sum of an empty set of
>> sqlite> values
>> 0
>> sqlite> select decStr(decAvg(n)) from T; -- Avg. of empty set of
>> sqlite> values
>> NaN
> 
> Here, as you may know, you're in close touch with SQL's ambivalent
> ambiguity with NULL.  Should the sum of nulls be 0?  Should the count
> be 0?  Just one example of how SQL is a fossil from the 1980s.  

The examples above use an empty table, so no NULLs are present and there
is no ambiguity. In the presence of NULLs I have decided to go with SQL:
decimal aggregate functions simply ignore them and other functions, such
as decAdd() (addition), return NULL if any of their arguments is NULL
(which is consistent with SQL behaviour for + and similar operators).

This is essentially a pragmatic choice, as the semantics of NULLs is
unspecified and ambiguous. And I don’t think that it matters much, after
all: if you care about exact arithmetic (e.g., for accounting), you
should definitely avoid NULLs, at least in any numeric fields.

Life.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-04 Thread James K. Lowden
On Thu, 4 Apr 2019 11:21:41 -0400
Joshua Wise  wrote:

> > On the other hand, what table has a floating point number in its
> > key?  
> > 
> > How do you even express the value of such a key for an exact
> > match?  
> 
> Well I imagine it can be very useful for range queries. Imagine
> Julian dates, coordinate points, rankings, etc.

Julian dates are integers.  The tm structure is all integers, too.  

I suppose you could store lat/lon as floating point.  It's exactly the
kind of data that calls out of a tm-like structure, though, because
officially there are 60 minutes in a degree, and 60 seconds in a minute.
Just as with time, the governing authorities use a non-decimal
notation; decimal fractions of a degree are mere computational
convienience.  And, again, it's not part of the key.  

In financial analysis, range queries over large datasets are common.  If
it's not a range of dates, it's a range of
returns/price/earning/capitalization over time.  Yet Microsoft SQL
Server never suggested we use anything other than IEEE to store the
data.  Perhaps that's because, more often than not, floating point data
are manipulated as part of the query.  

If you're joining the table to itself to select price change over time
to compute, say, variance, the absolute magnitude of the data are
uninteresting.  You find the stocks by date, subtract the prices and
compute the variance, in IEEE format, of course, because that's what
the CPU supports.  Then you sort and filter the top quintile, or
whatever.  In such a case, the overhead of floating-point conversion
will be significant: twice for every row, overhead that is nonexistent
today.  

I'm skeptical of the claimed advantage.  The downside is clear.  If the
advantage can be shown, its use would be specialized.  OTOH, a
compiete BCD implementation would be ... interesting.  

--jkl

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-04 Thread James K. Lowden
On Thu, 4 Apr 2019 17:30:29 +0200
Lifepillar  wrote:

> On 4 Apr 2019, at 17:15, James K. Lowden 
> wrote:
> > On Wed, 3 Apr 2019 14:30:52 +0200
> > Lifepillar  wrote:
> >> SQLite3 Decimal is an extension implementing exact decimal
> >> arithmetic for SQLite3. 
> > 
> > What does divide-by-zero yield?  
> 
> By default:
> 
> sqlite> select decDiv(1,0);
> Error: Division by zero
> 
> You have the option to ignore the error, though, in which case you
> get +Inf:
> 
> sqlite> delete from decTraps where flag = 'Division by zero';
> sqlite> select decStr(decDiv(1,0));
> Infinity
> 
> decTraps is a virtual table containing a list of flags which, when
> set by some function, raise an error. The flags are from IEEE 754.

Nice.  Division by zero can be a problem in other DBMSs because, if it
occurs, the query aborts, and you then have to search for the cases and
decide what to do. By flagging and marking them, you make that task
easier.  Especially when the query runs for a long time.

> NULLs are avoided where other results make sense. For instance:
> 
> sqlite> create table T (n blob);
> sqlite> select decStr(decSum(n)) from T; -- Sum of an empty set of
> sqlite> values
> 0
> sqlite> select decStr(decAvg(n)) from T; -- Avg. of empty set of
> sqlite> values
> NaN

Here, as you may know, you're in close touch with SQL's ambivalent
ambiguity with NULL.  Should the sum of nulls be 0?  Should the count
be 0?  Just one example of how SQL is a fossil from the 1980s.  

--jkl
 

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-04 Thread Lifepillar
On 4 Apr 2019, at 17:30, Lifepillar  wrote:
> 
>> You have the option to ignore the error, though, in which case you get +Inf:
> 
> sqlite> delete from decTraps where flag = 'Division by zero';
> sqlite> select decStr(decDiv(1,0));
> Infinity

Forgot to mention that in this case a flag is silently added to another virtual 
table called decStatus:

sqlite> select * from decStatus;
Division by zero

so you may still query for errors. You have to manually clear the status then:

delete from decStatus; -- Clear the status flags

Life.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-04 Thread Lifepillar
On 4 Apr 2019, at 17:15, James K. Lowden  wrote:
> 
> On Wed, 3 Apr 2019 14:30:52 +0200
> Lifepillar  wrote:
> 
> 
>> SQLite3 Decimal is an extension implementing exact decimal arithmetic
>> for SQLite3. It is currently unfinished and under development.  
> ...
>> I welcome any feedback, from the super-technical to the
>> end-user oriented. There is no manual so far, but the code is mostly
>> documented. 
> 
> What does divide-by-zero yield?  

By default:

sqlite> select decDiv(1,0);
Error: Division by zero

You have the option to ignore the error, though, in which case you get +Inf:

sqlite> delete from decTraps where flag = 'Division by zero';
sqlite> select decStr(decDiv(1,0));
Infinity

decTraps is a virtual table containing a list of flags which, when set by some 
function, raise an error. The flags are from IEEE 754.

> If NULL, no amount of exactitude will matter.  If the library is based
> on math, on the other hand, that would be a boon to SQLite users.  

NULLs are avoided where other results make sense. For instance:

sqlite> create table T (n blob);
sqlite> select decStr(decSum(n)) from T; -- Sum of an empty set of values
0
sqlite> select decStr(decAvg(n)) from T; -- Avg. of empty set of values
NaN

Life.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-04 Thread Joshua Wise
> On the other hand, what table has a floating point number in its key?  
> 
> How do you even express the value of such a key for an exact match?  

Well I imagine it can be very useful for range queries. Imagine Julian dates, 
coordinate points, rankings, etc.

I suppose in the most common case, parsing the on-disk format to IEEE would 
only be necessary on a small number of rows, after the range query has already 
used memcmp() to find the rows it cares about. But yeah, in the case of bulk 
loading it’s probably only a loss.

> On Apr 4, 2019, at 11:15 AM, James K. Lowden  wrote:
> 
> On Wed, 3 Apr 2019 17:29:47 -0400
> Richard Hipp  wrote:
> 
>> On 4/3/19, Joshua Wise  wrote:
>>> From my naive understanding, memcmp() is used to efficiently
>>> compare long strings of bytes. But where in SQLite3 is it necessary
>>> to compare long strings of floating point numbers? I, of course,
>>> can imagine SQL queries plucking single floating point values from
>>> rows or indexes, but I can?t imagine where the long strings would
>>> be. Could you enlighten me?
>> 
>> Comparing keys in a btree search uses a lot of CPU cycles.  If the
>> comparison can be done using memcmp() rather than some custom
>> function, the comparison goes much faster, which makes searching
>> btrees faster.
> 
> On the other hand, what table has a floating point number in its key?  
> 
> How do you even express the value of such a key for an exact match?  
> 
> There is also a significant cost of converting to and from IEEE
> format.  In my experience bulk-loading quantitative databases, I
> encountered many occasions in which parsing the input accounted for 50%
> of the computation.  The limit wasn't network bandwidth or server
> speed, it was data conversion.  
> 
> --jkl
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-04 Thread James K. Lowden
On Wed, 3 Apr 2019 14:30:52 +0200
Lifepillar  wrote:


> SQLite3 Decimal is an extension implementing exact decimal arithmetic
> for SQLite3. It is currently unfinished and under development.  
...
> I welcome any feedback, from the super-technical to the
> end-user oriented. There is no manual so far, but the code is mostly
> documented. 

What does divide-by-zero yield?  

If NULL, no amount of exactitude will matter.  If the library is based
on math, on the other hand, that would be a boon to SQLite users.  

--jkl


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-04 Thread James K. Lowden
On Wed, 3 Apr 2019 11:37:47 -0600
Warren Young  wrote:

> Put another way, your defaults are already so large that no
> conceivable physical entity could build a computer big enough to
> simultaneously contain every distinct state your data type represents.

Exactly (as it were).  

Physical entities cannot be measured to more than 6 orders of
magnitude.  Finer than that, error takes over.  

What is the distance from New York to Los Angeles?  

We know it's 3,944 km.  Do we know it's 3,944,000 meters?  3,944,000,000
mm? We do not.  

--jkl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-04 Thread James K. Lowden
On Wed, 3 Apr 2019 17:29:47 -0400
Richard Hipp  wrote:

> On 4/3/19, Joshua Wise  wrote:
> > From my naive understanding, memcmp() is used to efficiently
> > compare long strings of bytes. But where in SQLite3 is it necessary
> > to compare long strings of floating point numbers? I, of course,
> > can imagine SQL queries plucking single floating point values from
> > rows or indexes, but I can?t imagine where the long strings would
> > be. Could you enlighten me?
> 
> Comparing keys in a btree search uses a lot of CPU cycles.  If the
> comparison can be done using memcmp() rather than some custom
> function, the comparison goes much faster, which makes searching
> btrees faster.

On the other hand, what table has a floating point number in its key?  

How do you even express the value of such a key for an exact match?  

There is also a significant cost of converting to and from IEEE
format.  In my experience bulk-loading quantitative databases, I
encountered many occasions in which parsing the input accounted for 50%
of the computation.  The limit wasn't network bandwidth or server
speed, it was data conversion.  

--jkl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-04 Thread Lifepillar
On 4 Apr 2019, at 10:37, Thomas Kurz  wrote:
> 
> I appreciate your effort towards this extension. In my opinion, however, this 
> is (along with bigint-support) a feature that belongs into core (for that 
> reason alone to get math operations, comparisons, aggregates, etc. working in 
> an intuitive way).
> 
> Years ago, for SQLite4, there seem to have been plans for "decimal math". The 
> concept looked very useful and I'd suggest taking that idea up again, but 
> this time for SQLite3 ;-)

You raise an interesting point. From a purely technical perspective, you are 
absolutely right: ui and performance would only benefit from full integration 
into SQLite. But an extension may be maintained independent of the main 
project’s goals, constraints, people and directions. Patching SQLite requires, 
IMO, coordination with SQLite’s developers. AFAIK, SQLite4 is a dead end: is 
there still any interest, or maybe a plan, for “decimal math” in the core?

Life.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-04 Thread Thomas Kurz
I appreciate your effort towards this extension. In my opinion, however, this 
is (along with bigint-support) a feature that belongs into core (for that 
reason alone to get math operations, comparisons, aggregates, etc. working in 
an intuitive way).

Years ago, for SQLite4, there seem to have been plans for "decimal math". The 
concept looked very useful and I'd suggest taking that idea up again, but this 
time for SQLite3 ;-)




- Original Message - 
From: Lifepillar 
To: sqlite-users@mailinglists.sqlite.org 
Sent: Wednesday, April 3, 2019, 14:30:52
Subject: [sqlite] ANN: SQLite3 Decimal Extension

[I hope that this kind of announcement is not off-topic here]

SQLite3 Decimal is an extension implementing exact decimal arithmetic
for SQLite3. It is currently unfinished and under development.  At this
point anything, including the user interface and the internals, is
subject to change. I am publishing it early because I am seeking to get
as much feedback as possible to "get it right".

The extension is currently using the excellent decNumber library, but it
does not aim to become an IEEE 754 conforming implementation. It is also
totally unrelated to SQL decimal/numeric types.

Decimals are stored as blobs and the storage format is a minor variant
of decimalInfinite (https://arxiv.org/abs/1506.01598). That is a totally
ordered encoding, so decimals can be compared directly (memcmp()) and
also indexed (does SQLite support indexes on blobs?).

[Here, I must thank Dr. Hipp, with whom I had a brief email exchange
severals moons ago, who convinced me that the IEEE 754 encoding was not
an ideal storage format for databases]

By default, the precision is limited to 39 digits and exponents must be
in the range [-99,999,999,+99,999,999] (for some mathematical
operations, the exponent must not exceed 99,999 in absolute value). Such
parameters may be configured at compile time and also changed at
runtime. Any integer or fractional number satisfying such requirements
can be manipulated. Note that the on-disk representation is *not*
subject to such limits and can accommodate arbitrarily small or
arbitrarily large decimals. With the defaults just mentioned, a decimal
occupies between 1 and 24 bytes on disk (plus any overhead that blobs
may add).

As I have said, I welcome any feedback, from the super-technical to the
end-user oriented. There is no manual so far, but the code is mostly
documented. You may find a sample session in the repository's home page.

Also, this is my first public Fossil repository: if you have any
suggestions on how I should improve its configuration, let me know.

So, here for the adventurous ones:

Official repository:

https://chiselapp.com/user/lifepillar/repository/sqlite3decimal

Git mirror (which exists only for testing `fossil git export`...):

https://github.com/lifepillar/sqlite3decimal-mirror

Enjoy,
Life.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Joshua Wise
Oh of course, that makes sense. I suppose that means querying on REAL indexes 
should be slower than querying on INTEGER indexes, in the current SQLite3 
implementation? Has a benchmark of this ever been done?


> On Apr 3, 2019, at 5:29 PM, Richard Hipp  wrote:
> 
> On 4/3/19, Joshua Wise  wrote:
>> From my naive understanding, memcmp() is used to efficiently compare long
>> strings of bytes. But where in SQLite3 is it necessary to compare long
>> strings of floating point numbers? I, of course, can imagine SQL queries
>> plucking single floating point values from rows or indexes, but I can’t
>> imagine where the long strings would be. Could you enlighten me?
> 
> Comparing keys in a btree search uses a lot of CPU cycles.  If the
> comparison can be done using memcmp() rather than some custom
> function, the comparison goes much faster, which makes searching
> btrees faster.
> 
> -- 
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Richard Hipp
On 4/3/19, Joshua Wise  wrote:
> From my naive understanding, memcmp() is used to efficiently compare long
> strings of bytes. But where in SQLite3 is it necessary to compare long
> strings of floating point numbers? I, of course, can imagine SQL queries
> plucking single floating point values from rows or indexes, but I can’t
> imagine where the long strings would be. Could you enlighten me?

Comparing keys in a btree search uses a lot of CPU cycles.  If the
comparison can be done using memcmp() rather than some custom
function, the comparison goes much faster, which makes searching
btrees faster.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Joshua Wise
From my naive understanding, memcmp() is used to efficiently compare long 
strings of bytes. But where in SQLite3 is it necessary to compare long strings 
of floating point numbers? I, of course, can imagine SQL queries plucking 
single floating point values from rows or indexes, but I can’t imagine where 
the long strings would be. Could you enlighten me?


> On Apr 3, 2019, at 3:23 PM, Lifepillar  wrote:
> 
> 
> 
>> On 3 Apr 2019, at 20:04, Joshua Thomas Wise  
>> wrote:
>> 
>>> [Here, I must thank Dr. Hipp, with whom I had a brief email exchange
>>> severals moons ago, who convinced me that the IEEE 754 encoding was not
>>> an ideal storage format for databases]
>> 
>> I’m curious, what were the reasons behind Dr. Hipp’s opinion on this?
> 
> At the time of SQLite4, I wrote to him asking why he had dismissed IEEE
> 754 as a storage format in favor of a custom encoding. His answer was
> that he wanted comparisons to be performed using memcmp(), which IEEE
> 754 does not allow. 
> 
> There may have been other reasons (complexity, range, ...), but that one
> stuck with me and prompted me to start searching for order-preserving
> encodings.
> 
> Life.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Lifepillar
On 3 Apr 2019, at 19:37, Warren Young  wrote:
> 
> On Apr 3, 2019, at 6:30 AM, Lifepillar  wrote:
>> 
>> By default, the precision is limited to 39 digits and exponents must be
>> in the range [-99,999,999,+99,999,999] (for some mathematical
>> operations, the exponent must not exceed 99,999 in absolute value).
> 
> That’s enough. :)

Should you need more, the storage format would not need to be changed!

>> Also, this is my first public Fossil repository: if you have any
>> suggestions on how I should improve its configuration, let me know.
> 
> The CRLF line endings in the decNumber directory can cause problems if you 
> ever cause a merge conflict.
> 
> Fossil has no problems doing merges on arbitrary data, but when it writes out 
> the conflict-resolution files to help the user manually fix a merge conflict, 
> it does so with the assumption that you’re using LF-only line endings, 
> causing stray CRs in the output.

Thanks, I did not know that.

> Unless you really need CRLF for some reason, it’s best to convert those line 
> endings.  That’ll also avoid the need to set the Fossil crlf-glob setting.

So far I have left line endings as I have found them, but I will
proceed to convert them following your advice.


>> Git mirror (which exists only for testing `fossil git export`...):
>> 
>>   https://github.com/lifepillar/sqlite3decimal-mirror
> 
> I wouldn’t say “only.”  If nothing else, it’s advertising for your project, 
> and it’ll make it more likely to show up in web searches.

Actually, I aimed at targeted advertising in this mailing list :) There
was some discussion about big numbers in the past, so I thought my
project might be interesting for some people here.

And perhaps I should have added that I welcome feedback at Chisel
(tickets for now: I haven't configured the permissions for the forum
yet), if you wish to discuss the project in more detail.

Life.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Lifepillar


> On 3 Apr 2019, at 20:04, Joshua Thomas Wise  
> wrote:
> 
>> [Here, I must thank Dr. Hipp, with whom I had a brief email exchange
>> severals moons ago, who convinced me that the IEEE 754 encoding was not
>> an ideal storage format for databases]
> 
> I’m curious, what were the reasons behind Dr. Hipp’s opinion on this?

At the time of SQLite4, I wrote to him asking why he had dismissed IEEE
754 as a storage format in favor of a custom encoding. His answer was
that he wanted comparisons to be performed using memcmp(), which IEEE
754 does not allow. 

There may have been other reasons (complexity, range, ...), but that one
stuck with me and prompted me to start searching for order-preserving
encodings.

Life.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Joshua Thomas Wise
> [Here, I must thank Dr. Hipp, with whom I had a brief email exchange
> severals moons ago, who convinced me that the IEEE 754 encoding was not
> an ideal storage format for databases]


I’m curious, what were the reasons behind Dr. Hipp’s opinion on this?

> On Apr 3, 2019, at 1:56 PM, Simon Slavin  wrote:
> 
> On 3 Apr 2019, at 6:51pm, Warren Young  wrote:
> 
>> On Apr 3, 2019, at 6:30 AM, Lifepillar  wrote:
>> 
>>> does SQLite support indexes on blobs?
>> 
>> It claims to:
> 
> Indeed.  Be careful to verify whether, from the perspective of your 
> programming language, it considers the first or the last byte to be most 
> significant.  I've seen people caught out by a similar issue.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Simon Slavin
On 3 Apr 2019, at 6:51pm, Warren Young  wrote:

> On Apr 3, 2019, at 6:30 AM, Lifepillar  wrote:
> 
>> does SQLite support indexes on blobs?
> 
> It claims to:

Indeed.  Be careful to verify whether, from the perspective of your programming 
language, it considers the first or the last byte to be most significant.  I've 
seen people caught out by a similar issue.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Warren Young
On Apr 3, 2019, at 6:30 AM, Lifepillar  wrote:
> 
> does SQLite support indexes on blobs?

It claims to:

$ sqlite3 x.db
SQLite version 3.26.0 2018-12-01 12:34:55
Enter ".help" for usage hints.
sqlite> create table x (a blob);
sqlite> create index xi on x(a);
sqlite> explain query plan select a from x where a=5;
QUERY PLAN
`--SEARCH TABLE x USING COVERING INDEX xi (a=?)
sqlite> 

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Simon Slavin
On 3 Apr 2019, at 6:37pm, Warren Young  wrote:

> I once calculated that 30 digits was enough to give a precise location to 
> every particle the size of a grain of sand or larger in the known universe.  
> You just need to define as many columns in your database as is needed for 
> your physics: 3 for normal 3-space, 4 for spacetime, 10, 11, or 26 for string 
> theory…

The first 39 digits of pi allow you to do maths on the size of the universe in 
units the width of a hydrogen atom.  Let's add one extra digit to deal with 
rounding errors.  Pi may have more than 40 digits, but the others have no 
practical use in this universe.

(Yes, you might consider them to be theoretically useful for things like number 
theory or testing computer components which do certain kinds of maths.)
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Warren Young
On Apr 3, 2019, at 7:05 AM, Dominique Devienne  wrote:
> 
> Seems like you're using non-ascii chars in your source code, notably
> the #pragma mark.

I think you’ve got two nits there, not one.

As for the non-ASCII characters, they’re UTF-8, which is the de facto standard 
character set on the Internet since around the time of The Bubble.  Ignoring 
the embedded world, I can’t think of an in-support OS that doesn’t have 
built-in support for UTF-8.  The only place I’d caution against using such 
characters is in printf() output and such, and then only because the Windows 
Console defaults to UTF-16LE.

As for the nonstandard #pragmas, ISO C99 has you covered there.  Thus saith 
§6.10.6 of #pragmas not of the STDC form: ”Any such pragma that is not 
recognized by the implementation is ignored.”  In other words, if a compiler 
that doesn’t understand #pragma mark complains about it or refuses to compile 
code using it, it’s non-conforming.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Warren Young
On Apr 3, 2019, at 6:30 AM, Lifepillar  wrote:
> 
> By default, the precision is limited to 39 digits and exponents must be
> in the range [-99,999,999,+99,999,999] (for some mathematical
> operations, the exponent must not exceed 99,999 in absolute value).

That’s enough. :)

I once calculated that 30 digits was enough to give a precise location to every 
particle the size of a grain of sand or larger in the known universe.  You just 
need to define as many columns in your database as is needed for your physics: 
3 for normal 3-space, 4 for spacetime, 10, 11, or 26 for string theory…

9 extra digits gets you down to the “large molecule” scale.

5+ digits worth of negative exponents is enough to get down to the sub-atomic 
scale.

Put another way, your defaults are already so large that no conceivable 
physical entity could build a computer big enough to simultaneously contain 
every distinct state your data type represents.  It’ll do. :)

I did the original calculation because someone was complaining that MySQL’s 
default — at the time — of 30 digits to the left of the decimal point wasn’t 
enough for their application.  MySQL’s decimal type then allows another 30 
digits to the *right* of the decimal point.  The original poster in that thread 
remained silent when challenged to divulge what application required more 
precision than that.

(The current MySQL limit default for the DECIMAL type is 65 total digits, with 
a configurable number of digits the right of the decimal, up to 30.)

> Also, this is my first public Fossil repository: if you have any
> suggestions on how I should improve its configuration, let me know.

The CRLF line endings in the decNumber directory can cause problems if you ever 
cause a merge conflict.

Fossil has no problems doing merges on arbitrary data, but when it writes out 
the conflict-resolution files to help the user manually fix a merge conflict, 
it does so with the assumption that you’re using LF-only line endings, causing 
stray CRs in the output.

Unless you really need CRLF for some reason, it’s best to convert those line 
endings.  That’ll also avoid the need to set the Fossil crlf-glob setting.

> Git mirror (which exists only for testing `fossil git export`...):
> 
>https://github.com/lifepillar/sqlite3decimal-mirror

I wouldn’t say “only.”  If nothing else, it’s advertising for your project, and 
it’ll make it more likely to show up in web searches.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Lifepillar
> > SQLite3 Decimal is an extension implementing exact decimal arithmetic 
> > for SQLite3. It is currently unfinished and under development. 
> 
> 
> I'm curious, what was your motivation for doing this? 
> Use cases envisioned for its use? 

Mainly financial applications. Beancount’s author
(http://furius.ca/beancount/) has a series of documents arguing convincingly
about the need for a library like this.

> > Decimals are stored as blobs and the storage format is a minor variant 
> > of decimalInfinite (https://arxiv.org/abs/1506.01598). 
> 
> Does it use some kind of magic cookie? 

No.

> How can you distinguish your encoding from an arbitrary blob? 

I don't. In fact, you may pass any blob, but if it doesn’t have the right
format, it results in a decoding error (which, currently, may not be very
well tolerated by the library-I have to check that).

> Unfortunately, SQLite does not support UDTs, but it did grow in recent 
> months 
> the https://www.sqlite.org/c3ref/value_subtype.html API, which plays a 
> poor-man's 
> substitute when call chaining. 

Thanks, I didn’t know about subtypes. I’ll read about them, although, at a
quick glance, I am not sure how they might be used to enforce type
checking in my case.

> That OTOH has no influence on how the value will be *stored in tables*, 
> unfortunately, so you cannot have alternate representation of the value, 
> one optimized for storage, another for processing (in call chains), so maybe 
> it's all moot, mentioning this. 
> 
> That is a totally 
> > ordered encoding, so decimals can be compared directly (memcmp()) and 
> > also indexed (does SQLite support indexes on blobs?). 
> > 
> 
> Yes it does. 

Great!

> > As I have said, I welcome any feedback, from the super-technical to the 
> > end-user oriented. There is no manual so far, but the code is mostly 
> > documented. You may find a sample session in the repository's home page. 
> > 
> 
> From an End-User POV, the call-chained-functions are "heavy looking”.

Yes, I agree. Keep in mind that the current interface is most likely bound
to be changed.

> So maybe a printf-like API that parses a now-looking math expression 
> and replaces its placeholders with the var-arg supplied arguments to the 
> function 
> would make the "SQL" much more readable. 
> 
> Since you cannot extend the regular arithmetic expressions in SQLite to act 
> on your UDTs 
> differently than the built-in processing, that would be the next best thing 
> IMHO. 

I will think about that!

> All in all, it looks very well done. You're obviously a very competent 
> programmer IMHO :). 

It looks so, because I stand on the shoulder of giants ;)

> Thanks, --DD 
> 
> PS: Seems like you're using non-ascii chars in your source code, notably 
> the #pragma mark. 
>   That's pretty, but asking for trouble IMHO, and likely to cause issues 
> for cross-platform support. 

Yes, that and a few other things need to be polished. My initial
idea was to keep this project for myself, so I took a liberal approach
style-wise. But it has grown more than I expected...

Thanks for the feedback,
Life.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ANN: SQLite3 Decimal Extension

2019-04-03 Thread Dominique Devienne
On Wed, Apr 3, 2019 at 2:31 PM Lifepillar  wrote:

> [I hope that this kind of announcement is not off-topic here]
>

Not at all, IMHO.


> SQLite3 Decimal is an extension implementing exact decimal arithmetic
> for SQLite3. It is currently unfinished and under development.


I'm curious, what was your motivation for doing this?
Use cases envisioned for its use?


> Decimals are stored as blobs and the storage format is a minor variant
> of decimalInfinite (https://arxiv.org/abs/1506.01598).


Does it use some kind of magic cookie?
How can you distinguish your encoding from an arbitrary blob?

Unfortunately, SQLite does not support UDTs, but it did grow in recent
months
the https://www.sqlite.org/c3ref/value_subtype.html API, which plays a
poor-man's
substitute when call chaining.

That OTOH has no influence on how the value will be *stored in tables*,
unfortunately, so you cannot have alternate representation of the value,
one optimized for storage, another for processing (in call chains), so maybe
it's all moot, mentioning this.

That is a totally
> ordered encoding, so decimals can be compared directly (memcmp()) and
> also indexed (does SQLite support indexes on blobs?).
>

Yes it does.


> As I have said, I welcome any feedback, from the super-technical to the
> end-user oriented. There is no manual so far, but the code is mostly
> documented. You may find a sample session in the repository's home page.
>

From an End-User POV, the call-chained-functions are "heavy looking".
So maybe a printf-like API that parses a now-looking math expression
and replaces its placeholders with the var-arg supplied arguments to the
function
would make the "SQL" much more readable.

Since you cannot extend the regular arithmetic expressions in SQLite to act
on your UDTs
differently than the built-in processing, that would be the next best thing
IMHO.

All in all, it looks very well done. You're obviously a very competent
programmer IMHO :).

Thanks, --DD

PS: Seems like you're using non-ascii chars in your source code, notably
the #pragma mark.
  That's pretty, but asking for trouble IMHO, and likely to cause issues
for cross-platform support.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users