Re: [sqlite] What does "The use of the amalgamation is recommended for all applications." mean, precisely?

2012-05-25 Thread Sidney Cadot
Hi,

> I approve of your translation.

That interpretation concerns the relative merits of using the separate
sources vs the amalgamation, if I understand correctly. Barring
special circumstances, the use of the amalgamation is the preferred
way; that is understood.

What I was wondering is whether the SQLite docs (written by you, I
presume) express a preference for using SQLite3 via inclusion of the
source (amalgamation) into ones project, vs. using a pre-compiled
library (as could be provided, for example, by a linux distribution).

Some readings of the documentation suggest that SQLite advocates
direct inclusion over using the software as a library:

   "The amalgamation contains everything you need to integrate
SQLite into a larger project. Just copy the amalgamation into your
source directory and compile it along with the other C code files in
your project. " - http://www.sqlite.org/amalgamation.html

   "The use of the amalgamation is recommended for all
applications." - http://www.sqlite.org/howtocompile.html

I wonder if that is the case. I think that using SQLite3 via library
and via source are both actively supported ways of using the library,
as evidenced by the distribution of both the amalgamated source and a
tarbal where a configure/make/make install will yield a library. The
choice should be left to the user, depending on their circumstances.

I am wondering where you stand on this.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What does "The use of the amalgamation is recommended for all applications." mean, precisely?

2012-05-24 Thread Sidney Cadot
> However, the OP has indicated little that would allow us to guess whether
> his project should follow the norm or not.

I think my question is independent of my particular project; in fact,
I am not working on an SQLite project at the moment.

The reason I asked this question is that I have a discussion about
what this particular phrase means, with a friend and fellow
programmer. The two interpretations I propose reflect our different
readings of that particular statement.

I do feel that a statement that is on a generic help page should be
unambiguous, and not allow different interpretations. Is it known who
wrote this particular page, and does he perhaps follow this mailing
list? I'd be curious to ask him about the intended meaning. Perhaps
the wording as given could be made more explicit.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What does "The use of the amalgamation is recommended for all applications." mean, precisely?

2012-05-24 Thread Sidney Cadot
> Why risk ending up with an unexpected (possibly old) version
> by linking at runtime just to save users less than 300K of disk
> space?

But that's an argument against shared linking in general.

I am just curious what idea this particular statement on this
particular help-page (specific to SQLite) is trying to convey.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What does "The use of the amalgamation is recommended for all applications." mean, precisely?

2012-05-24 Thread Sidney Cadot
> Yes.  SQLite is so small there's really no reason to make a separate library 
> of it.

Well, my Linux distribution may provide a "libsqlite3-dev" package,
which makes linking to a recent version of sqlite as simple as adding
LDLIBS=-lsqlite3 to the Makefile. By going that path you ensure that
re-making the package will link against newer versions of sqlite as
they come available, at zero effort. That is a reason.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] What does "The use of the amalgamation is recommended for all applications." mean, precisely?

2012-05-24 Thread Sidney Cadot
Dear all,

On the support page http://www.sqlite.org/howtocompile.html, it says:

"The use of the amalgamation is recommended for all applications."

Is this a general recommendation, to use the amalgamated source file
as the preferred way of including SQLite functionality in one's
application, rather than using a separately compiled library?

Or should I read this as a recommendation just in case I need to
compile SQLite from source, and need to decide between using the
amalgamation or using the individual source files?


Kind regards,
  Sidney
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-07 Thread Sidney Cadot
Hi Jay

>  No, this is basic SQL order of operations.

You are right, that first approach I tried was definitely a brainfart.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-07 Thread Sidney Cadot
> Please someone show me what is the correct value of this avg() in practice.

There are a number of answers to this.

To the level of precision that you specified, all answer are
completely fine; the error is, in all cases, very small relative to
the variance of your input data.

It is an interesting exercise, though. It would be a good idea to sort
the input data on absolute magnitude:

SELECT AVG(data) FROM TryAvg ORDER BY ABS(data) DESC;

But apparently, SQLite ignores the ORDER BY clause because of the
AVG() function, perhaps -erroneously- assuming AVG() is commutative.

This works, but isn't guaranteed to work:

SELECT AVG(data) FROM (SELEFT data FROM TryAvg ORDER BY ABS(data) DESC);

Funnily enough, in Postgres, the test seems to expose an actual bug,
or at least a strange error message:

sidney=# select AVG(data) from TryAvg ORDER BY data DESC;
ERROR:  column "tryavg.data" must appear in the GROUP BY clause or be
used in an aggregate function
LINE 1: select AVG(data) from TryAvg ORDER BY data DESC;
  ^


Now, to address your actual point. The fact that we cannot control
evaluation order in SQL does mean that one has to be wary in case
one's data is badly conditioned. The problem is in SQL: it assumes
commutativity for aggregate functions, and that is a property that no
floating point format conceived can deliver. This is an interesting
fact (as is the fact that, as a consequence of this, identical
invocations of aggregate functions can yield different results under
seemingly identical circumstances in SQL), but I disagree that this
disqualifies using SQL for use in a scientific setting. There are
preciously little types of measurements where one has to calculate
with 16 orders of magnitude as your example does, and in such
circumstances, the scientist needs to be exceedingly on his or her
guard to know what she is doing in ANY language, with ANY storage
mechanism. I think therefore that this example, interesting as it is,
does not provide a conclusive argument in the discussion at hand.

-S
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-07 Thread Sidney Cadot
Hi Jay,

> One should never assume a database uses IEEE 754, so one should never
> assume it uses similar semantics.

One should not assume it unless it is documented, of course. Postgres, for 
example, half-heartedly embraces IEEE-754 'on platforms that use it' (see 
section 8.1.3 of its manual). It documents the fact that +/- infinity and NaN 
are useable on such systems.

> Even those databases that do use
> IEEE 754 for a select few of their types have other considerations.
> In the bigger picture, IEEE 754 makes up, at most, a small part of
> the SQL numeric environment.

For SQL: yes. For SQLite though, it is the only game in town.

> Using 754 as a reference for the rest of the environment strikes me as
> poorly thought out and putting the tail before the dog.

In terms of generic SQL you may be right (although I'd be willing to argue for 
it). However, I think that for a specific DB product, it is a good thing to 
document without ambiguity what the properties and guarantees of the numeric 
types and operations are; and IEEE-754 is the only game in town when it comes 
to properly specified floating point numbers.

I feel this is especially true for the light-weight database system that is 
SQLite. I get the impression that you are advocating to keep floating point 
operations deliberately vague and underspecified (please correct me if I am 
wrong). To me as a developer that is useless; I will never be able to reason 
about the correctness of anything, and I am effectively dependent on the 
(undocumented) effort that the makers of the FP implementation did. 
Effectively, that would be a return to the pre-IEEE 754 wild west of floating 
point calculations.

> [...] Its express purpose was to allow non-technical people to write
> queries and build business applications.

That may have been the optimistic idea 40 years ago, but I think it is time to 
admit that this was completely misguided. If 40 years of relational database 
experience has taught us anything, it is that doing proper SQL (anything beyond 
a basic SELECT) is an actual skill that requires technical prowess.

> [...] This is what most high-level scripting languages like Perl and Python 
> do.

Perl and Python support NaNs and infinities just fine.

> If you want bare metal IEEE 754 for your scientific computing
> application, then you might want to rethink doing your math operations
> in a data storage system.

You are making it sound as if proper support for IEEE-754 types would open up 
some can of worms for regular users, but I really don't see why you think that 
is the case. They would see an occasional "NaN" instead of NULL if they did 
something naughty; I personally think that is a lot more informative.

Compare currently:

sqlite> SELECT 0.0/0.0, 1.0/0.0;
|
sqlite> 

... versus what I would like to see:

sqlite> SELECT 0.0/0.0, 1.0/0.0;
NaN|Inf
sqlite> 

> As you've pointed out, SQLite is more than capable of storing and retrieving 
> non-numeric IEEE 754 values

No, it doesn't support storing and retrieving NaNs. That is an arbitrary limit 
that bites those of us who actually know what they are doing.

Sidney


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-07 Thread Sidney Cadot

On Jun 6, 2011, at 21:55, Jean-Christophe Deschamps wrote:

> You have a DOUBLE column where you need to store NaN?  Go ahead and 
> store 'NaN' in offending rows.

This cannot be done. They will be turned into NULL.

Sidney

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-06 Thread Sidney Cadot
>> > You have a DOUBLE column where you need to store NaN? ?Go ahead and
>> > store 'NaN' in offending rows.
>>
>> You mean, as a string?
>
>  No, by binding the raw value using the C interfaces as any
>  respectable program would do.

But then I'd lose the ability to actually use those values in
computations. Also, I'm not particularly fond of SQLite's
heterogeneous columns, so I'd like to avoid them. Especially when a
much better solution is staring us in the face.

>  Yes, well, it's been pretty clearly determined that SQLite doesn't
>  deal with floating point numbers for your definition of "as it should."

Great, I am glad that my point got across.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-06 Thread Sidney Cadot
> Given that there are
> many, many SQLite3 applications, it is really not possible to say,
> with a straight face anyways, that no applications would break.

That is true. I would certainly not advocate changing the default behavior.

However the 'once we make a mistake, we can't fix it' idea cannot be
maintained eternially, and I presume there will have to come a day
where sqlite will have a compatibility-breaking upgrade (sililar to
Python2 -> Python3) ... Sqlite has accumulated a handful of warts over
the years, and that would be a good time to fix them.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-06 Thread Sidney Cadot
On Mon, Jun 6, 2011 at 9:55 PM, Jean-Christophe Deschamps
 wrote:

> You have a DOUBLE column where you need to store NaN?  Go ahead and
> store 'NaN' in offending rows.

You mean, as a string? That's rather a dirty hack. Also, it doesn't
work as it should:


sqlite> SELECT 1.0 + 'NaN';
1.0
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-06 Thread Sidney Cadot
Hi Jay,

> One should never assume a database uses IEEE 754, so one should never
> assume it uses similar semantics.

One should not assume it unless it is documented, of course. Postgres,
for example, half-heartedly embraces IEEE-754 'on platforms that use
it' (see section 8.1.3 of its manual). It documents the fact that +/-
infinity and NaN are useable on such systems.

> Even those databases that do use
> IEEE 754 for a select few of their types have other considerations.
> In the bigger picture, IEEE 754 makes up, at most, a small part of
> the SQL numeric environment.

For SQL: yes. For SQLite though, it is the only option.

> Using 754 as a reference for the rest of the environment strikes me as
> poorly thought out and putting the tail before the dog.

In terms of generic SQL you may be right (although I'd be willing to
argue for it). However, I think that for a specific DB product, it is
a good thing to document without ambiguity what the properties and
guarantees of the numeric types and operations are; and IEEE-754 is
the only game in town when it comes to properly specified floating
point numbers.

I feel this is especially true for the light-weight database system
that is SQLite. I get the impression that you are advocating to keep
floating point operations deliberately vague and underspecified
(please correct me if I am wrong). To me as a developer that is
useless; I will never be able to reason about the correctness of
anything, and I am effectively dependent on the (undocumented) effort
that the makers of the FP implementation did. Effectively, that would
be a return to the pre-IEEE 754 wild west of floating point
calculations.

> [...] Its express purpose was to allow non-technical people to write
> queries and build business applications.

That may have been the optimistic idea 40 years ago, but I think it is
time to admit that this was completely misguided. If 40 years of
relational database experience has taught us anything, it is that
doing proper SQL (anything beyond a basic SELECT) is an actual skill
that requires technical prowess.

> [...] This is what most high-level scripting languages like Perl and Python 
> do.

Perl and Python support NaNs and infinities just fine.

> If you want bare metal IEEE 754 for your scientific computing
> application, then you might want to rethink doing your math operations
> in a data storage system.

You are making it sound as if proper support for IEEE-754 types would
open up some can of worms for regular users, but I really don't see
why you think that is the case. They would see an occasional "NaN"
instead of NULL if they did something naughty; I personally think that
is a lot more informative.

Compare currently:

sqlite> SELECT 0.0/0.0, 1.0/0.0;
|
sqlite>

... versus what I would like to see:

sqlite> SELECT 0.0/0.0, 1.0/0.0;
NaN|Inf
sqlite>

> As you've pointed out, SQLite is more than capable of storing and retrieving 
> non-numeric IEEE 754 values

No, it doesn't support storing and retrieving NaNs. That is an
arbitrary limit that bites those of us who actually know what they are
doing.

Sidney
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-06 Thread Sidney Cadot
> Ah.  In that case, I /would/ argue that this is bad, and that SQLite should 
> conform to the standard.

That is true, although it is rather unfortunate that the standard
makes this statement, IMHO.

Unfortunately, this doesn't address the point of whether it should be
possible to use NaNs as floating point values, and to store them in
tables. As indicated previously, most databases allow it (at least all
modern ones seem to), and only SQLite makes a positive effort to
handle it as a NULL, introducing (IMHO) strange semantics. I have seen
no convincing argument being put up to defend that design choice, so
far.

Would it be useful to open a ticket on this issue, or will it never be
changed e.g. for fear of breaking backward compatibility?

In any case, I think it would be useful to spend a paragraph or two in
the documentation on the issue of floating point semantics.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-06 Thread Sidney Cadot
> It'd be OK for NaN to map to NULL, but not for infinity, since there
> is a distinction between positive and negative infinity, and that
> distinction is valuable.

The NaN value in IEEE-754 is also not unique. There is the distinction
between signaling and quiet NaNs, and furthermore mantissa bits can
(and sometimes are) used to carry a tag, conveying information as to
the source of a problem.

Also, I don't particularly see how it is ok for NaN to map to NULL;
they are conceptually quite different. If I may draw an analogy,
mapping NaN to NULL is pretty much the same as mapping the empty
string to NULL.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-05 Thread Sidney Cadot
Hi Simon,

> But you were using a SQL command to make the match.

Well, I was using it to demonstrate some behavior I observed, yes. I
was not "matching values with NULL". But whatever.

> You executed a SELECT command and got an answer from SQL.  That answer does 
> not mean
> "The result of the calculation '1.0 / 0.0' is the value NULL."

That's a bit of a philosophical point. In the context of SQLite, the
expression 1.0 / 0.0 evaluates to something that is representable,
namely NULL, that much is clear. Whether NULL is to be considered a
value is up for debate. It is a bit weird to be able to represent
"that what cannot be represented", so I'd take the IMHO simpler
interpretation of just accepting NULL as a value of any non NOT NULL
domain.

> because you asked SQL, not a maths library.  In this case, it's unknown.  
> Because SQLite doesn't know how to do that calculation.

Accepting that, my question becomes: why does SQLite elect to not know
what to do? The handling of NaN results is special cased in the code
(see http://www.sqlite.org/cvstrac/chngview?cn=5066). Why not just
accept the existence of NaN as a valid floating point value? Postgres,
Mysql, and (from a quick google) Oracle and DB2 do this. SQLite is
really the odd one out, here.

Sidney
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-05 Thread Sidney Cadot
Hi Simon,

> Jay is talking about SQL.  SQL /does/ use NULL for 'unknown'.

Well yes, it does, but my entire point is that floating point NaN is
quite different from "Unknown".

SQLite sort-of unifies NaN and NULL (although this isn't documented).
However, this is not an SQL choice -- it is an implementation choice.
PostgreSQL and MySQL, for example, handle NaN values as floating point
values, quite distinctly from NULL values, e.g. in Postgres:

# SELECT CAST('NaN' AS DOUBLE PRECISION);
 float8

NaN
(1 row)

> And the OP was trying to match values with NULL.

Actually, no, I was asking how SQlite behaves with respect to IEEE-754
floating point.

Sidney
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-05 Thread Sidney Cadot
Hi Simon,

> "Null is a special marker used in Structured Query Language (SQL) to indicate 
> that a data value does not exist in the database."

To me, this statement does not apply to "NaN", which is a perfectly
fine (albeit unusual) floating point value.

> If you compare anything with NULL, you will get NULL as a result, even if the 
> thing you're comparing isn't NULL.

This depends on how you compare:

SELECT NULL = NULL; --> NULL

SELECT NULL IS NULL; --> 1.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-05 Thread Sidney Cadot
Hi Jay,

>  However, it is worth remembering that IEEE 754 is really about building
>  processors, not about end-user interaction.  While it is a rigid,
>  formal specification of a numeric environment, at its heart it is
>  about mechanics, not about consistent mathematical systems built on
>  theorems and proofs.

While it is a bit off-topic, I disagree with this assessment. The 754
standard has been carefully crafted to allow rigorous statements about
the stability of numerical algorithms. It chooses deliberately to
sacrifice easy hardware implementation in favor of well-defined
semantics. In fact, it took quite some time before compliant hardware
implementations were available after its inception.

>  As others have pointed out, one of the meanings of NULL is essentially 
> "unknown."

Yes, but in terms of IEEE-754, there exist no "unknown" results. Any
of the elementary operations (+ - * /) has a fully bitwise predictable
result (subject to the rounding mode).

>> * SELECT 1.0 / 0.0 yields NULL (where I would expect to see Inf)
>
>  NULL is returned any time 0 or 0.0 is seen on the right side of a
>  divide expression.

Is that behavior mandated behavior by any of the SQL standards, or is
that an implementation choice by SQLite?

>  Given NULL to mean "unknown", this makes a lot more sense.  This is
>  a prime example of the difference between IEEE 754 and a "real world"
>  numeric environment.  Anyone that knows a bit of math isn't going to
>  expect 2.0/0.0 and 1.0/0.0 to yield different answers.

In IEEE-754, they don't. Both return +infinity. You may be thinking of
0.0 / 0.0, which does return NaN in IEEE-754.

For what it is worth: I know a bit of math, and I actually expect
floating point operations to follow the IEEE-754 mandated behavior
nowadays. Unless the SQL standard mandates specific behavior, of
course; in the context of SQLite, that would clearly take precedence
over IEEE-754. Unfortunately I do not know any of the SQL standards
nearly as well as IEEE-754.

As a matter of principle, I think it is not good practice to give
"this is what people expect in the real world" precedence over a
well-defined, rigorous standard. For one thing, it is quite debatable
what people expect in the real world. For another thing, rather smart
people have thought real hard to make IEEE-754 semantically
consistent; overriding such deliberations with gut feelings about how
things should behave doesn't sound like a good idea to me.

> But NULL makes a lot of sense in the proscribed environment, and is much more 
>consistent with the rest of SQL's operators.

That is a respectable position to take on the issue. However, it
limits the usefulness of SQLite for storing scientific data (which is
what I am trying to use it for, currently). Here, there is a clear and
useful distinction between a "missing value" and a NaN -- the latter
meaning "a calculation was done but it failed in any of the
standard-prescribed ways". The difference is subtle but real. Many,
probably most, languages used for scientific computation distinguish
between those two concepts.

Lastly: if SQLite chooses to unify the floating-point concept of NaN
and the SQL concept of NULL, I feel it should do the same for +/-
infinity. As it stands, it seems to incorporate one concept of
IEEE-754 (+/- infinity) while omitting the NaN. My personal feeling is
that a clear choice should be made. But more importantly: the behavior
needs to be documented. The SQLite docs are silent on the issue of
floating point behavior, as far as I can tell.


Sidney
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-05 Thread Sidney Cadot
> Note that according to SQL semantics, 'NULL' means 'I don't know'.

I am not quite sure what you are saying. I am pretty sure that NULL is
not defined so informally ... :)

> So every value of all types matches with it.

I don't understand what "matches with" means in this context, sorry.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Handling of IEEE-754 nan and +/-inf in SQLite?

2011-06-05 Thread Sidney Cadot
Hi all,

Is the way in which SQLite handlesNaN and Infinity values as defined
by IEEE-754 documented somewhere? I would also be interested to find a
discussion of the rationale behind the design decisions.

After some experimenting, it appears that ...

* SELECT 1.0 / 0.0 yields NULL (where I would expect to see Inf)
* SELECT 1e1 yields an actual IEEE-754 infinity, and it can be
stored in a table
* SELECT 1e1 + 1e1 yields Infinity, as expected according to IEEE-754.
* SELECT 1e1 - 1e1 yields NULL, where I would have expected to
see NaN in accordance with IEEE-754.

Regards, Sidney
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users