Re: Implicit Casts for Arithmetic Operators

2018-10-12 Thread Benedict Elliott Smith
I think we do, implicitly, support precision and scale - only dynamically.  The 
precision and scale are defined by the value on insertion, i.e. those necessary 
to represent it exactly.  During arithmetic operations we currently truncate to 
decimal128, but we can (and probably should) change this.   Ideally, we would 
support explicit precision/scale in the declared type, but our current 
behaviour is not inconsistent with introducing this later.

FTR, I wasn’t suggesting the spec required the most approximate type, but that 
the most consistent rule to describe this behaviour is that the approximate 
type always wins.  Somebody earlier justified this by the fact that one operand 
is already truncated to this level of approximation, so why would you want more 
accuracy in the result type?

I would be comfortable with either, fwiw, and they are both consistent with the 
spec.  It’s great if we can have a consistent idea behind why we do things 
though, so it seems at least worth briefly discussing this extra weirdness.
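
(If it helps, the two candidate rules as a toy resolution function - the enum and names are hypothetical, nothing to do with the actual CQL type hierarchy:)

enum NumKind { EXACT, FLOAT, DOUBLE }

public class ReturnTypeRules {
    // Rule A: an approximate operand always wins, and the *most* approximate
    // of two approximate operands wins too (so float + double = float).
    static NumKind mostApproximateWins(NumKind a, NumKind b) {
        if (a == NumKind.EXACT) return b;
        if (b == NumKind.EXACT) return a;
        return (a == NumKind.FLOAT || b == NumKind.FLOAT) ? NumKind.FLOAT : NumKind.DOUBLE;
    }

    // Rule B: any approximate operand upgrades the result to double.
    static NumKind alwaysDouble(NumKind a, NumKind b) {
        return (a == NumKind.EXACT && b == NumKind.EXACT) ? NumKind.EXACT : NumKind.DOUBLE;
    }

    public static void main(String[] args) {
        System.out.println(mostApproximateWins(NumKind.FLOAT, NumKind.DOUBLE)); // FLOAT
        System.out.println(alwaysDouble(NumKind.FLOAT, NumKind.EXACT));         // DOUBLE
    }
}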





> On 12 Oct 2018, at 18:10, Ariel Weisberg  wrote:
> 
> Hi,
> 
> From reading the spec. Precision is always implementation defined. The spec 
> specifies scale in several cases, but never precision for any type or 
> operation (addition/subtraction, multiplication, division).
> 
> So we don't implement anything remotely approaching precision and scale in 
> CQL when it comes to numbers I think? So we aren't going to follow the spec 
> for scale. We are already pretty far down that road so I would leave it 
> alone. 
> 
> I don't think the spec is asking for the most approximate type. It's just 
> saying the result is approximate, and the precision is implementation 
> defined. We could return either float or double. I think if one of the 
> operands is a double we should return a double because clearly the schema 
> thought a double was required to represent that number. I would also be in 
> favor of returning a double all the time so that people can expect a 
> consistent type from expressions involving approximate numbers.
> 
> I am a big fan of widening for arithmetic expressions in a database to avoid 
> having to error on overflow. You can go to the trouble of only widening the 
> minimum amount, but I think it's simpler if we always widen to bigint and 
> double. This would be something the spec allows.
> 
> Definitely if we can make overflow not occur we should and the spec allows 
> that. We should also not return different types for the same operand types 
> just to work around overflow if we detect we need more precision.
> 
> Ariel
> On Fri, Oct 12, 2018, at 12:45 PM, Benedict Elliott Smith wrote:
>> If it’s in the SQL spec, I’m fairly convinced.  Thanks for digging this 
>> out (and Mike for getting some empirical examples).
>> 
>> We still have to decide on the approximate data type to return; right 
>> now, we have float+bigint=double, but float+int=float.  I think this is 
>> fairly inconsistent, and either the approximate type should always win, 
>> or we should always upgrade to double for mixed operands.
>> 
>> The quoted spec also suggests that decimal+float=float, and 
>> decimal+double=double, whereas we currently have decimal+float=decimal, and 
>> decimal+double=decimal.
>> 
>> If we’re going to go with an approximate operand implying an approximate 
>> result, I think we should do it consistently (and consistent with the 
>> SQL92 spec), and have the type of the approximate operand always be the 
>> return type.
>> 
>> This would still leave a decision for float+double, though.  The most 
>> consistent behaviour with that stated above would be to always take the 
>> most approximate type to return (i.e. float), but this would seem to me 
>> to be fairly unexpected for the user.
>> 
>> 
>>> On 12 Oct 2018, at 17:23, Ariel Weisberg  wrote:
>>> 
>>> Hi,
>>> 
>>> I agree with what's been said about expectations regarding expressions 
>>> involving floating point numbers. I think that if one of the inputs is 
>>> approximate then the result should be approximate.
>>> 
>>> One thing we could look at for inspiration is the SQL spec. Not to follow 
>>> dogmatically necessarily.
>>> 
>>> From the SQL 92 spec regarding assignment 
>>> http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt section 4.6:
>>> "
>>>Values of the data types NUMERIC, DECIMAL, INTEGER, SMALLINT,
>>>FLOAT, REAL, and DOUBLE PRECISION are numbers and are all mutually
>>>comparable and mutually assignable. If an assignment would result
>>>in a loss of the most significant digits, an exception condition
>>>is raised. If least significant digits are lost, implementation-
>>>defined rounding or truncating occurs with no exception condition
>>>being raised. The rules for arithmetic are generally governed by
>>>Subclause 6.12, "<numeric value expression>".
>>> "
>>> 
>>> Section 6.12 numeric value expressions:
>>> "
>>>1) If the data type of both operands 

Re: Implicit Casts for Arithmetic Operators

2018-10-12 Thread Ariel Weisberg
Hi,

From reading the spec. Precision is always implementation defined. The spec 
specifies scale in several cases, but never precision for any type or 
operation (addition/subtraction, multiplication, division).

So we don't implement anything remotely approaching precision and scale in CQL 
when it comes to numbers I think? So we aren't going to follow the spec for 
scale. We are already pretty far down that road so I would leave it alone. 

I don't think the spec is asking for the most approximate type. It's just 
saying the result is approximate, and the precision is implementation defined. 
We could return either float or double. I think if one of the operands is a 
double we should return a double because clearly the schema thought a double 
was required to represent that number. I would also be in favor of returning a 
double all the time so that people can expect a consistent type from 
expressions involving approximate numbers.

I am a big fan of widening for arithmetic expressions in a database to avoid 
having to error on overflow. You can go to the trouble of only widening the 
minimum amount, but I think it's simpler if we always widen to bigint and 
double. This would be something the spec allows.
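
(A trivial illustration of why widening sidesteps the overflow question - plain Java, just to make the point concrete:)

public class WideningSketch {
    public static void main(String[] args) {
        int a = Integer.MAX_VALUE;
        int b = 1;

        // Same-width arithmetic either wraps silently...
        System.out.println(a + b);                 // -2147483648
        // ...or we have to detect the overflow and error out.
        try {
            Math.addExact(a, b);
        } catch (ArithmeticException e) {
            System.out.println("integer overflow");
        }

        // Always widening to bigint/double means the result simply fits.
        System.out.println((long) a + b);          // 2147483648
    }
}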

Definitely if we can make overflow not occur we should and the spec allows 
that. We should also not return different types for the same operand types just 
to work around overflow if we detect we need more precision.

Ariel
On Fri, Oct 12, 2018, at 12:45 PM, Benedict Elliott Smith wrote:
> If it’s in the SQL spec, I’m fairly convinced.  Thanks for digging this 
> out (and Mike for getting some empirical examples).
> 
> We still have to decide on the approximate data type to return; right 
> now, we have float+bigint=double, but float+int=float.  I think this is 
> fairly inconsistent, and either the approximate type should always win, 
> or we should always upgrade to double for mixed operands.
> 
> The quoted spec also suggests that decimal+float=float, and 
> decimal+double=double, whereas we currently have decimal+float=decimal, and 
> decimal+double=decimal.
> 
> If we’re going to go with an approximate operand implying an approximate 
> result, I think we should do it consistently (and consistent with the 
> SQL92 spec), and have the type of the approximate operand always be the 
> return type.
> 
> This would still leave a decision for float+double, though.  The most 
> consistent behaviour with that stated above would be to always take the 
> most approximate type to return (i.e. float), but this would seem to me 
> to be fairly unexpected for the user.
> 
> 
> > On 12 Oct 2018, at 17:23, Ariel Weisberg  wrote:
> > 
> > Hi,
> > 
> > I agree with what's been said about expectations regarding expressions 
> > involving floating point numbers. I think that if one of the inputs is 
> > approximate then the result should be approximate.
> > 
> > One thing we could look at for inspiration is the SQL spec. Not to follow 
> > dogmatically necessarily.
> > 
> > From the SQL 92 spec regarding assignment 
> > http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt section 4.6:
> > "
> > Values of the data types NUMERIC, DECIMAL, INTEGER, SMALLINT,
> > FLOAT, REAL, and DOUBLE PRECISION are numbers and are all mutually
> > comparable and mutually assignable. If an assignment would result
> > in a loss of the most significant digits, an exception condition
> > is raised. If least significant digits are lost, implementation-
> > defined rounding or truncating occurs with no exception condition
> > being raised. The rules for arithmetic are generally governed by
> > Subclause 6.12, "<numeric value expression>".
> > "
> > 
> > Section 6.12 numeric value expressions:
> > "
> > 1) If the data type of both operands of a dyadic arithmetic opera-
> >tor is exact numeric, then the data type of the result is exact
> >numeric, with precision and scale determined as follows:
> > ...
> > 2) If the data type of either operand of a dyadic arithmetic op-
> >erator is approximate numeric, then the data type of the re-
> >sult is approximate numeric. The precision of the result is
> >implementation-defined.
> > "
> > 
> > And this makes sense to me. I think we should only return an exact result 
> > if both of the inputs are exact.
> > 
> > I think we might want to look closely at the SQL spec and especially when 
> > the spec requires an error to be generated. Those are sometimes in the spec 
> > to prevent subtle paths to wrong answers. Any time we deviate from the spec 
> > we should be asking why is it in the spec and why are we deviating.
> > 
> > Another issue besides overflow handling is how we determine precision and 
> > scale for expressions involving two exact types.
> > 
> > Ariel
> > 
> > On Fri, Oct 12, 2018, at 11:51 AM, Michael Burman wrote:
> >> Hi,
> >> 
> >> I'm not sure if I would 

Re: Implicit Casts for Arithmetic Operators

2018-10-12 Thread Benedict Elliott Smith
If it’s in the SQL spec, I’m fairly convinced.  Thanks for digging this out 
(and Mike for getting some empirical examples).

We still have to decide on the approximate data type to return; right now, we 
have float+bigint=double, but float+int=float.  I think this is fairly 
inconsistent, and either the approximate type should always win, or we should 
always upgrade to double for mixed operands.

The quoted spec also suggests that decimal+float=float, and 
decimal+double=double, whereas we currently have decimal+float=decimal, and 
decimal+double=decimal.

If we’re going to go with an approximate operand implying an approximate 
result, I think we should do it consistently (and consistent with the SQL92 
spec), and have the type of the approximate operand always be the return type.

This would still leave a decision for float+double, though.  The most 
consistent behaviour with that stated above would be to always take the most 
approximate type to return (i.e. float), but this would seem to me to be fairly 
unexpected for the user.


> On 12 Oct 2018, at 17:23, Ariel Weisberg  wrote:
> 
> Hi,
> 
> I agree with what's been said about expectations regarding expressions 
> involving floating point numbers. I think that if one of the inputs is 
> approximate then the result should be approximate.
> 
> One thing we could look at for inspiration is the SQL spec. Not to follow 
> dogmatically necessarily.
> 
> From the SQL 92 spec regarding assignment 
> http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt section 4.6:
> "
> Values of the data types NUMERIC, DECIMAL, INTEGER, SMALLINT,
> FLOAT, REAL, and DOUBLE PRECISION are numbers and are all mutually
> comparable and mutually assignable. If an assignment would result
> in a loss of the most significant digits, an exception condition
> is raised. If least significant digits are lost, implementation-
> defined rounding or truncating occurs with no exception condition
> being raised. The rules for arithmetic are generally governed by
> Subclause 6.12, "<numeric value expression>".
> "
> 
> Section 6.12 numeric value expressions:
> "
> 1) If the data type of both operands of a dyadic arithmetic opera-
>tor is exact numeric, then the data type of the result is exact
>numeric, with precision and scale determined as follows:
> ...
> 2) If the data type of either operand of a dyadic arithmetic op-
>erator is approximate numeric, then the data type of the re-
>sult is approximate numeric. The precision of the result is
>implementation-defined.
> "
> 
> And this makes sense to me. I think we should only return an exact result if 
> both of the inputs are exact.
> 
> I think we might want to look closely at the SQL spec and especially when the 
> spec requires an error to be generated. Those are sometimes in the spec to 
> prevent subtle paths to wrong answers. Any time we deviate from the spec we 
> should be asking why is it in the spec and why are we deviating.
> 
> Another issue besides overflow handling is how we determine precision and 
> scale for expressions involving two exact types.
> 
> Ariel
> 
> On Fri, Oct 12, 2018, at 11:51 AM, Michael Burman wrote:
>> Hi,
>> 
>> I'm not sure if I would prefer the Postgres way of doing things, which is
>> returning just about any type depending on the order of operators,
>> especially considering the docs mention that using numeric/decimal is
>> slow and, multiple times, that floating points are inexact. So doing
>> some math with Postgres (9.6.5):
>> 
>> SELECT 2147483647::bigint*1.0::double precision returns double
>> precision 2147483647
>> SELECT 2147483647::bigint*1.0 returns numeric 2147483647.0
>> SELECT 2147483647::bigint*1.0::real returns double
>> SELECT 2147483647::double precision*1::bigint returns double 2147483647
>> SELECT 2147483647::double precision*1.0::bigint returns double 2147483647
>> 
>> With + and - we can get the same mixture of returned types. There's
>> no difference in those calculations, just some casting. To me,
>> floating-point math indicates inexactness and error, and whoever mixes
>> two different types should understand that. If one didn't want an exact
>> numeric type, why would the server return one? The floating point value
>> itself could be wrong already before the calculation - trying to say we do
>> it lossless is just wrong.
>> 
>> Fun with 2.65:
>> 
>> SELECT 2.65::real * 1::int returns double 2.6509536743
>> SELECT 2.65::double precision * 1::int returns double 2.65
>> 
>> SELECT round(2.65) returns numeric 4
>> SELECT round(2.65::double precision) returns double 4
>> 
>> SELECT 2.65 * 1 returns double 2.65
>> SELECT 2.65 * 1::bigint returns numeric 2.65
>> SELECT 2.65 * 1.0 returns numeric 2.650
>> SELECT 2.65 * 1.0::double precision returns double 2.65
>> 
>> SELECT round(2.65) * 1 returns numeric 3
>> SELECT round(2.65) * round(1) 

Re: Implicit Casts for Arithmetic Operators

2018-10-12 Thread Ariel Weisberg
Hi,

I agree with what's been said about expectations regarding expressions 
involving floating point numbers. I think that if one of the inputs is 
approximate then the result should be approximate.

One thing we could look at for inspiration is the SQL spec. Not to follow 
dogmatically necessarily.

From the SQL 92 spec regarding assignment 
http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt section 4.6:
"
 Values of the data types NUMERIC, DECIMAL, INTEGER, SMALLINT,
 FLOAT, REAL, and DOUBLE PRECISION are numbers and are all mutually
 comparable and mutually assignable. If an assignment would result
 in a loss of the most significant digits, an exception condition
 is raised. If least significant digits are lost, implementation-
 defined rounding or truncating occurs with no exception condition
 being raised. The rules for arithmetic are generally governed by
 Subclause 6.12, "<numeric value expression>".
"

Section 6.12 numeric value expressions:
"
 1) If the data type of both operands of a dyadic arithmetic opera-
tor is exact numeric, then the data type of the result is exact
numeric, with precision and scale determined as follows:
...
 2) If the data type of either operand of a dyadic arithmetic op-
erator is approximate numeric, then the data type of the re-
sult is approximate numeric. The precision of the result is
implementation-defined.
"

And this makes sense to me. I think we should only return an exact result if 
both of the inputs are exact.

I think we might want to look closely at the SQL spec and especially when the 
spec requires an error to be generated. Those are sometimes in the spec to 
prevent subtle paths to wrong answers. Any time we deviate from the spec we 
should be asking why is it in the spec and why are we deviating.

Another issue besides overflow handling is how we determine precision and scale 
for expressions involving two exact types.
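
(For reference, if I'm reading 6.12 right, the exact-type scale rules there - max(s1, s2) for addition/subtraction, s1 + s2 for multiplication, implementation-defined for division - happen to line up with what java.math.BigDecimal already does. A quick sketch, purely for illustration:)

import java.math.BigDecimal;

public class ExactScaleSketch {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("12.345");   // precision 5, scale 3
        BigDecimal b = new BigDecimal("0.5");      // precision 1, scale 1

        System.out.println(a.add(b));              // 12.845  (scale = max(3, 1) = 3)
        System.out.println(a.multiply(b));         // 6.1725  (scale = 3 + 1 = 4)
        System.out.println(a.divide(b));           // 24.69   (exact here; otherwise
                                                   //          implementation-defined)
    }
}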

Ariel

On Fri, Oct 12, 2018, at 11:51 AM, Michael Burman wrote:
> Hi,
> 
> I'm not sure if I would prefer the Postgres way of doing things, which is
> returning just about any type depending on the order of operators,
> especially considering the docs mention that using numeric/decimal is
> slow and, multiple times, that floating points are inexact. So doing
> some math with Postgres (9.6.5):
> 
> SELECT 2147483647::bigint*1.0::double precision returns double
> precision 2147483647
> SELECT 2147483647::bigint*1.0 returns numeric 2147483647.0
> SELECT 2147483647::bigint*1.0::real returns double
> SELECT 2147483647::double precision*1::bigint returns double 2147483647
> SELECT 2147483647::double precision*1.0::bigint returns double 2147483647
> 
> With + and - we can get the same mixture of returned types. There's
> no difference in those calculations, just some casting. To me,
> floating-point math indicates inexactness and error, and whoever mixes
> two different types should understand that. If one didn't want an exact
> numeric type, why would the server return one? The floating point value
> itself could be wrong already before the calculation - trying to say we do
> it lossless is just wrong.
> 
> Fun with 2.65:
> 
> SELECT 2.65::real * 1::int returns double 2.6509536743
> SELECT 2.65::double precision * 1::int returns double 2.65
> 
> SELECT round(2.65) returns numeric 4
> SELECT round(2.65::double precision) returns double 4
> 
> SELECT 2.65 * 1 returns double 2.65
> SELECT 2.65 * 1::bigint returns numeric 2.65
> SELECT 2.65 * 1.0 returns numeric 2.650
> SELECT 2.65 * 1.0::double precision returns double 2.65
> 
> SELECT round(2.65) * 1 returns numeric 3
> SELECT round(2.65) * round(1) returns double 3
> 
> So as we're going to have silly values in any case, why pretend something
> else? Also, exact calculations are slow if we crunch large amounts of
> numbers. I guess I slightly deviated towards Postgres' implementation here,
> but I wish it wasn't used as a benchmark in this case. And most
> importantly, I would definitely want the exact same type returned each time
> I do a calculation.
> 
>   - Micke
> 
> On Fri, Oct 12, 2018 at 4:29 PM Benedict Elliott Smith 
> wrote:
> 
> > As far as I can tell we reached a relatively strong consensus that we
> > should implement lossless casts by default?  Does anyone have anything more
> > to add?
> >
> > Looking at the emails, everyone who participated and expressed a
> > preference was in favour of the “Postgres approach” of upcasting to decimal
> > for mixed float/int operands?
> >
> > I’d like to get a clear-cut decision on this, so we know what we’re doing
> > for 4.0.  Then hopefully we can move on to a collective decision on Ariel’s
> > concerns about overflow, which I think are also pressing - particularly for
> > tinyint and smallint.  This does also impact implicit casts for mixed
> > integer type operations, but an approach for 

Re: Implicit Casts for Arithmetic Operators

2018-10-12 Thread Michael Burman
Hi,

I'm not sure if I would prefer the Postgres way of doing things, which is
returning just about any type depending on the order of operators,
especially considering the docs mention that using numeric/decimal is
slow and, multiple times, that floating points are inexact. So doing
some math with Postgres (9.6.5):

SELECT 2147483647::bigint*1.0::double precision returns double
precision 2147483647
SELECT 2147483647::bigint*1.0 returns numeric 2147483647.0
SELECT 2147483647::bigint*1.0::real returns double
SELECT 2147483647::double precision*1::bigint returns double 2147483647
SELECT 2147483647::double precision*1.0::bigint returns double 2147483647

With + and - we can get the same mixture of returned types. There's
no difference in those calculations, just some casting. To me,
floating-point math indicates inexactness and error, and whoever mixes
two different types should understand that. If one didn't want an exact
numeric type, why would the server return one? The floating point value
itself could be wrong already before the calculation - trying to say we do
it lossless is just wrong.
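
(Same story in plain Java, by the way - the value is off before any operator ever touches it:)

public class FloatIsAlreadyInexact {
    public static void main(String[] args) {
        float f = 2.65f;
        // The nearest float to 2.65, viewed at double precision:
        System.out.println((double) f);        // 2.6500000953674316

        // The 676543.21 float-column example from earlier in this thread:
        System.out.println(676543.21f);        // 676543.2
    }
}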

Fun with 2.65:

SELECT 2.65::real * 1::int returns double 2.6509536743
SELECT 2.65::double precision * 1::int returns double 2.65

SELECT round(2.65) returns numeric 4
SELECT round(2.65::double precision) returns double 4

SELECT 2.65 * 1 returns double 2.65
SELECT 2.65 * 1::bigint returns numeric 2.65
SELECT 2.65 * 1.0 returns numeric 2.650
SELECT 2.65 * 1.0::double precision returns double 2.65

SELECT round(2.65) * 1 returns numeric 3
SELECT round(2.65) * round(1) returns double 3

So as we're going to have silly values in any case, why pretend something
else? Also, exact calculations are slow if we crunch large amounts of
numbers. I guess I slightly deviated towards Postgres' implementation here,
but I wish it wasn't used as a benchmark in this case. And most
importantly, I would definitely want the exact same type returned each time
I do a calculation.

  - Micke

On Fri, Oct 12, 2018 at 4:29 PM Benedict Elliott Smith 
wrote:

> As far as I can tell we reached a relatively strong consensus that we
> should implement lossless casts by default?  Does anyone have anything more
> to add?
>
> Looking at the emails, everyone who participated and expressed a
> preference was in favour of the “Postgres approach” of upcasting to decimal
> for mixed float/int operands?
>
> I’d like to get a clear-cut decision on this, so we know what we’re doing
> for 4.0.  Then hopefully we can move on to a collective decision on Ariel’s
> concerns about overflow, which I think are also pressing - particularly for
> tinyint and smallint.  This does also impact implicit casts for mixed
> integer type operations, but an approach for these will probably fall out
> of any decision on overflow.
>
>
>
>
>
>
> > On 3 Oct 2018, at 11:38, Murukesh Mohanan 
> wrote:
> >
> > I think you're conflating two things here. There's the loss resulting
> from
> > using some operators, and loss involved in casting. Dividing an integer
> by
> > another integer to obtain an integer result can result in loss, but
> there's
> > no implicit casting there and no loss due to casting.  Casting an integer
> > to a float can also result in loss. So dividing an integer by a float,
> for
> > example, with an implicit cast has an additional avenue for loss: the
> > implicit cast for the operands so that they're of the same type. I
> believe
> > this discussion so far has been about the latter, not the loss from the
> > operations themselves.
> >
> > On Wed, 3 Oct 2018 at 18:35 Benjamin Lerer 
> > wrote:
> >
> >> Hi,
> >>
> >> I would like to try to clarify things a bit to help people to understand
> >> the true complexity of the problem.
> >>
> >> The *float* and *double* types are inexact numeric types. Not only at the
> >> operation level.
> >>
> >> If you insert 676543.21 in a *float* column and then read it, you will
> >> realize that the value has been truncated to 676543.2.
> >>
> >> If you want accuracy the only way is to avoid those inexact types. Using
> >> *decimals* during operations will mitigate the problem but will not remove it.
> >>
> >>
> >> I do not recall PostgreSQL behaving as described. If I am not mistaken, in
> >> PostgreSQL *SELECT 3/2* will return *1*, which is similar to what MS SQL
> >> Server and Oracle do. So all those databases will lose precision if you
> >> are not careful.
> >>
> >> If you truly need precision you can have it by using exact numeric types
> >> for your data types. Of course it has a cost on performance, memory and
> >> disk usage.
> >>
> >> The advantage of the current approach is that it gives you the choice. It is
> >> up to you to decide what you need for your application. It is also in line
> >> with the way CQL behaves everywhere else.
> >>
> > --
> >
> > Muru
>
>

Re: Tested to upgrade to 4.0

2018-10-12 Thread Ariel Weisberg
Hi,

Thanks for reporting this. I'll get this fixed today.

Ariel

On Fri, Oct 12, 2018, at 7:21 AM, Tommy Stendahl wrote:
> Hi,
> 
> I tested upgrading to Cassandra 4.0. I had an existing cluster with 
> 3.0.15 and upgraded the first node, but it fails to start due to a 
> NullPointerException.
> 
> The problem is the new table option "speculative_write_threshold": when 
> it doesn’t exist we get a NullPointerException.
> 
> I created a jira for this 
> CASSANDRA-14820.
> 
> Regards,
> Tommy




Re: Implicit Casts for Arithmetic Operators

2018-10-12 Thread Benedict Elliott Smith
As far as I can tell we reached a relatively strong consensus that we should 
implement lossless casts by default?  Does anyone have anything more to add?

Looking at the emails, everyone who participated and expressed a preference was 
in favour of the “Postgres approach” of upcasting to decimal for mixed 
float/int operands?
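
(Roughly, for anyone skimming: "lossless" here means upcasting both operands to decimal before the operation, so neither side gets squeezed into the other's precision. A hypothetical sketch with BigDecimal as the carrier, not the committed implementation:)

import java.math.BigDecimal;

public class LosslessUpcastSketch {
    public static void main(String[] args) {
        long i = 2147483647L;     // a bigint operand
        float f = 1.1f;           // a float operand

        // Mixed float math first squeezes the bigint into float precision,
        // then rounds the product again - two separate sources of loss.
        System.out.println(i * f);

        // Upcasting both operands to decimal loses nothing:
        // new BigDecimal(double) captures the operand's binary value exactly,
        // and the multiplication below is performed without any rounding.
        BigDecimal exact = BigDecimal.valueOf(i).multiply(new BigDecimal((double) f));
        System.out.println(exact);
    }
}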

I’d like to get a clear-cut decision on this, so we know what we’re doing for 
4.0.  Then hopefully we can move on to a collective decision on Ariel’s 
concerns about overflow, which I think are also pressing - particularly for 
tinyint and smallint.  This does also impact implicit casts for mixed integer 
type operations, but an approach for these will probably fall out of any 
decision on overflow.






> On 3 Oct 2018, at 11:38, Murukesh Mohanan  wrote:
> 
> I think you're conflating two things here. There's the loss resulting from
> using some operators, and loss involved in casting. Dividing an integer by
> another integer to obtain an integer result can result in loss, but there's
> no implicit casting there and no loss due to casting.  Casting an integer
> to a float can also result in loss. So dividing an integer by a float, for
> example, with an implicit cast has an additional avenue for loss: the
> implicit cast for the operands so that they're of the same type. I believe
> this discussion so far has been about the latter, not the loss from the
> operations themselves.
> 
> On Wed, 3 Oct 2018 at 18:35 Benjamin Lerer 
> wrote:
> 
>> Hi,
>> 
>> I would like to try to clarify things a bit to help people to understand
>> the true complexity of the problem.
>> 
>> The *float* and *double* types are inexact numeric types. Not only at the
>> operation level.
>> 
>> If you insert 676543.21 in a *float* column and then read it, you will
>> realize that the value has been truncated to 676543.2.
>> 
>> If you want accuracy the only way is to avoid those inexact types. Using
>> *decimals* during operations will mitigate the problem but will not remove it.
>> 
>> 
>> I do not recall PostgreSQL behaving as described. If I am not mistaken, in
>> PostgreSQL *SELECT 3/2* will return *1*, which is similar to what MS SQL
>> Server and Oracle do. So all those databases will lose precision if you
>> are not careful.
>> 
>> If you truly need precision you can have it by using exact numeric types
>> for your data types. Of course it has a cost on performance, memory and
>> disk usage.
>> 
>> The advantage of the current approach is that it gives you the choice. It is
>> up to you to decide what you need for your application. It is also in line
>> with the way CQL behaves everywhere else.
>> 
> -- 
> 
> Muru





Tested to upgrade to 4.0

2018-10-12 Thread Tommy Stendahl
Hi,

I tested upgrading to Cassandra 4.0. I had an existing cluster with 3.0.15 and 
upgraded the first node, but it fails to start due to a NullPointerException.

The problem is the new table option "speculative_write_threshold": when it 
doesn’t exist we get a NullPointerException.

I created a jira for this 
CASSANDRA-14820.

Regards,
Tommy


Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-12 Thread Jeff Jirsa




> On Oct 12, 2018, at 6:46 AM, Pavel Yaskevich  wrote:
> 
>> On Thu, Oct 11, 2018 at 4:31 PM Ben Bromhead  wrote:
>> 
>> This is something that's bugged me for ages; tbh the performance gain for
>> most use cases far outweighs the increase in memory usage, and I would even
>> be in favor of changing the default now and optimizing the storage cost later
>> (if it's found to be worth it).
>> 
>> For some anecdotal evidence:
>> 4kb is usually what we end up setting it to; 16kb feels more reasonable given
>> the memory impact, but what would be the point if, practically, most folks
>> set it to 4kb anyway?
>> 
>> Note that chunk_length will largely be dependent on your read sizes, but 4k
>> is the floor for most physical devices in terms of their block size.
>> 
> 
> It might be worthwhile to investigate how splitting the chunk size into data,
> index and compaction sizes would affect performance.
> 

The data chunk and index chunk sizes are already different (though one is table 
level and one is per instance), but I’m not parsing the compaction comment? 
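
(For anyone weighing 4kb vs 16kb above, a back-of-the-envelope sketch of the offset-memory side of the tradeoff - assuming roughly one 8-byte offset per compressed chunk, which is an assumption on my part, not measured from the code:)

public class ChunkOverheadSketch {
    public static void main(String[] args) {
        long dataBytes = 1L << 40;                 // say, 1 TiB of compressed data on a node
        for (int kb : new int[] { 64, 16, 4 }) {
            long chunks = dataBytes / (kb * 1024L);
            long offsetBytes = chunks * 8;         // ~8 bytes of offset per chunk (assumption)
            System.out.printf("%2d kb chunks -> ~%d MiB of offsets%n",
                              kb, offsetBytes >> 20);
        }
    }
}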