[jira] [Comment Edited] (CASSANDRA-19270) Incorrect error type on oversized compound partition key
[ https://issues.apache.org/jira/browse/CASSANDRA-19270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17869336#comment-17869336 ]

Nadav Har'El edited comment on CASSANDRA-19270 at 7/29/24 12:21 PM:

I checked now on cassandra-5.0-rc1, and the first bug that I reported in this issue wasn't fixed: trying to insert a compound partition key in which one of the components is 65 KB results, instead of an InvalidRequest, in a NoHostAvailable with the string "'H' format requires 0 <= number <= 65535". The second problem I reported in the followup comment (with IllegalArgumentException) does seem to be fixed in Cassandra 5.

was (Author: nyh):

I checked now on cassandra-5.0-rc1, and the first bug that I reported in this issue wasn't fixed: trying to insert a compound partition key in which one of the components is 65 KB results, instead of an InvalidRequest, in a NoHostAvailable with the string "'H' format requires 0 <= number <= 65535". The second problem I reported in the followup comment (with IllegalArgumentException) I can no longer reproduce.

> Incorrect error type on oversized compound partition key
>
> Key: CASSANDRA-19270
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19270
> Project: Cassandra
> Issue Type: Bug
> Reporter: Nadav Har'El
> Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Cassandra limits key lengths (partition and clustering) to 64 KB. If a user attempts to INSERT data with a partition key or clustering key exceeding that size, the result is a clear InvalidRequest error with a message like {{Key length of 66560 is longer than maximum of 65535}}.
>
> There is one exception: if you have a *compound* partition key (i.e., two or more partition key components) and attempt to write one of them larger than 64 KB, then instead of the orderly InvalidRequest you got when there was just one component, you get a NoHostAvailable with the message {{error("'H' format requires 0 <= number <= 65535")}}. This is not only uglier, it can also confuse the Cassandra driver into retrying this request, because the driver doesn't realize that the request itself is broken and there is no point in repeating it.
>
> Interestingly, if there are multiple clustering key columns, this problem doesn't happen: we still get a nice InvalidRequest if any one of these is more than 64 KB.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19270) Incorrect error type on oversized compound partition key
[ https://issues.apache.org/jira/browse/CASSANDRA-19270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17869336#comment-17869336 ]

Nadav Har'El commented on CASSANDRA-19270:

I checked now on cassandra-5.0-rc1, and the first bug that I reported in this issue wasn't fixed: trying to insert a compound partition key in which one of the components is 65 KB results, instead of an InvalidRequest, in a NoHostAvailable with the string "'H' format requires 0 <= number <= 65535". The second problem I reported in the followup comment (with IllegalArgumentException) I can no longer reproduce.

> Incorrect error type on oversized compound partition key
>
> Key: CASSANDRA-19270
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19270
> Project: Cassandra
> Issue Type: Bug
> Reporter: Nadav Har'El
> Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Cassandra limits key lengths (partition and clustering) to 64 KB. If a user attempts to INSERT data with a partition key or clustering key exceeding that size, the result is a clear InvalidRequest error with a message like {{Key length of 66560 is longer than maximum of 65535}}.
>
> There is one exception: if you have a *compound* partition key (i.e., two or more partition key components) and attempt to write one of them larger than 64 KB, then instead of the orderly InvalidRequest you got when there was just one component, you get a NoHostAvailable with the message {{error("'H' format requires 0 <= number <= 65535")}}. This is not only uglier, it can also confuse the Cassandra driver into retrying this request, because the driver doesn't realize that the request itself is broken and there is no point in repeating it.
>
> Interestingly, if there are multiple clustering key columns, this problem doesn't happen: we still get a nice InvalidRequest if any one of these is more than 64 KB.
[jira] [Updated] (CASSANDRA-19795) In SAI, intersecting two indexes doesn't require ALLOW FILTERING
[ https://issues.apache.org/jira/browse/CASSANDRA-19795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nadav Har'El updated CASSANDRA-19795:

Description:

As explained many years ago in https://issues.apache.org/jira/browse/CASSANDRA-5470, when a query involves intersecting two secondary indexes, e.g., "WHERE x=1 AND y=2" where "x" and "y" are two indexed columns, ALLOW FILTERING is required. I verified that this is still the case today, in Cassandra 5.0-rc1. But if you use SAI instead of the classic secondary index, suddenly ALLOW FILTERING is not required.

I think this is a regression. Even if SAI has a more efficient way of intersecting the posting lists from two indexes (does it?), in the worst case this doesn't help: for example, consider a table with a million rows, where half have x=1, the other half have y=2, and just one row has both. Now, a query for "WHERE x=1 AND y=2" needs to process half a million rows just to produce one result. This is ALLOW FILTERING par excellence.

was:

As explained many years ago in https://issues.apache.org/jira/browse/CASSANDRA-5470, when a query involves intersecting two secondary indexes, e.g., "WHERE x=1 AND y=2" where "x" and "y" are two indexed columns, ALLOW FILTERING is required. I verified that this is still the case today, in Cassandra 5.0-rc1. If you use SAI instead of the classic secondary index, suddenly ALLOW FILTERING is not required.

I think this is a regression. Even if SAI has a more efficient way of intersecting the posting lists from two indexes (does it?), in the worst case this doesn't help: for example, consider a table with a million rows, where half have x=1, the other half have y=2, and just one row has both. Now, a query for "WHERE x=1 AND y=2" needs to process half a million rows just to produce one result. This is ALLOW FILTERING par excellence.

> In SAI, intersecting two indexes doesn't require ALLOW FILTERING
>
> Key: CASSANDRA-19795
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19795
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/2i Index
> Reporter: Nadav Har'El
> Priority: Normal
>
> As explained many years ago in https://issues.apache.org/jira/browse/CASSANDRA-5470, when a query involves intersecting two secondary indexes, e.g., "WHERE x=1 AND y=2" where "x" and "y" are two indexed columns, ALLOW FILTERING is required.
> I verified that this is still the case today, in Cassandra 5.0-rc1.
> But if you use SAI instead of the classic secondary index, suddenly ALLOW FILTERING is not required.
> I think this is a regression. Even if SAI has a more efficient way of intersecting the posting lists from two indexes (does it?), in the worst case this doesn't help: for example, consider a table with a million rows, where half have x=1, the other half have y=2, and just one row has both. Now, a query for "WHERE x=1 AND y=2" needs to process half a million rows just to produce one result. This is ALLOW FILTERING par excellence.
[jira] [Updated] (CASSANDRA-19795) In SAI, intersecting two indexes doesn't require ALLOW FILTERING
[ https://issues.apache.org/jira/browse/CASSANDRA-19795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nadav Har'El updated CASSANDRA-19795:

Description:

As explained many years ago in https://issues.apache.org/jira/browse/CASSANDRA-5470, when a query involves intersecting two secondary indexes, e.g., "WHERE x=1 AND y=2" where "x" and "y" are two indexed columns, ALLOW FILTERING is required. I verified that this is still the case today, in Cassandra 5.0-rc1. If you use SAI instead of the classic secondary index, suddenly ALLOW FILTERING is not required.

I think this is a regression. Even if SAI has a more efficient way of intersecting the posting lists from two indexes (does it?), in the worst case this doesn't help: for example, consider a table with a million rows, where half have x=1, the other half have y=2, and just one row has both. Now, a query for "WHERE x=1 AND y=2" needs to process half a million rows just to produce one result. This is ALLOW FILTERING par excellence.

was:

As explained many years ago in https://issues.apache.org/jira/browse/CASSANDRA-5470, when a query involves intersecting two secondary indexes, e.g., "WHERE x=1 AND y=2" where "x" and "y" are two indexed columns, ALLOW FILTERING is required. I verified that this is still the case today, in Cassandra 5.0-rc1, but ALLOW FILTERING is suddenly not required for this query if you use SAI instead of the classic secondary index.

I think this is a regression. Even if SAI has a more efficient way of intersecting the posting lists from two indexes (does it?), in the worst case this doesn't help: for example, consider a table with a million rows, where half have x=1, the other half have y=2, and just one row has both. Now, a query for "WHERE x=1 AND y=2" needs to process half a million rows just to produce one result. This is ALLOW FILTERING par excellence.

> In SAI, intersecting two indexes doesn't require ALLOW FILTERING
>
> Key: CASSANDRA-19795
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19795
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/2i Index
> Reporter: Nadav Har'El
> Priority: Normal
>
> As explained many years ago in https://issues.apache.org/jira/browse/CASSANDRA-5470, when a query involves intersecting two secondary indexes, e.g., "WHERE x=1 AND y=2" where "x" and "y" are two indexed columns, ALLOW FILTERING is required.
> I verified that this is still the case today, in Cassandra 5.0-rc1. If you use SAI instead of the classic secondary index, suddenly ALLOW FILTERING is not required.
> I think this is a regression. Even if SAI has a more efficient way of intersecting the posting lists from two indexes (does it?), in the worst case this doesn't help: for example, consider a table with a million rows, where half have x=1, the other half have y=2, and just one row has both. Now, a query for "WHERE x=1 AND y=2" needs to process half a million rows just to produce one result. This is ALLOW FILTERING par excellence.
[jira] [Created] (CASSANDRA-19795) In SAI, intersecting two indexes doesn't require ALLOW FILTERING
Nadav Har'El created CASSANDRA-19795:

Summary: In SAI, intersecting two indexes doesn't require ALLOW FILTERING
Key: CASSANDRA-19795
URL: https://issues.apache.org/jira/browse/CASSANDRA-19795
Project: Cassandra
Issue Type: Bug
Components: Feature/2i Index
Reporter: Nadav Har'El

As explained many years ago in https://issues.apache.org/jira/browse/CASSANDRA-5470, when a query involves intersecting two secondary indexes, e.g., "WHERE x=1 AND y=2" where "x" and "y" are two indexed columns, ALLOW FILTERING is required. I verified that this is still the case today, in Cassandra 5.0-rc1, but ALLOW FILTERING is suddenly not required for this query if you use SAI instead of the classic secondary index.

I think this is a regression. Even if SAI has a more efficient way of intersecting the posting lists from two indexes (does it?), in the worst case this doesn't help: for example, consider a table with a million rows, where half have x=1, the other half have y=2, and just one row has both. Now, a query for "WHERE x=1 AND y=2" needs to process half a million rows just to produce one result. This is ALLOW FILTERING par excellence.
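The worst case described above can be illustrated with a toy posting-list intersection. This is plain Python for illustration only, not Cassandra's actual SAI code; the names and data layout are made up. A merge intersection must examine entries proportional to the posting-list lengths, even when the result is a single row:

```python
# Toy model of intersecting two sorted posting lists of row ids.
def intersect_sorted(a, b):
    """Merge-intersect two sorted lists, counting entries examined."""
    out, i, j, examined = [], 0, 0, 0
    while i < len(a) and j < len(b):
        examined += 1
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out, examined

n = 1_000_000
x_rows = list(range(0, n, 2))                 # ~500k rows with x=1 (even ids)
y_rows = sorted(list(range(1, n, 2)) + [0])   # ~500k rows with y=2; only row 0 has both

matches, examined = intersect_sorted(x_rows, y_rows)
# One matching row, but nearly a million posting entries examined.
```

Whether SAI actually performs a merge like this or something smarter is exactly the open question raised in the report; the sketch only shows why the worst case is as expensive as filtering.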
[jira] [Commented] (CASSANDRA-19270) Incorrect error type on oversized compound partition key
[ https://issues.apache.org/jira/browse/CASSANDRA-19270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17807665#comment-17807665 ]

Nadav Har'El commented on CASSANDRA-19270:

Yes, I ran my test on the latest (or so I thought) GA version, Cassandra 4.1.3. I didn't think of testing on Cassandra 5 before reporting it, sorry. Here are reproducers (in Scylla's Python-based test framework, but I'm sure you can figure out what it does):

{code:python}
@pytest.fixture(scope="module")
def table2(cql, test_keyspace):
    with new_test_table(cql, test_keyspace, "p1 text, p2 text, c1 text, c2 text, PRIMARY KEY ((p1, p2), c1, c2)") as table:
        yield table

def test_insert_65k_pk_compound(cql, table2):
    stmt = cql.prepare(f'INSERT INTO {table2} (p1, p2, c1, c2) VALUES (?,?,?,?)')
    big = 'x'*(65*1024)
    with pytest.raises(InvalidRequest, match='Key length'):
        cql.execute(stmt, [big, 'dog', 'cat', 'mouse'])
    with pytest.raises(InvalidRequest, match='Key length'):
        cql.execute(stmt, ['dog', big, 'cat', 'mouse'])

def test_insert_65535_compound_pk(cql, table2):
    stmt = cql.prepare(f'INSERT INTO {table2} (p1, p2, c1, c2) VALUES (?,?,?,?)')
    length = 65535
    p1 = "hello"              # not particularly long
    c1 = unique_key_string()  # not particularly long
    c2 = unique_key_string()  # not particularly long
    p2 = random_string(length=(length-len(p1)-100))
    cql.execute(stmt, [p1, p2, c1, c2])
    stmt = cql.prepare(f'SELECT * FROM {table2} WHERE p1=? AND p2=?')
    assert list(cql.execute(stmt, [p1, p2])) == [(p1, p2, c1, c2)]
{code}

The first test, instead of the clean InvalidRequest it expects, gets a NoHostAvailable with the strange message "'H' format requires 0 <= number <= 65535". The second test fails in an even stranger way, with java.lang.IllegalArgumentException. The compound partition key is (roughly) 100 bytes shorter than the maximum, and still it doesn't work.

> Incorrect error type on oversized compound partition key
>
> Key: CASSANDRA-19270
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19270
> Project: Cassandra
> Issue Type: Bug
> Reporter: Nadav Har'El
> Priority: Normal
>
> Cassandra limits key lengths (partition and clustering) to 64 KB. If a user attempts to INSERT data with a partition key or clustering key exceeding that size, the result is a clear InvalidRequest error with a message like {{Key length of 66560 is longer than maximum of 65535}}.
>
> There is one exception: if you have a *compound* partition key (i.e., two or more partition key components) and attempt to write one of them larger than 64 KB, then instead of the orderly InvalidRequest you got when there was just one component, you get a NoHostAvailable with the message {{error("'H' format requires 0 <= number <= 65535")}}. This is not only uglier, it can also confuse the Cassandra driver into retrying this request, because the driver doesn't realize that the request itself is broken and there is no point in repeating it.
>
> Interestingly, if there are multiple clustering key columns, this problem doesn't happen: we still get a nice InvalidRequest if any one of these is more than 64 KB.
[jira] [Commented] (CASSANDRA-19270) Incorrect error type on oversized compound partition key
[ https://issues.apache.org/jira/browse/CASSANDRA-19270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17806656#comment-17806656 ]

Nadav Har'El commented on CASSANDRA-19270:

Experimenting some more, I encountered even more bizarre errors when trying to use *compound* partition keys which aren't even oversized. As an example I used a compound partition key (p1,p2), both strings, trying to insert a 2-character string for p1 and a 65433-character string for p2, so the total is 65435 bytes, 100 bytes less than the 65535 maximum, and this should work. But it didn't, and instead produced the strange error:

{code}
E cassandra.cluster.NoHostAvailable: ('Unable to complete the operation against any hosts', {: })
{code}

In this case, there was also something in the log:

{code}
09:09:37.671 [Native-Transport-Requests-1] ERROR org.apache.cassandra.transport.messages.ErrorMessage - Unexpected exception during request
java.lang.IllegalArgumentException: newLimit < 0: (-96 < 0)
	at java.base/java.nio.Buffer.createLimitException(Buffer.java:372)
	at java.base/java.nio.Buffer.limit(Buffer.java:346)
	at java.base/java.nio.ByteBuffer.limit(ByteBuffer.java:1107)
	at org.apache.cassandra.db.marshal.ByteBufferAccessor.slice(ByteBufferAccessor.java:109)
	at org.apache.cassandra.db.marshal.ByteBufferAccessor.slice(ByteBufferAccessor.java:41)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.validate(AbstractCompositeType.java:297)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.validate(AbstractCompositeType.java:275)
	at org.apache.cassandra.cql3.Validation.validateKey(Validation.java:60)
	at org.apache.cassandra.cql3.statements.ModificationStatement.addUpdates(ModificationStatement.java:785)
	at org.apache.cassandra.cql3.statements.ModificationStatement.getMutations(ModificationStatement.java:732)
	at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:509)
	at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:491)
	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:258)
	at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:826)
	at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:804)
	at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:167)
	at org.apache.cassandra.transport.Message$Request.execute(Message.java:255)
{code}

> Incorrect error type on oversized compound partition key
>
> Key: CASSANDRA-19270
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19270
> Project: Cassandra
> Issue Type: Bug
> Reporter: Nadav Har'El
> Priority: Normal
>
> Cassandra limits key lengths (partition and clustering) to 64 KB. If a user attempts to INSERT data with a partition key or clustering key exceeding that size, the result is a clear InvalidRequest error with a message like {{Key length of 66560 is longer than maximum of 65535}}.
>
> There is one exception: if you have a *compound* partition key (i.e., two or more partition key components) and attempt to write one of them larger than 64 KB, then instead of the orderly InvalidRequest you got when there was just one component, you get a NoHostAvailable with the message {{error("'H' format requires 0 <= number <= 65535")}}. This is not only uglier, it can also confuse the Cassandra driver into retrying this request, because the driver doesn't realize that the request itself is broken and there is no point in repeating it.
>
> Interestingly, if there are multiple clustering key columns, this problem doesn't happen: we still get a nice InvalidRequest if any one of these is more than 64 KB.
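The "newLimit < 0" in the trace above is consistent with a 16-bit component length being written as an unsigned value but read back as a signed Java short. This is only my reading of the stack trace, sketched here in Python; the struct formats '>H' and '>h' stand in for the unsigned write and for Java's signed ByteBuffer.getShort():

```python
import struct

# A composite partition key prefixes each component with a 16-bit length.
# If the writer packs it unsigned ('>H') but the reader interprets the same
# two bytes as a signed short ('>h'), any component longer than 32767 bytes
# comes back negative, which would surface as a "newLimit < 0" slice error.

component_len = 65430                       # fits in unsigned 16 bits
prefix = struct.pack('>H', component_len)   # what gets written: 0xFF 0x96
read_back = struct.unpack('>h', prefix)[0]  # signed read, like Java's getShort()
# read_back == 65430 - 65536 == -106: a negative "length"
```

Under this assumption, the validation code computes a buffer limit from a negative length, matching the IllegalArgumentException in the log.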
[jira] [Created] (CASSANDRA-19270) Incorrect error type on oversized compound partition key
Nadav Har'El created CASSANDRA-19270:

Summary: Incorrect error type on oversized compound partition key
Key: CASSANDRA-19270
URL: https://issues.apache.org/jira/browse/CASSANDRA-19270
Project: Cassandra
Issue Type: Bug
Reporter: Nadav Har'El

Cassandra limits key lengths (partition and clustering) to 64 KB. If a user attempts to INSERT data with a partition key or clustering key exceeding that size, the result is a clear InvalidRequest error with a message like {{Key length of 66560 is longer than maximum of 65535}}.

There is one exception: if you have a *compound* partition key (i.e., two or more partition key components) and attempt to write one of them larger than 64 KB, then instead of the orderly InvalidRequest you got when there was just one component, you get a NoHostAvailable with the message {{error("'H' format requires 0 <= number <= 65535")}}. This is not only uglier, it can also confuse the Cassandra driver into retrying this request, because the driver doesn't realize that the request itself is broken and there is no point in repeating it.

Interestingly, if there are multiple clustering key columns, this problem doesn't happen: we still get a nice InvalidRequest if any one of these is more than 64 KB.
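The quoted "'H' format requires 0 <= number <= 65535" text is Python's struct-module error for an out-of-range unsigned 16-bit value. A plausible reading (my assumption, not confirmed in the report) is that the Python driver serializes each component of a compound partition key with a 2-byte length prefix, so an oversized component fails client-side with this low-level message instead of the server's InvalidRequest:

```python
import struct

# 66560 bytes: the oversized key length quoted in the report,
# just over the 65535 maximum of an unsigned 16-bit length prefix.
component = b'x' * (65 * 1024)
try:
    struct.pack('>H', len(component))  # 2-byte length prefix, as a driver might write it
    msg = None
except struct.error as e:
    msg = str(e)  # the low-level message the user ends up seeing
```

This would also explain why the failure reaches the user as a driver-side NoHostAvailable rather than a server error.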
[jira] [Created] (CASSANDRA-19019) DESC TYPE forgets to quote UDT's field names
Nadav Har'El created CASSANDRA-19019:

Summary: DESC TYPE forgets to quote UDT's field names
Key: CASSANDRA-19019
URL: https://issues.apache.org/jira/browse/CASSANDRA-19019
Project: Cassandra
Issue Type: Bug
Reporter: Nadav Har'El

If I create a type with

*CREATE TYPE "Quoted_KS"."udt_@@@" (a int, "field_!!!" text)*

and then run DESC TYPE "Quoted_KS"."udt_@@@", I get:

*CREATE TYPE "Quoted_KS"."udt_@@@" (a int, field_!!! text)*

Note the missing quotes around the non-alphanumeric field name, which does need quoting. If I try to run this command, it won't work. Tested on Cassandra 4.1.
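The quoting rule DESC TYPE should apply can be sketched in a few lines (this is an illustrative sketch, not Cassandra's code; the function name is made up): an identifier that is not all-lowercase alphanumeric/underscore must be double-quoted, with embedded double quotes doubled.

```python
import re

def maybe_quote(ident: str) -> str:
    # A plain lowercase identifier is safe unquoted; anything else
    # (uppercase letters, punctuation like '!') needs double quotes.
    if re.fullmatch(r'[a-z][a-z0-9_]*', ident):
        return ident
    return '"' + ident.replace('"', '""') + '"'
```

Applied to the report's UDT, maybe_quote('a') stays 'a', while maybe_quote('field_!!!') and maybe_quote('Quoted_KS') come back quoted, so the DESC output would round-trip. (This ignores the further wrinkle that lowercase reserved words also need quoting.)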
[jira] [Created] (CASSANDRA-19006) DROPing a non-existent function with parameter types results in bizarre error
Nadav Har'El created CASSANDRA-19006:

Summary: DROPing a non-existent function with parameter types results in bizarre error
Key: CASSANDRA-19006
URL: https://issues.apache.org/jira/browse/CASSANDRA-19006
Project: Cassandra
Issue Type: Bug
Reporter: Nadav Har'El

When attempting a command like {{DROP FUNCTION ks.fun(int)}}, where the keyspace "ks" exists but "fun" doesn't (note also the attempt to choose which of several non-existent overloads to remove), one gets a bizarre error from Cassandra: instead of InvalidRequest (or maybe ConfigurationException), we get a SyntaxError, with the strange message "NoSuchElementException No value present". Neither the SyntaxError type nor this specific message makes much sense: this is not a syntax error, and the same request would have worked if this specific function existed.
[jira] [Created] (CASSANDRA-19005) DROPing an overloaded UDF produces the wrong error message if drop permissions are lacking
Nadav Har'El created CASSANDRA-19005:

Summary: DROPing an overloaded UDF produces the wrong error message if drop permissions are lacking
Key: CASSANDRA-19005
URL: https://issues.apache.org/jira/browse/CASSANDRA-19005
Project: Cassandra
Issue Type: Bug
Reporter: Nadav Har'El

When a user creates two user-defined functions with the same name but different parameters, then to later remove one of them with DROP FUNCTION, the user must disambiguate which one to delete. For example, "DROP FUNCTION ks.fun(int, int)". If the user tries just "DROP FUNCTION ks.fun", Cassandra will return an InvalidRequest, complaining about "multiple functions" with the same name. So far so good.

Now, if the user has (via GRANT) permission to drop only one of these functions and no permission to drop the second, trying "DROP FUNCTION ks.fun" should still return the good old InvalidRequest, because the request is still just as ambiguous as it was when permissions weren't involved. But Cassandra instead notices that one of the variants, e.g., ks.fun(int, int), lacks drop permissions, and returns an Unauthorized error (instead of InvalidRequest), saying that "ks.fun(int, int)" doesn't have drop permissions. This is true, but irrelevant: the user didn't ask to drop that specific overload of the function. Moreover, it's misleading, because it can lead the user to GRANT these supposedly-missing permissions, but after granting them the DROP FUNCTION command still won't work, because it will still be ambiguous.

This is a minor error-path bug, but I noticed it while trying to exhaustively look at how permissions and functions interact in Cassandra.
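The ordering the report argues for can be sketched as follows. This is hypothetical illustration code, not Cassandra's implementation: ambiguity should be resolved before permissions are consulted, so an ambiguous DROP FUNCTION always yields InvalidRequest, never Unauthorized.

```python
class InvalidRequest(Exception): pass
class Unauthorized(Exception): pass

def drop_function(name, overloads, droppable):
    # Resolve the overload FIRST: only a uniquely resolved function
    # should ever reach the permission check.
    matches = [sig for sig in overloads if sig.startswith(name + '(')]
    if not matches:
        raise InvalidRequest(f'unknown function {name}')
    if len(matches) > 1:
        raise InvalidRequest(f'multiple functions named {name}')
    sig = matches[0]
    # Only now is authorization relevant.
    if sig not in droppable:
        raise Unauthorized(f'no DROP permission on {sig}')
    return sig
```

With this ordering, drop_function('ks.fun', ['ks.fun(int, int)', 'ks.fun(text)'], {'ks.fun(text)'}) raises InvalidRequest regardless of which overloads the user may drop, which is the behavior the report expects.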
[jira] [Comment Edited] (CASSANDRA-18647) CASTing a float to decimal adds wrong digits
[ https://issues.apache.org/jira/browse/CASSANDRA-18647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17740047#comment-17740047 ]

Nadav Har'El edited comment on CASSANDRA-18647 at 7/5/23 7:35 AM:

By the way, there is a unit test - testNumericCastsInSelectionClause in test/unit/org/apache/cassandra/cql3/functions/CastFctsTest.java - that should have caught this bug. The problem is that it compares the result of the cast not to any specific value but to BigDecimal.valueOf(5.2F), and this BigDecimal.valueOf(float) is exactly the same function that the Cassandra implementation uses for this purpose, so the implementation and the test have the same bug and the test doesn't verify anything.

I found the cause of this bug. It turns out that BigDecimal.valueOf does *not* have a float overload, only a double one. The Java documentation says:

{quote}
valueOf(double val)
Translates a double into a BigDecimal, using the double's canonical string representation provided by the Double.toString(double) method.
{quote}

So the solution of how to turn a float into a Decimal is easy - just use *Float.toString(float)* and then construct a BigDecimal from that string - do *not* use BigDecimal.valueOf(double) on a float. So it seems the fix would be a two-line patch to getDecimalConversionFunction() in src/java/org/apache/cassandra/cql3/functions/CastFcts.java to do that. And also fix the test, of course.

was (Author: nyh):

By the way, there is a unit test - testNumericCastsInSelectionClause in test/unit/org/apache/cassandra/cql3/functions/CastFctsTest.java - that should have caught this bug. The problem is that it compares the result of the cast not to any specific value but to BigDecimal.valueOf(5.2F), and this BigDecimal.valueOf(float) is apparently the same function that the Cassandra implementation uses for this purpose, so if the implementation has a bug the test doesn't verify anything.

I know the cause of this bug. It turns out that BigDecimal.valueOf does *not* have a float overload, only a double one. The Java documentation says:

{quote}
valueOf(double val)
Translates a double into a BigDecimal, using the double's canonical string representation provided by the Double.toString(double) method.
{quote}

So the solution of how to turn a float into a Decimal is easy - just use *Float.toString(float)* and then construct a BigDecimal from that string - do *not* use BigDecimal.valueOf(double) on a float. So it seems the fix would be a two-line patch to getDecimalConversionFunction() in src/java/org/apache/cassandra/cql3/functions/CastFcts.java to do that. And also fix the test, of course.

> CASTing a float to decimal adds wrong digits
>
> Key: CASSANDRA-18647
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18647
> Project: Cassandra
> Issue Type: Bug
> Reporter: Nadav Har'El
> Priority: Normal
>
> If I create a table with a *float* (32-bit) column, and cast it to the *decimal* type, the casting wrongly passes through the double (64-bit) type and picks up extra, wrong, digits. For example, if we have a column e of type "float", and run
>
> INSERT INTO tbl (p, e) VALUES (1, 5.2)
> SELECT CAST(e AS decimal) FROM tbl WHERE p=1
>
> the result is the "decimal" value 5.199999809265137, with all those extra wrong digits. It would have been better to get back the decimal value 5.2, with only two significant digits.
> It appears that this happens because Cassandra's implementation first converts the 32-bit float into a 64-bit double, and only then converts that - with all the silly extra digits it picked up in the first conversion - into a "decimal" value.
> Contrast this with CAST(e AS text), which works correctly - it returns the string "5.2" - only the actual digits of the 32-bit floating point value are converted to the string, without inventing additional digits in the process.
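The two conversion strategies above can be demonstrated in Python, where every float is an IEEE-754 double. The helper names are mine; to_float32 rounds a value to binary32 the way a Java float would store it, and float32_to_decimal mimics the Float.toString(float) approach of taking the shortest decimal string that round-trips to the same binary32 value:

```python
import struct
from decimal import Decimal

def to_float32(x: float) -> float:
    # Round a Python float (a double) to the nearest IEEE-754 binary32 value.
    return struct.unpack('>f', struct.pack('>f', x))[0]

# What the bug does: widen the 32-bit value to double, then take
# ALL of the double's digits into the Decimal.
naive = Decimal(to_float32(5.2))  # picks up 5.1999998092651... noise

def float32_to_decimal(x: float) -> Decimal:
    # What Float.toString(float) effectively does: find the shortest
    # decimal string that round-trips back to the same binary32 value.
    f = to_float32(x)
    for precision in range(1, 18):
        s = f'{f:.{precision}g}'
        if to_float32(float(s)) == f:
            return Decimal(s)
    return Decimal(repr(f))

fixed = float32_to_decimal(5.2)  # Decimal('5.2')
```

The same shortest-round-trip idea is why CAST(e AS text) already prints "5.2": only the digits needed to reconstruct the 32-bit value are emitted.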
[jira] [Comment Edited] (CASSANDRA-18647) CASTing a float to decimal adds wrong digits
[ https://issues.apache.org/jira/browse/CASSANDRA-18647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17740047#comment-17740047 ] Nadav Har'El edited comment on CASSANDRA-18647 at 7/5/23 7:22 AM: -- By the way, there is a unit test - testNumericCastsInSelectionClause in test/unit/org/apache/cassandra/cql3/functions/CastFctsTest.java - that should have caught this bug. The problem is that it compares the result of the cast not to any specific value but to BigDecimal.valueOf(5.2F), and this BigDecimal.valueOf(float) is apparently the same function that the Cassandra implementation uses for this purpose, so if the implementation has a bug the test doesn't verify anything. I think I know the cause of this bug. It turns out that BigDecimal does *not* have a float overload, only a double. The Java documentation says that: valueOf(double val) Translates a double into a BigDecimal, using the double's canonical string representation provided by the Double.toString(double) method. So the solution of how to turn a float into a Decimal is easy - just use *Float.toString(float)* and then construct a BigDecimal using that string - do *not* use BigDecimal.valueOf(double) on a float. So it seems the fix would be a two-line patch to getDecimalConversionFunction() in src/java/org/apache/cassandra/cql3/functions/CastFcts.java to do that. And also fix the test, of course. was (Author: nyh): By the way, there is a unit test - testNumericCastsInSelectionClause in test/unit/org/apache/cassandra/cql3/functions/CastFctsTest.java - that should have caught this bug. The problem is that it compares the result of the cast not to any specific value but to BigDecimal.valueOf(5.2F), and this BigDecimal.valueOf(float) is apparently the same function that the Cassandra implementation uses for this purpose, so if the implementation has a bug the test doesn't verify anything. I think I know the cause of this bug. 
It turns out that BigDecimal does *not* have a float overload, only a double. The Java documentation says that: valueOf(double val) Translates a double into a BigDecimal, using the double's canonical string representation provided by the Double.toString(double) method. So the solution of how to turn a float into a Decimal is easy - just use *Float.toString(float)* and then construct a BigDecimal using that string - do *not* use BigDecimal.valueOf(double) on a float. So it seems the fix would be a two-line patch to getDecimalConversionFunction() in src/java/org/apache/cassandra/cql3/functions/CastFcts.java to do that. And also fix the test, of course.

> CASTing a float to decimal adds wrong digits
>
>                 Key: CASSANDRA-18647
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18647
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nadav Har'El
>            Priority: Normal
>
> If I create a table with a *float* (32-bit) column, and cast it to the *decimal* type, the casting wrongly passes through the double (64-bit) type and picks up extra, wrong, digits. For example, if we have a column e of type "float", and run
>
> INSERT INTO tbl (p, e) VALUES (1, 5.2)
> SELECT CAST(e AS decimal) FROM tbl WHERE p=1
>
> The result is the "decimal" value 5.199999809265137, with all those extra wrong digits. It would have been better to get back the decimal value 5.2, with only two significant digits.
> It appears that this happens because Cassandra's implementation first converts the 32-bit float into a 64-bit double, and only then converts that - with all the silly extra digits it picked up in the first conversion - into a "decimal" value.
> Contrast this with CAST(e AS text) which works correctly - it returns the string "5.2" - only the actual digits of the 32-bit floating point value are converted to the string, without inventing additional digits in the process.
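The difference between the two conversion paths can be demonstrated in plain Java, independent of Cassandra. This is a minimal sketch of the behavior described above, not the actual code in CastFcts.java:

```java
import java.math.BigDecimal;

public class FloatToDecimal {
    public static void main(String[] args) {
        float e = 5.2f;

        // Buggy path: the float argument is silently widened to double, and
        // BigDecimal.valueOf(double) then keeps all the extra digits that the
        // float-to-double widening picked up.
        BigDecimal viaDouble = BigDecimal.valueOf(e);

        // Proposed fix: go through the float's own shortest string form.
        BigDecimal viaString = new BigDecimal(Float.toString(e));

        System.out.println(viaDouble); // 5.199999809265137
        System.out.println(viaString); // 5.2
    }
}
```

Note that comparing the two results (rather than comparing against BigDecimal.valueOf(5.2F) itself) is exactly the kind of check the unit test would need in order to catch this bug.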
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-18647) CASTing a float to decimal adds wrong digits
Nadav Har'El created CASSANDRA-18647:

             Summary: CASTing a float to decimal adds wrong digits
                 Key: CASSANDRA-18647
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18647
             Project: Cassandra
          Issue Type: Bug
            Reporter: Nadav Har'El

If I create a table with a *float* (32-bit) column, and cast it to the *decimal* type, the casting wrongly passes through the double (64-bit) type and picks up extra, wrong, digits. For example, if we have a column e of type "float", and run

INSERT INTO tbl (p, e) VALUES (1, 5.2)
SELECT CAST(e AS decimal) FROM tbl WHERE p=1

The result is the "decimal" value 5.199999809265137, with all those extra wrong digits. It would have been better to get back the decimal value 5.2, with only two significant digits.

It appears that this happens because Cassandra's implementation first converts the 32-bit float into a 64-bit double, and only then converts that - with all the silly extra digits it picked up in the first conversion - into a "decimal" value.

Contrast this with CAST(e AS text) which works correctly - it returns the string "5.2" - only the actual digits of the 32-bit floating point value are converted to the string, without inventing additional digits in the process.
[jira] [Commented] (CASSANDRA-18470) Average of "decimal" values rounds the average if all inputs are integers
[ https://issues.apache.org/jira/browse/CASSANDRA-18470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17714628#comment-17714628 ] Nadav Har'El commented on CASSANDRA-18470:
--

Benedict, I confirmed your worry in the last paragraph: the fact that the implementation keeps only "avg", and not the sum and count separately, indeed makes this bug even worse. Today, the AVG of _decimal_ values *1, 2, 2, 3* comes out as 1, while the correct result is 2. So the current algorithm can be wrong even if we know a priori that the result is an integer.

> Average of "decimal" values rounds the average if all inputs are integers
>
>                 Key: CASSANDRA-18470
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18470
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nadav Har'El
>            Priority: Normal
>
> When running the AVG aggregator on "decimal" values, each value is an arbitrary-precision number which may be an integer or fractional, but it is expected that the average would be, in general, fractional. But it turns out that if all the values are integer *without* a ".0", the aggregator sums them up as integers and the final division returns an integer too instead of the fractional response expected from a "decimal" value.
> For example:
> # AVG of {{decimal}} values 1.0 and 2.0 returns 1.5, as expected.
> # AVG of 1.0 and 2 or 1 and 2.0 also returns 1.5.
> # But AVG of 1 and 2 returns... 1. This is wrong. The user asked for the average to be a "decimal", not a "varint", so there is no reason why it should be rounded to be an integer.
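One way the 1, 2, 2, 3 result can arise is a running average that applies each incremental correction at the inputs' scale (scale 0 for integer-looking decimals), so every fractional correction rounds away. The following plain-Java sketch reproduces the symptom under that assumption - it is a hypothesis about the mechanism, not Cassandra's actual aggregator code:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class RunningAvg {
    public static void main(String[] args) {
        BigDecimal[] values = {
            new BigDecimal("1"), new BigDecimal("2"),
            new BigDecimal("2"), new BigDecimal("3")
        };

        // Hypothetical running average: avg += (x - avg) / count.
        // divide(divisor, roundingMode) keeps the dividend's scale (0 here),
        // so every fractional correction rounds to zero and the average
        // never moves past the first value.
        BigDecimal avg = BigDecimal.ZERO;
        int count = 0;
        for (BigDecimal x : values) {
            count++;
            avg = avg.add(x.subtract(avg)
                           .divide(BigDecimal.valueOf(count), RoundingMode.HALF_EVEN));
        }
        System.out.println(avg); // 1, although (1+2+2+3)/4 = 2

        // Keeping sum and count separately avoids the incremental drift:
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal x : values) sum = sum.add(x);
        System.out.println(sum.divide(BigDecimal.valueOf(values.length))); // 2
    }
}
```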
[jira] [Commented] (CASSANDRA-18470) Average of "decimal" values rounds the average if all inputs are integers
[ https://issues.apache.org/jira/browse/CASSANDRA-18470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17714578#comment-17714578 ] Nadav Har'El commented on CASSANDRA-18470:
--

I did a few more experiments and have a better understanding of the bug. The problem is not just integers vs ".0", but the precision of the inputs: if I have the values 1.1 and 1.2 and calculate the AVG, it comes out as 1.1 instead of 1.15. It appears that the situation we have right now is basically that the result of the division will have exactly as many digits after the decimal point as its inputs have. It's not clear that this is what users would expect.

Solving this problem is not trivial - it's not clear which precision we should use for the division. For example, consider averaging 0.0, 0.0 and 1.0. It should result in 0.333... But how many threes? I don't know. Right now averaging 0, 0 and 1 will result in 0, averaging 0.0, 0.0 and 1.0 will result in 0.3, averaging 0.00, 0.00 and 1.00 will result in 0.33, and so on.

> Average of "decimal" values rounds the average if all inputs are integers
>
>                 Key: CASSANDRA-18470
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18470
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nadav Har'El
>            Priority: Normal
>
> When running the AVG aggregator on "decimal" values, each value is an arbitrary-precision number which may be an integer or fractional, but it is expected that the average would be, in general, fractional. But it turns out that if all the values are integer *without* a ".0", the aggregator sums them up as integers and the final division returns an integer too instead of the fractional response expected from a "decimal" value.
> For example:
> # AVG of {{decimal}} values 1.0 and 2.0 returns 1.5, as expected.
> # AVG of 1.0 and 2 or 1 and 2.0 also returns 1.5.
> # But AVG of 1 and 2 returns... 1. This is wrong. The user asked for the average to be a "decimal", not a "varint", so there is no reason why it should be rounded to be an integer.
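The 0 / 0.3 / 0.33 progression is exactly what plain BigDecimal arithmetic produces when the quotient keeps the scale of the summed inputs. This sketch checks that in isolation (it assumes, but does not prove, that the implementation divides this way):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class AvgScale {
    // Average three decimal strings; divide(divisor, roundingMode)
    // keeps the dividend's (i.e. the sum's) scale for the quotient.
    static BigDecimal avg(String a, String b, String c) {
        BigDecimal sum = new BigDecimal(a).add(new BigDecimal(b)).add(new BigDecimal(c));
        return sum.divide(BigDecimal.valueOf(3), RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        System.out.println(avg("0", "0", "1"));          // 0
        System.out.println(avg("0.0", "0.0", "1.0"));    // 0.3
        System.out.println(avg("0.00", "0.00", "1.00")); // 0.33
    }
}
```

In other words, the inputs' scale silently becomes the answer's precision, which is why "how many threes?" has no obvious right answer.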
[jira] [Created] (CASSANDRA-18470) Average of "decimal" values rounds the average if all inputs are integers
Nadav Har'El created CASSANDRA-18470:

             Summary: Average of "decimal" values rounds the average if all inputs are integers
                 Key: CASSANDRA-18470
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18470
             Project: Cassandra
          Issue Type: Bug
            Reporter: Nadav Har'El

When running the AVG aggregator on "decimal" values, each value is an arbitrary-precision number which may be an integer or fractional, but it is expected that the average would be, in general, fractional. But it turns out that if all the values are integer *without* a ".0", the aggregator sums them up as integers and the final division returns an integer too instead of the fractional response expected from a "decimal" value.

For example:
# AVG of {{decimal}} values 1.0 and 2.0 returns 1.5, as expected.
# AVG of 1.0 and 2 or 1 and 2.0 also return 1.5.
# But AVG of 1 and 2 returns... 1. This is wrong. The user asked for the average to be a "decimal", not a "varint", so there is no reason why it should be rounded to be an integer.
[jira] [Created] (CASSANDRA-16635) dml.rst should not list a "!=" operator
Nadav Har'El created CASSANDRA-16635:
-------------------------------------

             Summary: dml.rst should not list a "!=" operator
                 Key: CASSANDRA-16635
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16635
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Nadav Har'El

In {{doc/source/cql/dml.rst}} (which ends up in [https://cassandra.apache.org/doc/latest/cql/dml.html]), one of the operators listed is "!=". However, this operator has never been supported in WHERE clauses, and I don't see any plans to make it supported.

The confusion compounds when you notice that the text does refer in a few places to "non-equal" or "inequality" operators - but those refer to operators like "<=", which are allowed in certain places and not in others - not to "!=", which isn't allowed anywhere. So "!=" should not be listed at all.

The Datastax version of this document, [https://docs.datastax.com/en/cql-oss/3.3/cql/cql_reference/cqlSelect.html], also doesn't list "!=".

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys
[ https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922466#comment-16922466 ]

Nadav Har'El commented on CASSANDRA-9928:
-----------------------------------------

This issue has recently turned 4 years old, and I'm curious how sure we are about the *reasons* described above for why we forbid an MV with two new key columns - whether these reasons are correct, and whether we are sure these are the only reasons.

As [~fsander] asked above, while a base-view inconsistency is indeed more likely in the two-new-key-columns case, don't we have the same problem in the regular one-new-key-column case - scenarios where an unfortunate order of node failures causes data to appear in a view replica which doesn't appear in the base replica, and thus will never be deleted? I thought this was one of the main reasons why MV was recently downgraded to "experimental" status.

But I also wonder if we didn't miss a second problem, that of row liveness, similar to what we have in the case of unselected columns (see [CASSANDRA-13826|https://jira.apache.org/jira/browse/CASSANDRA-13826]): if we add and remove different base columns which are view keys, but the view row has just a *single* timestamp, we can end up being unable to add a view row that we previously deleted.

For example, here is a scenario I thought might be problematic (I didn't actually test this; one would need to disable the check in the code forbidding multiple new MV key columns to run a test case). Assume that x and y are regular columns in the base table, but key columns in the view. For brevity, we leave out the other base key columns and other regular columns. Consider the following sequence of events on one row of the base table:
# Add x=1 at timestamp 1. Since y is still null, no view row is created yet.
# Add y=1 at timestamp 10. This creates a view row with key x=1, y=1. The row only contains a CQL row marker, and a single timestamp is chosen for it: 10.
# Delete x at timestamp 2. This deletes x's older (ts=1) value, and so the view row should be deleted. Again, a timestamp needs to be chosen for this deletion - it will be 10 again, and the deletion will override the creation with the same timestamp from the previous step; so far everything is fine.
# Add x=2 at timestamp 3. This overrides the deletion of x (which was at timestamp 2), so again both x and y have values, and a view row should be created with key x=2, y=1. However, this creation will again have timestamp 10 (y's timestamp) and will not be able to shadow the deletion from step 3 (in step 3, deletion won over data, so here it will win again). So the view row we wanted to add will not be added!

> Add Support for multiple non-primary key columns in Materialized View primary keys
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9928
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9928
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/Materialized Views
>            Reporter: T Jake Luciani
>            Priority: Normal
>              Labels: materializedviews
>             Fix For: 4.x
>
> Currently we don't allow > 1 non primary key from the base table in a MV primary key. We should remove this restriction assuming we continue filtering out nulls. With allowing nulls in the MV columns there are a lot of multiplicative implications we need to think through.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
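The tie-breaking behavior that the four-step scenario relies on can be illustrated with a toy reconciliation function. This is a sketch, not Cassandra's code; the only rule assumed is the one stated in the scenario: higher timestamp wins, and on a timestamp tie a deletion shadows data:

```python
# Toy model of cell reconciliation: a write is (timestamp, is_tombstone).
# Assumed rule (as in the scenario above): higher timestamp wins, and on a
# timestamp tie the tombstone shadows the data.
def reconcile(a, b):
    ts_a, dead_a = a
    ts_b, dead_b = b
    if ts_a != ts_b:
        return a if ts_a > ts_b else b
    return a if dead_a else b  # tie: deletion wins

# Step 2: the view row is created at ts=10; step 3: deleted, also at ts=10.
state = reconcile((10, False), (10, True))  # tombstone wins the tie
# Step 4: the re-creation is again forced to use ts=10, so it cannot shadow
# the tombstone - the view row stays deleted, exactly as described above.
state = reconcile(state, (10, False))
```

Because every view write for this row is pinned to y's timestamp (10), the deletion and the later re-creation collide on the same timestamp, and the tie always resolves in the tombstone's favor.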
[jira] [Created] (CASSANDRA-14478) Improve the documentation of UPDATE vs INSERT
Nadav Har'El created CASSANDRA-14478:
-------------------------------------

             Summary: Improve the documentation of UPDATE vs INSERT
                 Key: CASSANDRA-14478
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14478
             Project: Cassandra
          Issue Type: Improvement
          Components: Documentation and Website
            Reporter: Nadav Har'El

New Cassandra users often wonder about the difference between the INSERT and UPDATE CQL commands when applied to ordinary data (not counters or transactions). Usually, they are told that there is really no difference between the two - both of them can insert a new row or update an existing one. The Cassandra CQL documentation, [http://cassandra.apache.org/doc/latest/cql/dml.html#update], is fairly silent on the question - on the one hand it doesn't explicitly say they are the same, but on the other hand it describes them both as doing the same things, and doesn't explicitly mention any difference.

But there is an important difference, which was raised in the past in CASSANDRA-11805: INSERT adds a row marker, while UPDATE does not. What does this mean? Basically, an UPDATE requests that individual cells of the row be added, but not that the row itself be added; so if one later deletes those same individual cells with DELETE, the entire row goes away. However, an INSERT not only adds the cells, it also requests that the row be added (this is implemented via a "row marker"). So if all the row's individual cells are later deleted, an empty row remains behind (i.e., the primary key of the row, which now has no content, is still remembered in the table).

I'm not sure what is the best way to explain this, but what I wrote in the paragraph above is a start.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
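The row-marker distinction described in the issue can be sketched with a toy model. This is illustrative only - not Cassandra's storage engine; tombstones, timestamps and clustering are deliberately left out:

```python
# Toy model of a row: a dict of cells plus an optional row marker.
class Row:
    def __init__(self):
        self.cells = {}
        self.marker = False

    def insert(self, **cells):
        self.cells.update(cells)
        self.marker = True        # INSERT also writes a row marker

    def update(self, **cells):
        self.cells.update(cells)  # UPDATE writes only the cells

    def delete_cell(self, name):
        self.cells.pop(name, None)

    def exists(self):
        # A row is live if it has a row marker or at least one live cell.
        return self.marker or bool(self.cells)

r1, r2 = Row(), Row()
r1.update(v=1); r1.delete_cell("v")  # UPDATE then DELETE: the row disappears
r2.insert(v=1); r2.delete_cell("v")  # INSERT then DELETE: an empty row remains
```

In this model, deleting the last cell of an UPDATE-created row leaves nothing live, while the same deletion on an INSERT-created row leaves the marker behind, which is exactly the observable difference the documentation should describe.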
[jira] [Created] (CASSANDRA-14262) View update sent multiple times during range movement
Nadav Har'El created CASSANDRA-14262:
-------------------------------------

             Summary: View update sent multiple times during range movement
                 Key: CASSANDRA-14262
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14262
             Project: Cassandra
          Issue Type: Improvement
          Components: Materialized Views
            Reporter: Nadav Har'El

This issue is about updating a base table with materialized views while token ranges are being moved, i.e., while a node is being added to or removed from the cluster (this is a long process, because the data needs to be streamed to its new owning node). During this process, each view mutation we want to write to a view table may have an additional "pending node" (or several of them) - another node (or nodes) which will hold this view mutation - and we need to send the view mutations to these new nodes too. This code existed until CASSANDRA-13069, when it was accidentally removed, and returned in CASSANDRA-14251.

However, the current code, in mutateMV(), has each of the RF (e.g., 3) base replicas send the view mutation to the same pending node. This is of course redundant, and reduces write throughput while the streaming is performed. I suggested (based on an idea by [~shlomi_livne]) that it may be enough for only the single node which will be paired (when the range movement completes) with the pending node to send it the update. [~pauloricardomg] replied (see [https://lists.apache.org/thread.html/12c78582a3f709ca33a45e5fa6121148b1b1ad9c9b290d1a21e4409b@%3Cdev.cassandra.apache.org%3E]) that it appears such an optimization would work in the common case of single movements, but will not work in rarer, more complex cases (I did not fully understand the details; check out the above link for them).

I believe there's another problem with the current code, which is one of correctness: if any view replica ends up with two different view rows for the same partition key, such a mistake cannot currently be fixed (see CASSANDRA-10346). But if we have two different base replicas with two different values (an inconsistency that an ordinary base repair could fix, if we ran it), and both of them send their update to the same pending view replica, this view replica will now have two rows, one of them wrong (and it cannot currently be repaired).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
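The suggested optimization - have only one base replica forward each view mutation to a pending node - can be sketched as follows. This is a hypothetical helper, not Cassandra's mutateMV(); the positional base/view pairing and the choice of which replica owns the pending node are assumptions made purely for illustration:

```python
# Hypothetical sketch: each base replica sends to its positionally paired
# view replica; only one designated base replica (here, assumed to be the
# first in the list) additionally forwards to the pending view replica.
def view_targets(base_replicas, view_replicas, pending_view, self_addr):
    i = base_replicas.index(self_addr)
    targets = [view_replicas[i]]          # the natural paired view replica
    # Current code: every base replica also sends to the pending node, so it
    # receives RF copies. Proposed: only the paired replica sends to it.
    if pending_view is not None and i == 0:
        targets.append(pending_view)
    return targets
```

With RF=3, the current behavior delivers three copies of each view mutation to the pending node; under this sketch it receives exactly one, which is the throughput saving the issue asks for (subject to the complex-movement caveats raised in the linked thread).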
[jira] [Commented] (CASSANDRA-10728) Hash used in repair does not include partition key
[ https://issues.apache.org/jira/browse/CASSANDRA-10728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15020906#comment-15020906 ]

Nadav Har'El commented on CASSANDRA-10728:
------------------------------------------

Identical values, yes, but not identical keys...

> Hash used in repair does not include partition key
> --------------------------------------------------
>
>                 Key: CASSANDRA-10728
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10728
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nadav Har'El
>            Priority: Minor
>
> When the repair code builds the Merkle tree, it appears to be using AbstractCompactedRow.update() to calculate a partition's hash. This method's documentation states that it calculates a "digest with the data bytes of the row (not including row key or row size)". The code itself seems to agree with this comment.
> However, I believe that not including the row (actually, partition) key in the hash function is a mistake: this means that if two nodes have the same data but different keys, repair would not notice this discrepancy. Moreover, if two different keys have their data switched - or have the same data - again this would not be noticed by repair. Actually running across this problem in a real repair is not very likely, but I can imagine seeing it easily in a hypothetical use case where all partitions have exactly the same data and just the partition key matters.
> I am sorry if I'm mistaken and the partition key is actually taken into account in the Merkle tree, but I tried to find evidence that it is and failed. Glancing over the code, it almost seems that it does use the key: Validator.add() calculates rowHash(), which includes the digest (without the partition key) *and* the key's token. But then, the code calls MerkleTree.TreeRange.addHash() on that tuple, and that function conspicuously ignores the token and only uses the digest.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (CASSANDRA-10728) Hash used in repair does not include partition key
Nadav Har'El created CASSANDRA-10728:
-------------------------------------

             Summary: Hash used in repair does not include partition key
                 Key: CASSANDRA-10728
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10728
             Project: Cassandra
          Issue Type: Bug
            Reporter: Nadav Har'El
            Priority: Minor

When the repair code builds the Merkle tree, it appears to be using AbstractCompactedRow.update() to calculate a partition's hash. This method's documentation states that it calculates a "digest with the data bytes of the row (not including row key or row size)". The code itself seems to agree with this comment.

However, I believe that not including the row (actually, partition) key in the hash function is a mistake: this means that if two nodes have the same data but different keys, repair would not notice this discrepancy. Moreover, if two different keys have their data switched - or have the same data - again this would not be noticed by repair. Actually running across this problem in a real repair is not very likely, but I can imagine seeing it easily in a hypothetical use case where all partitions have exactly the same data and just the partition key matters.

I am sorry if I'm mistaken and the partition key is actually taken into account in the Merkle tree, but I tried to find evidence that it is and failed. Glancing over the code, it almost seems that it does use the key: Validator.add() calculates rowHash(), which includes the digest (without the partition key) *and* the key's token. But then, the code calls MerkleTree.TreeRange.addHash() on that tuple, and that function conspicuously ignores the token and only uses the digest.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
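The concern above can be demonstrated with a toy range digest (a sketch, not Cassandra's Validator/MerkleTree code): if the per-partition digest covers only the cell data, two replicas holding the same value under different keys produce identical range digests, so repair sees them as in sync.

```python
import hashlib

# Toy per-range digest: hash each partition in key order. 'include_key'
# controls whether the partition key participates in the digest, which is
# exactly the point the issue raises about AbstractCompactedRow.update().
def range_digest(partitions, include_key):
    h = hashlib.sha256()
    for key in sorted(partitions):
        if include_key:
            h.update(key.encode())
        h.update(partitions[key].encode())
    return h.hexdigest()

replica1 = {"a": "1"}  # partition 'a' holds value "1"
replica2 = {"b": "1"}  # partition 'b' holds the same value
# Without the key in the digest, the replicas look identical to repair;
# with the key included, the discrepancy is detected.
```

This mirrors the "all partitions have exactly the same data" use case from the issue: the per-partition digests collide, and only folding the key (or its token) into the hash would distinguish the replicas.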
[jira] [Commented] (CASSANDRA-10728) Hash used in repair does not include partition key
[ https://issues.apache.org/jira/browse/CASSANDRA-10728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1508#comment-1508 ]

Nadav Har'El commented on CASSANDRA-10728:
------------------------------------------

What if in one replica we have partition 'a' with value 1 and some timestamp, and in a second replica we have partition 'b' with value 1 and the same timestamp - and 'a' and 'b' happen to be close enough in their tokens to be in the same Merkle tree partition range? I realize this is a very unlikely case (especially considering the need for the timestamps to be identical, which I hadn't considered before), but it seems it's possible...

For example, consider a contrived use case which uses Cassandra to store a large set of keys, where the value of each key is always set to "1" (or whatever). Now, at exactly the same time (at millisecond resolution, which is Cassandra's default), two servers want to write two different keys "a" and "b" - and because of a partition in the cluster, "a" ends up on one machine and "b" on a second machine - and both writes get the same timestamp (at millisecond resolution, this is not completely improbable).

> Hash used in repair does not include partition key
> --------------------------------------------------
>
>                 Key: CASSANDRA-10728
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10728
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nadav Har'El
>            Priority: Minor
>
> When the repair code builds the Merkle tree, it appears to be using AbstractCompactedRow.update() to calculate a partition's hash. This method's documentation states that it calculates a "digest with the data bytes of the row (not including row key or row size)". The code itself seems to agree with this comment.
> However, I believe that not including the row (actually, partition) key in the hash function is a mistake: this means that if two nodes have the same data but different keys, repair would not notice this discrepancy. Moreover, if two different keys have their data switched - or have the same data - again this would not be noticed by repair. Actually running across this problem in a real repair is not very likely, but I can imagine seeing it easily in a hypothetical use case where all partitions have exactly the same data and just the partition key matters.
> I am sorry if I'm mistaken and the partition key is actually taken into account in the Merkle tree, but I tried to find evidence that it is and failed. Glancing over the code, it almost seems that it does use the key: Validator.add() calculates rowHash(), which includes the digest (without the partition key) *and* the key's token. But then, the code calls MerkleTree.TreeRange.addHash() on that tuple, and that function conspicuously ignores the token and only uses the digest.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)