Re: [EXTERNAL] Two separate rows for the same partition !!
The issue is fixed with nodetool scrub; both rows are now under the same clustering.
I'll open a JIRA to analyze the source of this issue with Cassandra 3.11.3.

Thanks.

On Thu, May 16, 2019 at 04:53, Jeff Jirsa wrote:
> I don’t have a good answer for you - I don’t know if scrub will fix this
> (you could copy an sstable offline and try it locally in ccm) - you may
> need to delete and reinsert, though I’m really interested in knowing how
> this happened if you weren’t ever exposed to #14008.
>
> Can you open a JIRA? If your sstables aren’t especially sensitive,
> uploading them would be swell. Otherwise, an anonymized JSON dump may be
> good enough for whichever developer looks at fixing this.
>
> --
> Jeff Jirsa
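For readers who want to script the fix reported above, here is a minimal sketch of invoking `nodetool scrub` from Python. The keyspace and table names are hypothetical placeholders, and the sketch assumes `nodetool` is on the PATH of the node being scrubbed. It is not the exact command used in this thread, which does not show the invocation.

```python
import subprocess


def scrub_command(keyspace, table):
    """Build the argument list for `nodetool scrub <keyspace> <table>`."""
    return ["nodetool", "scrub", keyspace, table]


def run_scrub(keyspace, table):
    """Run the scrub on the local node and raise if nodetool exits non-zero."""
    subprocess.run(scrub_command(keyspace, table), check=True)


# Hypothetical names, for illustration only:
# run_scrub("my_keyspace", "my_table")
```

As Jeff suggests in the quoted message, trying this first on a copy of the sstable in a local ccm cluster is the safer route.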
Re: [EXTERNAL] Two separate rows for the same partition !!
I don’t have a good answer for you - I don’t know if scrub will fix this (you could copy an sstable offline and try it locally in ccm) - you may need to delete and reinsert, though I’m really interested in knowing how this happened if you weren’t ever exposed to #14008.

Can you open a JIRA? If your sstables aren’t especially sensitive, uploading them would be swell. Otherwise, an anonymized JSON dump may be good enough for whichever developer looks at fixing this.

--
Jeff Jirsa

On May 15, 2019, at 7:27 PM, Ahmed Eljami wrote:
> Jeff, in this case is there any way to resolve this directly in the
> sstable (compact, scrub...), or do we have to apply a batch at the client
> level (delete the partition and rewrite it)?
>
> Thank you for your reply.
Re: [EXTERNAL] Two separate rows for the same partition !!
Jeff, in this case is there any way to resolve this directly in the sstable (compact, scrub...), or do we have to apply a batch at the client level (delete the partition and rewrite it)?

Thank you for your reply.

On Wed, May 15, 2019 at 18:09, Ahmed Eljami wrote:
> Indeed, this was written in 2.1.14 and we upgraded to 3.11.3, so we
> should not be impacted by this issue?!
> Thanks
Re: [EXTERNAL] Two separate rows for the same partition !!
Indeed, this was written in 2.1.14 and we upgraded to 3.11.3, so we should not be impacted by this issue?! Thanks
Re: [EXTERNAL] Two separate rows for the same partition !!
https://issues.apache.org/jira/browse/CASSANDRA-14008

If this was written in 2.1/2.2 and you upgraded to 3.0.x (x < 16) or 3.1-3.11.1, it could be this issue.

--
Jeff Jirsa

On May 15, 2019, at 8:43 AM, Ahmed Eljami wrote:
> What about this part of the dump:
>
>     "type" : "row",
>     "position" : 4123,
>     "clustering" : [ "", "Token", "abcd", "" ],
>     "cells" : [
>       { "name" : "dvalue", "value" : "", "tstamp" : "2019-04-26T17:20:39.910Z", "ttl" : 31708792, "expires_at" : "2020-04-27T17:20:31Z", "expired" : false }
>
> Why don't we have a liveness_info for this row?
>
> Thanks
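The version ranges Jeff gives for CASSANDRA-14008 can be turned into a quick check. This is a minimal sketch that encodes only the ranges as stated in this thread (3.0.x with x < 16, or 3.1 through 3.11.1). It assumes plain numeric version strings, and the function names are made up for illustration.

```python
def parse_version(text):
    """Turn '3.11.3' into (3, 11, 3), padding missing parts with zeros."""
    parts = [int(piece) for piece in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def in_range_stated_for_14008(version):
    """True if the version falls in the ranges quoted in the thread:
    3.0.x with x < 16, or 3.1 through 3.11.1 inclusive."""
    v = parse_version(version)
    if v[:2] == (3, 0):
        return v[2] < 16
    return (3, 1, 0) <= v <= (3, 11, 1)


for candidate in ["2.1.14", "3.0.15", "3.0.16", "3.11.1", "3.11.3"]:
    print(candidate, in_range_stated_for_14008(candidate))
```

Note that the check applies to every version the data passed through after 2.1/2.2, not only the final one, since Jeff's question is whether the cluster was ever exposed to the bug.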
Re: [EXTERNAL] Two separate rows for the same partition !!
What about this part of the dump:

    "type" : "row",
    "position" : 4123,
    "clustering" : [ "", "Token", "abcd", "" ],
    "cells" : [
      { "name" : "dvalue", "value" : "", "tstamp" : "2019-04-26T17:20:39.910Z", "ttl" : 31708792, "expires_at" : "2020-04-27T17:20:31Z", "expired" : false }

Why don't we have a *liveness_info* for this row?

Thanks

On Wed, May 15, 2019 at 17:40, Ahmed Eljami wrote:
> Hi Sean,
> Thanks for the reply.
> I agree with you about uniqueness, but the output of sstabledump
> shows that we have the same value for the column g => "clustering" : [
> "", "Token", "abcd", "" ],
> and when we select with the whole primary key, using the values I see
> in the sstable, cqlsh returns 2 rows.
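The two observations above (two row entries with the same clustering in one partition, one of them without `liveness_info`) can be located mechanically in `sstabledump` output. The following is a minimal sketch, assuming the dump has been parsed as JSON in the layout shown in this thread. The function names are hypothetical, and the sample is trimmed from the posted dump. The script only finds such rows and does not explain why `liveness_info` is missing.

```python
import json
from collections import defaultdict


def rows_by_clustering(partition):
    """Group the 'row' entries of one dumped partition by clustering value."""
    groups = defaultdict(list)
    for entry in partition.get("rows", []):
        if entry.get("type") != "row":
            continue  # skip range tombstone bounds and other markers
        groups[tuple(entry["clustering"])].append(entry)
    return groups


def report_duplicates(dump):
    """Return (partition key, clustering, [(position, has_liveness_info), ...])
    for every clustering value that appears on more than one row entry."""
    findings = []
    for partition in dump:
        key = tuple(partition["partition"]["key"])
        for clustering, rows in rows_by_clustering(partition).items():
            if len(rows) > 1:
                details = [(r.get("position"), "liveness_info" in r) for r in rows]
                findings.append((key, clustering, details))
    return findings


# Sample trimmed from the dump posted in this thread (cells shortened).
SAMPLE = json.loads("""
[ { "partition" : { "key" : [ "", "bbb", "rrr" ], "position" : 3760 },
    "rows" : [
      { "type" : "row", "position" : 3974,
        "clustering" : [ "", "Token", "abcd", "" ],
        "liveness_info" : { "tstamp" : "2019-04-26T17:20:39.910Z" },
        "cells" : [ { "name" : "connected", "value" : false } ] },
      { "type" : "row", "position" : 4123,
        "clustering" : [ "", "Token", "abcd", "" ],
        "cells" : [ { "name" : "dvalue", "value" : "" } ] }
    ] } ]
""")

for finding in report_duplicates(SAMPLE):
    print(finding)
```

For a real sstable, the input would come from saving the `sstabledump` output to a file and loading it with `json.load`. The comparison uses the values as rendered in the dump, so clustering values that differ only in a way the dump does not show would still be reported as equal.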
Re: [EXTERNAL] Two separate rows for the same partition !!
Hi Sean,
Thanks for the reply.
I agree with you about uniqueness, but the output of sstabledump shows that we have the same value for the column g => "clustering" : [ "", "Token", "abcd", "" ], and when we select with the whole primary key, using the values I see in the sstable, cqlsh returns 2 rows.

On Wed, May 15, 2019 at 17:27, Durity, Sean R wrote:
> Uniqueness is determined by the partition key PLUS the clustering columns.
> Hard to tell from your data below, but is it possible that one of the
> clustering columns (perhaps g) has different values? That would easily
> explain the 2 rows returned - because they ARE different rows in the same
> partition. In your data model, make sure you need all the clustering
> columns to determine uniqueness or you will indeed have more rows than you
> might expect.
>
> Sean Durity
RE: [EXTERNAL] Two separate rows for the same partition !!
Uniqueness is determined by the partition key PLUS the clustering columns. Hard to tell from your data below, but is it possible that one of the clustering columns (perhaps g) has different values? That would easily explain the 2 rows returned - because they ARE different rows in the same partition. In your data model, make sure you need all the clustering columns to determine uniqueness or you will indeed have more rows than you might expect.

Sean Durity

From: Ahmed Eljami
Sent: Wednesday, May 15, 2019 10:56 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Two separate rows for the same partition !!

Hi guys,

We have a strange problem with the data in Cassandra: after inserting the same partition twice with different columns, we see that Cassandra returns 2 rows in cqlsh rather than one:

a| b| c| d| f| g| h| i| j| k| l
--++---+--+---+-++---+--++
|bbb| rrr| | Token | abcd|| False | {'expiration': '1557943260838', 'fname': 'WS', 'freshness': '1556299239910'} | null | null
|bbb| rrr| | Token | abcd|| null | null | | null

With the primary key = PRIMARY KEY ((a, b, c), d, e, f, g)

On the sstable we have the following data:

[
  {
    "partition" : {
      "key" : [ "", "bbb", "rrr" ],
      "position" : 3760
    },
    "rows" : [
      {
        "type" : "range_tombstone_bound",
        "start" : {
          "type" : "inclusive",
          "clustering" : [ "", "Token", "abcd", "*" ],
          "deletion_info" : { "marked_deleted" : "2019-04-26T17:20:39.909Z", "local_delete_time" : "2019-04-26T17:20:39Z" }
        }
      },
      {
        "type" : "range_tombstone_bound",
        "end" : {
          "type" : "exclusive",
          "clustering" : [ "", "Token", "abcd", "" ],
          "deletion_info" : { "marked_deleted" : "2019-04-26T17:20:39.909Z", "local_delete_time" : "2019-04-26T17:20:39Z" }
        }
      },
      {
        "type" : "row",
        "position" : 3974,
        "clustering" : [ "", "Token", "abcd", "" ],
        "liveness_info" : { "tstamp" : "2019-04-26T17:20:39.910Z", "ttl" : 31708792, "expires_at" : "2020-04-27T17:20:31Z", "expired" : false },
        "cells" : [
          { "name" : "connected", "value" : false },
          { "name" : "dattrib", "deletion_info" : { "marked_deleted" : "2019-04-26T17:20:39.90Z", "local_delete_time" : "2019-04-26T17:20:39Z" } },
          { "name" : "dattrib", "path" : [ "expiration" ], "value" : "1557943260838" },
          { "name" : "dattrib", "path" : [ "fname" ], "value" : "WS" },
          { "name" : "dattrib", "path" : [ "freshness" ], "value" : "1556299239910" }
        ]
      },
      {
        "type" : "row",
        "position" : 4123,
        "clustering" : [ "", "Token", "abcd", "" ],
        "cells" : [
          { "name" : "dvalue", "value" : "", "tstamp" : "2019-04-26T17:20:39.910Z", "ttl" : 31708792, "expires_at" : "2020-04-27T17:20:31Z", "expired" : false }
        ]
      },
      {
        "type" : "range_tombstone_bound",
        "start" : {
          "type" : "exclusive",
          "clustering" : [ "", "Token", "abcd", "" ],
          "deletion_info" : { "marked_deleted" : "2019-04-26T17:20:39.909Z", "local_delete_time" : "2019-04-26T17:20:39Z" }
        }
      },
      {
        "type" : "range_tombstone_bound",
        "end" : {
          "type" : "inclusive",
          "clustering" : [ "&