Re: CASSANDRA-14227 removing the 2038 limit

2023-04-02 Thread Berenguer Blasi
Hi all, assuming lazy consensus here. Regards. On 22/3/23 15:55, Berenguer Blasi wrote: Hi all, 14227 has undergone review and perf numbers look ok. Now I have to tackle the downgradability issue and hopefully then merge. This is what I have gathered from the many conversations, please help

Re: CASSANDRA-14227 removing the 2038 limit

2023-03-22 Thread Berenguer Blasi
Hi all, 14227 has undergone review and perf numbers look ok. Now I have to tackle the downgradability issue and hopefully then merge. This is what I have gathered from the many conversations; please let me know if this is correct or if I am missing something: - Everything will be based of

Re: CASSANDRA-14227 removing the 2038 limit

2023-02-03 Thread Henrik Ingo
In that case I agree that increasing from 20 years is an interesting opportunity but clearly out of scope for your current ticket. On Fri, Feb 3, 2023 at 3:48 PM Berenguer Blasi wrote: > Hi, > > 20y is the current and historic value. 68y is what an integer can > accommodate hence the current 203

Re: CASSANDRA-14227 removing the 2038 limit

2023-02-03 Thread Berenguer Blasi
Hi, 20y is the current and historic value. 68y is what a signed 32-bit integer can accommodate, hence the current 2038 limit since the 1970 Unix epoch. I wouldn't make it a configurable value: off the top of my head, it would make for some interesting bugs and debugging sessions when nodes had different val
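As a quick check of the arithmetic behind the 68y figure: a signed 32-bit integer holds at most 2^31 - 1 seconds, which, counted from the 1970 Unix epoch, runs out in January 2038. The sketch below is illustrative only; the class and method names are hypothetical and are not taken from the Cassandra codebase.

```java
import java.time.Instant;

// Hypothetical helper, not part of Cassandra: shows where a signed 32-bit
// seconds-since-epoch counter runs out.
final class SignedEpochLimit {
    // Average Gregorian year in seconds (365.2425 days).
    static final double SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60;

    // Last instant representable as a signed 32-bit count of epoch seconds.
    static Instant lastRepresentableInstant() {
        return Instant.ofEpochSecond(Integer.MAX_VALUE);
    }

    // Width of the representable range in years (a little over 68).
    static double yearsOfRange() {
        return Integer.MAX_VALUE / SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        System.out.println(lastRepresentableInstant()); // 2038-01-19T03:14:07Z
        System.out.println(yearsOfRange());
    }
}
```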

Re: CASSANDRA-14227 removing the 2038 limit

2023-02-03 Thread Henrik Ingo
Naive PHB questions to follow... Why are 68y and 20y special? Could you pick any value? Could we allow it to be configurable? (Last one probably overkill, just asking to understand...) If we can pick any values we want, instinctively I would personally suggest to have TTL higher than 20 years, bu

Re: CASSANDRA-14227 removing the 2038 limit

2023-02-03 Thread Berenguer Blasi
Hi All, a version using Uints, 20y max TTL and kicking the can down the road until 2086 has been put up for review #justfyi Regards On 15/11/22 7:06, Berenguer Blasi wrote: Hi all, thanks for your answers! To Benedict's point: In terms of the uvint encoding of deletionTime i.e. it is t
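One plausible reading of the 2086 figure: an unsigned 32-bit count of epoch seconds overflows in 2106, and a write carrying the maximum 20y TTL must expire before that point, so such writes stop fitting about 20 years earlier. The sketch below works through that arithmetic. The class name, the method names and the definition of 20 years as 20 * 365 days are assumptions made for illustration, not code from the patch.

```java
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical helper, not part of Cassandra: arithmetic for an unsigned
// 32-bit seconds-since-epoch counter combined with a 20-year maximum TTL.
final class UnsignedEpochLimit {
    static final long UINT_MAX_SECONDS = 0xFFFFFFFFL; // 2^32 - 1

    // Assumption: 20 years counted as 20 * 365 days.
    static final long MAX_TTL_SECONDS = 20L * 365 * 24 * 60 * 60;

    // Year in which an unsigned 32-bit epoch-seconds counter overflows.
    static int overflowYear() {
        return Instant.ofEpochSecond(UINT_MAX_SECONDS).atZone(ZoneOffset.UTC).getYear();
    }

    // Year of the last moment at which a write with the maximum TTL still
    // has an expiration time that fits in the unsigned counter.
    static int lastSafeWriteYear() {
        return Instant.ofEpochSecond(UINT_MAX_SECONDS - MAX_TTL_SECONDS)
                      .atZone(ZoneOffset.UTC)
                      .getYear();
    }

    // Java has no unsigned int type; a stored int is widened on read.
    static long widen(int stored) {
        return Integer.toUnsignedLong(stored);
    }

    public static void main(String[] args) {
        System.out.println(overflowYear());      // 2106
        System.out.println(lastSafeWriteYear()); // 2086
    }
}
```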

Re: CASSANDRA-14227 removing the 2038 limit

2022-11-14 Thread Berenguer Blasi
Hi all, thanks for your answers! To Benedict's point: In terms of the uvint encoding of deletionTime i.e. it is true it happens here https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/SerializationHeader.java#L170. But we also have a DeletionTime serializer here

Re: CASSANDRA-14227 removing the 2038 limit

2022-11-14 Thread Josh McKenzie
> in 2035 we'd hit the same problem again. In terms of "kicking a can down the road", this would be a pretty vigorous kick. I wouldn't push back against this deferral. :) On Mon, Nov 14, 2022, at 9:28 AM, Benedict wrote: > > I’m confused why we see *any* increase in sstable size - TTLs and delet

Re: CASSANDRA-14227 removing the 2038 limit

2022-11-14 Thread Benedict
I’m confused why we see *any* increase in sstable size - TTLs and deletion times are already written as unsigned vints as offsets from an sstable epoch for each value. I would dig in more carefully to explore why you’re seeing this increase. For the same data there should be no change to size o

Re: CASSANDRA-14227 removing the 2038 limit

2022-11-13 Thread C. Scott Andreas
A 2-3% increase in storage volume is roughly equivalent to giving up the gain from LZ4 -> LZ4HC, or a one to two-level bump in Zstandard compression levels. This regression could be very expensive for storage-bound use cases. From the perspective of storage overhead, the unsigned int approach sounds

Re: CASSANDRA-14227 removing the 2038 limit

2022-11-13 Thread Berenguer Blasi
Hi all, We have done some more research on c14227. The current patch for CASSANDRA-14227 solves the TTL limit issue by switching TTL to long instead of int. This approach does not have a negative impact on memtable memory usage, as C* controls the memory used by the Memtable, but based on ou

Re: CASSANDRA-14227 removing the 2038 limit

2022-10-18 Thread Berenguer Blasi
Hi, apologies for the late reply as I have been OOO. I have done some profiling and results look virtually identical on trunk and 14227. I have attached some screenshots to the ticket https://issues.apache.org/jira/browse/CASSANDRA-14227. Unless my eyes are fooling me everything in the jfrs l

Re: CASSANDRA-14227 removing the 2038 limit

2022-09-30 Thread Berenguer Blasi
Hi Benedict, thanks for the reply! Yes some profiling is probably needed, then we can see if going down the delta encoding big refactor rabbit hole is worth it? Let's see what other concerns people bring up. Thx. On 29/9/22 11:12, Benedict Elliott Smith wrote: My only slight concern with th

Re: CASSANDRA-14227 removing the 2038 limit

2022-09-29 Thread Benedict Elliott Smith
My only slight concern with this approach is the additional memory pressure. Since 64yrs should be plenty at any moment in time, I wonder if it wouldn’t be better to represent these times as deltas from the nowInSec being used to process the query. So, long math would only be used to normalise
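A minimal sketch of the delta idea described above, assuming times are kept in memory as int offsets from the query's nowInSec and widened to long only at the boundary. The class, its method names and the rule that a delta of zero or less means expired are hypothetical choices for illustration; they are not taken from the ticket or the codebase.

```java
// Hypothetical illustration of storing times as int deltas from nowInSec,
// so that long arithmetic is needed only when normalising.
final class DeltaTimes {
    // Convert an absolute epoch-seconds value to an int delta from nowInSec.
    // Throws ArithmeticException if the delta does not fit in an int.
    static int toDelta(long absoluteSeconds, long nowInSec) {
        return Math.toIntExact(absoluteSeconds - nowInSec);
    }

    // Recover the absolute epoch-seconds value from a delta.
    static long toAbsolute(int delta, long nowInSec) {
        return nowInSec + delta;
    }

    // Assumption: a value whose expiration is at or before nowInSec is expired.
    static boolean isExpired(int delta) {
        return delta <= 0;
    }

    public static void main(String[] args) {
        long nowInSec = 2_200_000_000L;            // a moment after January 2038
        long expiry = nowInSec + 630_720_000L;     // now plus 20 * 365 days
        int delta = toDelta(expiry, nowInSec);
        System.out.println(delta);                       // 630720000
        System.out.println(toAbsolute(delta, nowInSec)); // 2830720000
    }
}
```

Because each delta is relative to the moment of the query, absolute times beyond 2038 never need to fit in an int; only the distance from nowInSec does.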

CASSANDRA-14227 removing the 2038 limit

2022-09-29 Thread Berenguer Blasi
Hi all, I have taken a stab at this in a PR you can find attached to the ticket. Mainly: - I have moved deletion times, gc and nowInSec timestamps to long. That should get us past the 2038 limit. - TTL is now capped at 68y. Think CQL API compatibility and a sort of 'free' guardrail. - A new NONE