Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)
> I may be wrong here, but the CEP directly calls out making this api public
> for people who wish to replace the SSTable format

I don’t think this implies API stability. For starters, it doesn’t stipulate that these implementations will be supported out of tree (the only one I’m aware of, so far as I understand, is intended to be incubated in tree), nor does an API for external usage have to be stable. It’s fine to create an API and tell users it’s unstable, and that they should closely monitor patch version changes if they use it.

That said, norms may be changing around what can go into patch releases anyhow, so this may be a lot of noise about nothing. If all new development goes into trunk, then it’s all moot. But I don’t think we can make hard assumptions about that today, as historically these sorts of intentions haven’t lasted.

I’m fairly against the idea of introducing hard restrictions on this, and potentially ossifying the codebase. I’m not keen to even consider out of tree consumers of these APIs in any way, for compatibility, upgradeability or anything. There’s a lot that needs to be done over the coming years to improve the internal structure of the project, and unduly entrenching the current state of affairs could do great harm to these efforts to modularise the codebase.

From: David Capwell
Date: Tuesday, 9 November 2021 at 23:38
To: dev@cassandra.apache.org
Subject: Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)

> My understanding is that the only interface that is expected to be stable for
> external consumers is the secondary index API

I may be wrong here, but the CEP directly calls out making this api public for people who wish to replace the SSTable format ("Cassandra developers who want to develop and publish different file format implementations."), so if we need to support 2i API, why would we not support SSTable API as well?
Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)
> My understanding is that the only interface that is expected to be stable for
> external consumers is the secondary index API

I may be wrong here, but the CEP directly calls out making this api public for people who wish to replace the SSTable format ("Cassandra developers who want to develop and publish different file format implementations."), so if we need to support the 2i API, why would we not support the SSTable API as well?

> All of the other mentioned APIs are in my opinion for internal usage only

This gets back to my point: it is currently tribal knowledge what needs to work and what doesn’t, and unless the broader set of committers knows this, the likelihood that any new API will break in a minor is high.

> On Nov 9, 2021, at 12:13 PM, bened...@apache.org wrote:
>
> I agree that we don’t need to block the CEP on this, and that we should have
> that discussion. But it’s worth noting that the CEP should not anticipate or
> depend on any specific outcome of that discussion.
>
> Since it is somewhat relevant for this discussion, my view is that no
> interface should be assumed to be stable without the prior explicit agreement
> of the community.
>
> My understanding is that the only interface that is expected to be stable for
> external consumers is the secondary index API. Perhaps also snitches? But
> also perhaps not, as the difficulty of upgrading these at the same time is
> pretty low for custom snitches. All of the other mentioned APIs are in my
> opinion for internal usage only, so users should not assume compile time
> compatibility across any release, and I am certain we have never tried to
> maintain this. This still facilitates forks of course, by localising the
> compatibility work.
>
> From: Jeremiah D Jordan
> Date: Tuesday, 9 November 2021 at 19:43
> To: Cassandra DEV
> Subject: Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)
>
> I would love to have this discussion and set up annotations or similar to
> formalize things.
Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)
I agree that we don’t need to block the CEP on this, and that we should have that discussion. But it’s worth noting that the CEP should not anticipate or depend on any specific outcome of that discussion.

Since it is somewhat relevant for this discussion, my view is that no interface should be assumed to be stable without the prior explicit agreement of the community.

My understanding is that the only interface that is expected to be stable for external consumers is the secondary index API. Perhaps also snitches? But also perhaps not, as the difficulty of upgrading these at the same time is pretty low for custom snitches. All of the other mentioned APIs are in my opinion for internal usage only, so users should not assume compile time compatibility across any release, and I am certain we have never tried to maintain this. This still facilitates forks of course, by localising the compatibility work.

From: Jeremiah D Jordan
Date: Tuesday, 9 November 2021 at 19:43
To: Cassandra DEV
Subject: Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)

I would love to have this discussion and set up annotations or similar to formalize things. I just do not think we need to hold up any CEPs to do so. That discussion should possibly be a CEP of its own proposing how we want to formalize interfaces? I would be happy to go through and try to put together something for that or, since you feel so strongly about it, maybe you want to, David? At the very least it should get its own DISCUSS thread and then be written up in the wiki.

-Jeremiah

> On Nov 9, 2021, at 1:06 PM, Joshua McKenzie wrote:
>
>> trunk -> anything goes, not trunk -> try not to change these interfaces
>
> Have we ever clarified what "these interfaces" are? Was just talking to
> David and I realized I didn't even JavaDoc CommitLogReadHandler as _being
> designed_ for external usage.
Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)
> I would be happy to go through and try to put together something for that ...
> At the very least it should get its own DISCUSS thread and then be written
> up in the wiki.

+1. Thanks.

> On Nov 9, 2021, at 11:43 AM, Jeremiah D Jordan wrote:
>
> I would love to have this discussion and setup annotations or similar to
> formalize things. I just do not think we need to hold any up CEPs to do so.
> That discussion should possibly be a CEP of its own proposing how we want to
> formalize interfaces? I would be happy to go through and try to put together
> something for that or since you feel so strongly about it maybe you want to
> David? At the very least it should get its own DISCUSS thread and then be
> written up in the wiki.
>
> -Jeremiah
>
>> On Nov 9, 2021, at 1:06 PM, Joshua McKenzie wrote:
>>
>>> trunk -> anything goes, not trunk -> try not to change these interfaces
>>
>> Have we ever clarified what "these interfaces" are? Was just talking to
>> David and I realized I didn't even JavaDoc CommitLogReadHandler as _being
>> designed_ for external usage. /sigh
>>
>> I think it'd be valuable for us to go through the codebase and annotate
>> interfaces as intended to be exposed to 3rd parties; this has bothered me
>> for years. Especially as we come up on a large number of new cleanups,
>> refactorings, and potentially genericizing some subsystems into API's
>> (CEP-18 descendents).
>>
>> On Tue, Nov 9, 2021 at 2:01 PM David Capwell wrote:
>>
>>>> We already have many interfaces similar to these for Compaction
>>>> Strategy, Indexing, Query Handler.
>>>
>>> Today-I-Learned QueryHandler is not allowed to be touched in a minor… good
>>> to know…
>>>
>>>> not trunk -> try not to change these interfaces
>>>
>>> Outside of MBeans, I honestly do not know what interfaces fall into this
>>> group; and for MBeans we have tests which block breaking changes.
Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)
I would love to have this discussion and set up annotations or similar to formalize things. I just do not think we need to hold up any CEPs to do so. That discussion should possibly be a CEP of its own proposing how we want to formalize interfaces? I would be happy to go through and try to put together something for that or, since you feel so strongly about it, maybe you want to, David? At the very least it should get its own DISCUSS thread and then be written up in the wiki.

-Jeremiah

> On Nov 9, 2021, at 1:06 PM, Joshua McKenzie wrote:
>
>> trunk -> anything goes, not trunk -> try not to change these interfaces
>
> Have we ever clarified what "these interfaces" are? Was just talking to
> David and I realized I didn't even JavaDoc CommitLogReadHandler as _being
> designed_ for external usage. /sigh
>
> I think it'd be valuable for us to go through the codebase and annotate
> interfaces as intended to be exposed to 3rd parties; this has bothered me
> for years. Especially as we come up on a large number of new cleanups,
> refactorings, and potentially genericizing some subsystems into APIs
> (CEP-18 descendants).
>
> On Tue, Nov 9, 2021 at 2:01 PM David Capwell wrote:
>
>>> We already have many interfaces similar to these for Compaction
>>> Strategy, Indexing, Query Handler.
>>
>> Today-I-Learned QueryHandler is not allowed to be touched in a minor… good
>> to know…
>>
>>> not trunk -> try not to change these interfaces
>>
>> Outside of MBeans, I honestly do not know what interfaces fall into this
>> group; and for MBeans we have tests which block breaking changes. The
>> point I am making is that not everyone is aware of the rules, so having
>> something in place to help enforce such rules should be thought about; if
>> we want to add pluggable hooks with the intent that external parties can
>> leverage such hooks, we should also add to the scope the maintenance of
>> these interfaces (we should not assume “tribal knowledge” will work).
Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)
> trunk -> anything goes, not trunk -> try not to change these interfaces

Have we ever clarified what "these interfaces" are? Was just talking to David and I realized I didn't even JavaDoc CommitLogReadHandler as _being designed_ for external usage. /sigh

I think it'd be valuable for us to go through the codebase and annotate interfaces as intended to be exposed to 3rd parties; this has bothered me for years. Especially as we come up on a large number of new cleanups, refactorings, and potentially genericizing some subsystems into APIs (CEP-18 descendants).

On Tue, Nov 9, 2021 at 2:01 PM David Capwell wrote:

>> We already have many interfaces similar to these for Compaction
>> Strategy, Indexing, Query Handler.
>
> Today-I-Learned QueryHandler is not allowed to be touched in a minor… good
> to know…
>
>> not trunk -> try not to change these interfaces
>
> Outside of MBeans, I honestly do not know what interfaces fall into this
> group; and for MBeans we have tests which block breaking changes. The
> point I am making is that not everyone is aware of the rules, so having
> something in place to help enforce such rules should be thought about; if
> we want to add pluggable hooks with the intent that external parties can
> leverage such hooks, we should also add to the scope the maintenance of
> these interfaces (we should not assume “tribal knowledge” will work).
>
> I am not trying to ask for something large or something requiring a ton of
> work, I am just asking that this gets thought about during the project so
> it doesn’t get neglected. This could be as simple as an annotation like
> @ExposedTo3rdParties (Hadoop does this to show an interface is exposed and
> must be maintained), or it could be something like split directories
> (src/java = private, src/java-exposed = public); I am trying not to dictate
> an implementation, only trying to make sure we are set up to support the CEP
> after the work is done.
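For illustration only, the annotation idea discussed above could look roughly like the sketch below. The name @ExposedTo3rdParties comes from the thread; everything else (the `since` attribute, the `PluggableFormat` and `InternalHelper` interfaces, the `isExposed` helper) is hypothetical and not part of Cassandra. Hadoop's real annotations for this purpose are `@InterfaceAudience` and `@InterfaceStability`; this sketch does not reproduce them.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class ExposedApiSketch {

    /** Hypothetical marker: the annotated type is meant for external implementers. */
    @Documented
    @Retention(RetentionPolicy.RUNTIME) // runtime retention so tests can find it via reflection
    @Target(ElementType.TYPE)
    public @interface ExposedTo3rdParties {
        /** Assumed attribute: first release in which the type was declared exposed. */
        String since() default "";
    }

    /** Hypothetical pluggable interface, marked as exposed. */
    @ExposedTo3rdParties(since = "hypothetical")
    public interface PluggableFormat {
        String name();
    }

    /** Hypothetical internal interface, deliberately left unmarked. */
    public interface InternalHelper {
        void doWork();
    }

    /** Returns true if the type carries the marker annotation. */
    public static boolean isExposed(Class<?> type) {
        return type.isAnnotationPresent(ExposedTo3rdParties.class);
    }

    public static void main(String[] args) {
        System.out.println(isExposed(PluggableFormat.class)); // prints "true"
        System.out.println(isExposed(InternalHelper.class));  // prints "false"
    }
}
```

A marker like this only documents intent. Whether it implies any stability guarantee is exactly the policy question the thread leaves open.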
Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)
> We already have many interfaces similar to these for Compaction Strategy,
> Indexing, Query Handler.

Today-I-Learned QueryHandler is not allowed to be touched in a minor… good to know…

> not trunk -> try not to change these interfaces

Outside of MBeans, I honestly do not know what interfaces fall into this group; and for MBeans we have tests which block breaking changes. The point I am making is that not everyone is aware of the rules, so having something in place to help enforce such rules should be thought about; if we want to add pluggable hooks with the intent that external parties can leverage such hooks, we should also add to the scope the maintenance of these interfaces (we should not assume “tribal knowledge” will work).

I am not trying to ask for something large or something requiring a ton of work, I am just asking that this gets thought about during the project so it doesn’t get neglected. This could be as simple as an annotation like @ExposedTo3rdParties (Hadoop does this to show an interface is exposed and must be maintained), or it could be something like split directories (src/java = private, src/java-exposed = public); I am trying not to dictate an implementation, only trying to make sure we are set up to support the CEP after the work is done.

> On Nov 9, 2021, at 9:52 AM, Jeremiah D Jordan wrote:
>
> We already have many interfaces similar to these for Compaction Strategy,
> Indexing, Query Handler. I would hope that committers are already following a
> policy along the lines of trunk -> anything goes, not trunk -> try not to
> change these interfaces. I would expect that to be the same policy for any
> new internal interfaces that are added. But given we already have many such
> interfaces, I see no reason to block adding more of them while change
> policies are discussed.
> > -Jeremiah > >> On Nov 9, 2021, at 10:44 AM, David Capwell >> wrote: >> >> I still have one outstanding comment, but this is a comment for several of >> the CEPs being worked on >> >>> And last comment, which I have also done in the other modularity thread… >>> backwards compatibility and maintenance. It is not clear right now what >>> java interfaces may not break and how we can maintain and extend such >>> interfaces in the future. If the goal is to allow 3rd parties to plugin >>> and offer new SSTable formats, are we as a project ok with having a minor >>> release do a binary or source non-compatible change? If not how do we >>> detect this? Until this problem is solved, I do not think we should add >>> any such interfaces. >> >> I would love some clarity on this. Specifically, if we assume a patch >> author/reviewers are not familiar with the impact of changes these >> interfaces, what happens? Do we have tools to block this? Do we require 3rd >> party authors to create massive shims to deal with every patch level version >> out there? I would love more clarity on how we maintain these new pluggable >> interfaces. >> >>> On Nov 9, 2021, at 4:45 AM, Branimir Lambov wrote: >>> >>> Does anyone have any further comments or questions on the proposal, or are >>> we ready to move forward to a vote? >>> >>> Regards, >>> Branimir >>> >>> On Tue, Nov 2, 2021 at 7:15 PM David Capwell >>> wrote: >>> > I apologize I did not mention those things explicitly. All the places where > sstable files are accessed directly would have to be refactored. Works for me > Speaking about the implementation, one idea I was thinking about was that > the factories for formats are registered using Java's native service > loader. I am a fan of ServiceLoader as a means of plugging in. > I hope this explains a bit Yep; thanks! > On Nov 2, 2021, at 1:46 AM, Jacek Lewandowski < lewandowski.ja...@gmail.com> wrote: > > David, > > I apologize I did not mention those things explicitly. 
All the places where > sstable files are accessed directly would have to be refactored. > > Regarding TableMetrics - currently it includes many metrics, some of them > are unrelated to sstables at all, but there are metrics which are specific > to the current sstable format, like metrics related to index summaries or > bloom filters. The created gauges query certain methods on sstable reader - > I think the only common metrics for sstables we can leave in TableMetrics > are those for which there are query methods in generic sstable interface. > Other metrics, specific to the certain sstable format should be registered > by the implementation itself. > > Speaking about the implementation, one idea I was thinking about was that > the factories for formats are registered using Java's native service > loader. This way we could get the list of all the factories on the > classpath and call some method, like `registerMetric
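As a rough illustration of the annotation option raised in this message, the following Java sketch marks an interface as exposed and derives a list of method signatures that a test could compare against a stored snapshot. This is one possible way to detect an accidental signature change in a minor release. Everything here is hypothetical: the `ExposedTo3rdParties` annotation is only the name floated in the thread, and `FormatFactory` and `signatures` are invented for the example, not part of Cassandra.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ExposedApiSketch {

    /** Hypothetical marker: the annotated type is an extension point for external code. */
    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface ExposedTo3rdParties {}

    /** Hypothetical pluggable interface; not the real Cassandra API. */
    @ExposedTo3rdParties
    public interface FormatFactory {
        String name();
        void registerMetrics(String table);
    }

    /** Sorted method signatures of an exposed type, suitable for diffing against a snapshot. */
    public static List<String> signatures(Class<?> type) {
        if (!type.isAnnotationPresent(ExposedTo3rdParties.class))
            throw new IllegalArgumentException(type.getName() + " is not marked as exposed");
        return Arrays.stream(type.getDeclaredMethods())
                     .map(ExposedApiSketch::describe)
                     .sorted()
                     .collect(Collectors.toList());
    }

    /** Renders a method as "ReturnType name(ParamType,...)" using simple class names. */
    private static String describe(Method m) {
        String params = Arrays.stream(m.getParameterTypes())
                              .map(Class::getSimpleName)
                              .collect(Collectors.joining(","));
        return m.getReturnType().getSimpleName() + " " + m.getName() + "(" + params + ")";
    }

    public static void main(String[] args) {
        // Prints the signatures a compatibility test would pin.
        signatures(FormatFactory.class).forEach(System.out::println);
    }
}
```

A build-time test could fail whenever the computed list differs from a checked-in copy, so a reviewer has to acknowledge the change explicitly instead of relying on tribal knowledge.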
Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)
We already have many interfaces similar to these for Compaction Strategy, Indexing, Query Handler. I would hope that commiters are already following a policy along the lines of trunk -> anything goes, not trunk -> try not to change these interfaces. I would expect that to be the same policy for any new internal interfaces that are added. But given we already have many such interfaces, I see no reason to block adding more of them while change policies are discussed. -Jeremiah > On Nov 9, 2021, at 10:44 AM, David Capwell wrote: > > I still have one outstanding comment, but this is a comment for several of > the CEPs being worked on > >> And last comment, which I have also done in the other modularity thread… >> backwards compatibility and maintenance. It is not clear right now what java >> interfaces may not break and how we can maintain and extend such interfaces >> in the future. If the goal is to allow 3rd parties to plugin and offer new >> SSTable formats, are we as a project ok with having a minor release do a >> binary or source non-compatible change? If not how do we detect this? >> Until this problem is solved, I do not think we should add any such >> interfaces. > > I would love some clarity on this. Specifically, if we assume a patch > author/reviewers are not familiar with the impact of changes these > interfaces, what happens? Do we have tools to block this? Do we require 3rd > party authors to create massive shims to deal with every patch level version > out there? I would love more clarity on how we maintain these new pluggable > interfaces. > >> On Nov 9, 2021, at 4:45 AM, Branimir Lambov wrote: >> >> Does anyone have any further comments or questions on the proposal, or are >> we ready to move forward to a vote? >> >> Regards, >> Branimir >> >> On Tue, Nov 2, 2021 at 7:15 PM David Capwell >> wrote: >> I apologize I did not mention those things explicitly. All the places >>> where sstable files are accessed directly would have to be refactored. 
>>> >>> Works for me >>> Speaking about the implementation, one idea I was thinking about was that the factories for formats are registered using Java's native service loader. >>> >>> I am a fan of ServiceLoader as a means of plugging in. >>> I hope this explains a bit >>> >>> Yep; thanks! >>> On Nov 2, 2021, at 1:46 AM, Jacek Lewandowski < >>> lewandowski.ja...@gmail.com> wrote: David, I apologize I did not mention those things explicitly. All the places >>> where sstable files are accessed directly would have to be refactored. Regarding TableMetrics - currently it includes many metrics, some of them are unrelated to sstables at all, but there are metrics which are >>> specific to the current sstable format, like metrics related to index summaries or bloom filters. The created gauges query certain methods on sstable >>> reader - I think the only common metrics for sstables we can leave in TableMetrics are those for which there are query methods in generic sstable interface. Other metrics, specific to the certain sstable format should be >>> registered by the implementation itself. Speaking about the implementation, one idea I was thinking about was that the factories for formats are registered using Java's native service loader. This way we could get the list of all the factories on the classpath and call some method, like `registerMetrics` during system initialization. That could be also implemented in static initializer in >>> the factory but it would make it less obvious for the implementors where such initialization should be done. 
I hope this explains a bit Thanks, Jacek

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org
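The ServiceLoader idea quoted in this thread could look roughly like the Java sketch below: factories are discovered on the classpath and a `registerMetrics` hook is called on each one during initialization. The interface shape, the `initialize` helper, and the use of a plain list as the metrics registry are assumptions made for illustration; the thread does not specify them, and the real CEP-17 API may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class FormatRegistrySketch {

    /** Hypothetical factory interface; the real CEP-17 API may look different. */
    public interface SSTableFormatFactory {
        String name();

        /** Hook for format-specific metrics, e.g. bloom filter or index summary gauges. */
        void registerMetrics(List<String> registry);
    }

    /** Calls registerMetrics on every supplied factory and returns the factory names. */
    public static List<String> initialize(Iterable<SSTableFormatFactory> factories,
                                          List<String> registry) {
        List<String> names = new ArrayList<>();
        for (SSTableFormatFactory factory : factories) {
            factory.registerMetrics(registry);
            names.add(factory.name());
        }
        return names;
    }

    public static void main(String[] args) {
        // Providers are discovered through META-INF/services files named after the
        // interface's binary name; with none on the classpath the loader is empty.
        ServiceLoader<SSTableFormatFactory> loader = ServiceLoader.load(SSTableFormatFactory.class);
        List<String> registry = new ArrayList<>();
        System.out.println("formats: " + initialize(loader, registry));
        System.out.println("metrics: " + registry);
    }
}
```

Doing the registration in one explicit initialization step, instead of in each factory's static initializer, keeps the place where it happens obvious to implementors, which is the trade-off Jacek describes.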
Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)
I still have one outstanding comment, but this is a comment for several of the CEPs being worked on > And last comment, which I have also done in the other modularity thread… > backwards compatibility and maintenance. It is not clear right now what java > interfaces may not break and how we can maintain and extend such interfaces > in the future. If the goal is to allow 3rd parties to plugin and offer new > SSTable formats, are we as a project ok with having a minor release do a > binary or source non-compatible change? If not how do we detect this? Until > this problem is solved, I do not think we should add any such interfaces. I would love some clarity on this. Specifically, if we assume a patch author/reviewers are not familiar with the impact of changes these interfaces, what happens? Do we have tools to block this? Do we require 3rd party authors to create massive shims to deal with every patch level version out there? I would love more clarity on how we maintain these new pluggable interfaces. > On Nov 9, 2021, at 4:45 AM, Branimir Lambov wrote: > > Does anyone have any further comments or questions on the proposal, or are > we ready to move forward to a vote? > > Regards, > Branimir > > On Tue, Nov 2, 2021 at 7:15 PM David Capwell > wrote: > >>> I apologize I did not mention those things explicitly. All the places >> where >>> sstable files are accessed directly would have to be refactored. >> >> Works for me >> >>> Speaking about the implementation, one idea I was thinking about was that >>> the factories for formats are registered using Java's native service >>> loader. >> >> I am a fan of ServiceLoader as a means of plugging in. >> >>> I hope this explains a bit >> >> Yep; thanks! >> >>> On Nov 2, 2021, at 1:46 AM, Jacek Lewandowski < >> lewandowski.ja...@gmail.com> wrote: >>> >>> David, >>> >>> I apologize I did not mention those things explicitly. All the places >> where >>> sstable files are accessed directly would have to be refactored. 
>>> >>> Regarding TableMetrics - currently it includes many metrics, some of them >>> are unrelated to sstables at all, but there are metrics which are >> specific >>> to the current sstable format, like metrics related to index summaries or >>> bloom filters. The created gauges query certain methods on sstable >> reader - >>> I think the only common metrics for sstables we can leave in TableMetrics >>> are those for which there are query methods in generic sstable interface. >>> Other metrics, specific to the certain sstable format should be >> registered >>> by the implementation itself. >>> >>> Speaking about the implementation, one idea I was thinking about was that >>> the factories for formats are registered using Java's native service >>> loader. This way we could get the list of all the factories on the >>> classpath and call some method, like `registerMetrics` during system >>> initialization. That could be also implemented in static initializer in >> the >>> factory but it would make it less obvious for the implementors where such >>> initialization should be done. >>> >>> I hope this explains a bit >>> >>> Thanks, >>> Jacek >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org >> For additional commands, e-mail: dev-h...@cassandra.apache.org >> >> - To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For additional commands, e-mail: dev-h...@cassandra.apache.org
Re: [DISCUSS] Creating a new slack channel for newcomers
I also feel that having all the resources to get help in more or less one place (#cassandra-dev slack / ML) probably helps newcomers on the whole since they can ask questions and likely engage with someone who can help. I know that I've asked a few silly questions in #cassandra-dev and appreciated that there were more experienced project members to help answer them. If we wanted to have a set of designated "newcomer mentors" or some such that seems useful in addition. Perhaps their email/handles on the website in the contributing section with an encouragement to ask them first if you're unsure who to ask? -Joey On Tue, Nov 9, 2021 at 10:16 AM Sumanth Pasupuleti wrote: > > +1 that existing channels of communication (cassandra-dev slack and mailing > lists) should ideally suffice, and I have not seen prohibitive > communication in those forums thus far that goes against newcomers. I agree > it can be intimidating, but to Bowen's point, the more traffic we see > around newcomers in those forums, the more comfortable it gets. > I agree starting a new channel is a low effort experiment we can do, but > the success depends on finding mentors and the engagement of mentors vs I > believe engagement in #cassandra-dev is almost guaranteed given the high > number of people in the channel. > > Thanks, > Sumanth > > On Tue, Nov 9, 2021 at 6:47 AM Bowen Song wrote: > > > As a newcomer (made two commits since October) who has been watching > > this mailing list since then, I don't like the idea of a separate > > channel for beginner questions. The volume in this mailing list is > > fairly low, I can't see any legitimate reason for diverting a portion of > > that into another channel, further reducing the volume in the existing > > channel and perhaps not creating much volume in the new channel either. 
> > > > Personally, I think a clearly written and easy to find community > > guideline highlighting that this mailing list is suitable for beginner > > questions, and give some suggestions/recommendations on when, where and > > how to ask beginner questions would be more useful. > > > > At the moment because the volume of beginner questions is very very low > > in this mailing list, newcomers like me don't feel comfortable asking > > questions here. That's not because there's 600 pair of eyes watching > > this (TBH, if you didn't mention it, I wouldn't have noticed it), but > > because the herd mentality. If not many questions are asked here, most > > people won't start doing that. It's all about creating the environment > > that makes people feel comfortable asking questions here. > > > > On 08/11/2021 16:28, Benjamin Lerer wrote: > > > Hi everybody, > > > > > > Aleksei Zotov mentioned to me that it was a bit intimidating for > > newcomers > > > to ask beginner questions in the cassandra-dev channel as it has over 600 > > > followers and that we should probably have a specific channel for > > > newcomers. > > > This proposal makes total sense to me. > > > > > > What is your opinion on this? Do you have any concerns about it? > > > > > > Benjamin > > > > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org > > For additional commands, e-mail: dev-h...@cassandra.apache.org > > > > - To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For additional commands, e-mail: dev-h...@cassandra.apache.org
Re: [DISCUSS] Creating a new slack channel for newcomers
> > not because there's 600 pair of eyes watching > this (TBH, if you didn't mention it, I wouldn't have noticed it) oof; that wasn't my intent at all! :) I think a clearly written and easy to find community > guideline highlighting that this mailing list is suitable for beginner > questions, and give some suggestions/recommendations So maybe fleshing out the "How to Contribute" section of the website a bit here: https://cassandra.apache.org/_/community.html#how-to-contribute? Could maybe also link to the JIRA filters for test failures and starter tickets for an appropriate workload to get started with. On Tue, Nov 9, 2021 at 10:16 AM Sumanth Pasupuleti < sumanth.pasupuleti...@gmail.com> wrote: > +1 that existing channels of communication (cassandra-dev slack and mailing > lists) should ideally suffice, and I have not seen prohibitive > communication in those forums thus far that goes against newcomers. I agree > it can be intimidating, but to Bowen's point, the more traffic we see > around newcomers in those forums, the more comfortable it gets. > I agree starting a new channel is a low effort experiment we can do, but > the success depends on finding mentors and the engagement of mentors vs I > believe engagement in #cassandra-dev is almost guaranteed given the high > number of people in the channel. > > Thanks, > Sumanth > > On Tue, Nov 9, 2021 at 6:47 AM Bowen Song wrote: > > > As a newcomer (made two commits since October) who has been watching > > this mailing list since then, I don't like the idea of a separate > > channel for beginner questions. The volume in this mailing list is > > fairly low, I can't see any legitimate reason for diverting a portion of > > that into another channel, further reducing the volume in the existing > > channel and perhaps not creating much volume in the new channel either. 
> > > > Personally, I think a clearly written and easy to find community > > guideline highlighting that this mailing list is suitable for beginner > > questions, and give some suggestions/recommendations on when, where and > > how to ask beginner questions would be more useful. > > > > At the moment because the volume of beginner questions is very very low > > in this mailing list, newcomers like me don't feel comfortable asking > > questions here. That's not because there's 600 pair of eyes watching > > this (TBH, if you didn't mention it, I wouldn't have noticed it), but > > because the herd mentality. If not many questions are asked here, most > > people won't start doing that. It's all about creating the environment > > that makes people feel comfortable asking questions here. > > > > On 08/11/2021 16:28, Benjamin Lerer wrote: > > > Hi everybody, > > > > > > Aleksei Zotov mentioned to me that it was a bit intimidating for > > newcomers > > > to ask beginner questions in the cassandra-dev channel as it has over > 600 > > > followers and that we should probably have a specific channel for > > > newcomers. > > > This proposal makes total sense to me. > > > > > > What is your opinion on this? Do you have any concerns about it? > > > > > > Benjamin > > > > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org > > For additional commands, e-mail: dev-h...@cassandra.apache.org > > > > >
Re: [DISCUSS] Creating a new slack channel for newcomers
+1 that existing channels of communication (cassandra-dev slack and mailing lists) should ideally suffice, and I have not seen prohibitive communication in those forums thus far that goes against newcomers. I agree it can be intimidating, but to Bowen's point, the more traffic we see around newcomers in those forums, the more comfortable it gets.

I agree starting a new channel is a low effort experiment we can do, but the success depends on finding mentors and the engagement of mentors vs I believe engagement in #cassandra-dev is almost guaranteed given the high number of people in the channel.

Thanks,
Sumanth

On Tue, Nov 9, 2021 at 6:47 AM Bowen Song wrote:

> As a newcomer (made two commits since October) who has been watching
> this mailing list since then, I don't like the idea of a separate
> channel for beginner questions. The volume in this mailing list is
> fairly low, I can't see any legitimate reason for diverting a portion of
> that into another channel, further reducing the volume in the existing
> channel and perhaps not creating much volume in the new channel either.
>
> Personally, I think a clearly written and easy to find community
> guideline highlighting that this mailing list is suitable for beginner
> questions, and give some suggestions/recommendations on when, where and
> how to ask beginner questions would be more useful.
>
> At the moment because the volume of beginner questions is very very low
> in this mailing list, newcomers like me don't feel comfortable asking
> questions here. That's not because there's 600 pair of eyes watching
> this (TBH, if you didn't mention it, I wouldn't have noticed it), but
> because the herd mentality. If not many questions are asked here, most
> people won't start doing that. It's all about creating the environment
> that makes people feel comfortable asking questions here.
>
> On 08/11/2021 16:28, Benjamin Lerer wrote:
> > Hi everybody,
> >
> > Aleksei Zotov mentioned to me that it was a bit intimidating for newcomers
> > to ask beginner questions in the cassandra-dev channel as it has over 600
> > followers and that we should probably have a specific channel for
> > newcomers.
> > This proposal makes total sense to me.
> >
> > What is your opinion on this? Do you have any concerns about it?
> >
> > Benjamin
Re: [DISCUSS] Creating a new slack channel for newcomers
As a newcomer (made two commits since October) who has been watching this mailing list since then, I don't like the idea of a separate channel for beginner questions. The volume in this mailing list is fairly low, and I can't see any legitimate reason for diverting a portion of that into another channel, further reducing the volume in the existing channel and perhaps not creating much volume in the new channel either.

Personally, I think a clearly written and easy-to-find community guideline highlighting that this mailing list is suitable for beginner questions, with some suggestions/recommendations on when, where and how to ask beginner questions, would be more useful.

At the moment, because the volume of beginner questions is very low in this mailing list, newcomers like me don't feel comfortable asking questions here. That's not because there are 600 pairs of eyes watching this (TBH, if you didn't mention it, I wouldn't have noticed it), but because of the herd mentality. If not many questions are asked here, most people won't start doing that. It's all about creating the environment that makes people feel comfortable asking questions here.

On 08/11/2021 16:28, Benjamin Lerer wrote:

> Hi everybody,
>
> Aleksei Zotov mentioned to me that it was a bit intimidating for newcomers
> to ask beginner questions in the cassandra-dev channel as it has over 600
> followers and that we should probably have a specific channel for newcomers.
> This proposal makes total sense to me.
>
> What is your opinion on this? Do you have any concerns about it?
>
> Benjamin
Re: [DISCUSS] CEP-17: SSTable format API (CASSANDRA-17056)
Does anyone have any further comments or questions on the proposal, or are we ready to move forward to a vote? Regards, Branimir On Tue, Nov 2, 2021 at 7:15 PM David Capwell wrote: > > I apologize I did not mention those things explicitly. All the places > where > > sstable files are accessed directly would have to be refactored. > > Works for me > > > Speaking about the implementation, one idea I was thinking about was that > > the factories for formats are registered using Java's native service > > loader. > > I am a fan of ServiceLoader as a means of plugging in. > > > I hope this explains a bit > > Yep; thanks! > > > On Nov 2, 2021, at 1:46 AM, Jacek Lewandowski < > lewandowski.ja...@gmail.com> wrote: > > > > David, > > > > I apologize I did not mention those things explicitly. All the places > where > > sstable files are accessed directly would have to be refactored. > > > > Regarding TableMetrics - currently it includes many metrics, some of them > > are unrelated to sstables at all, but there are metrics which are > specific > > to the current sstable format, like metrics related to index summaries or > > bloom filters. The created gauges query certain methods on sstable > reader - > > I think the only common metrics for sstables we can leave in TableMetrics > > are those for which there are query methods in generic sstable interface. > > Other metrics, specific to the certain sstable format should be > registered > > by the implementation itself. > > > > Speaking about the implementation, one idea I was thinking about was that > > the factories for formats are registered using Java's native service > > loader. This way we could get the list of all the factories on the > > classpath and call some method, like `registerMetrics` during system > > initialization. That could be also implemented in static initializer in > the > > factory but it would make it less obvious for the implementors where such > > initialization should be done. 
> >
> > I hope this explains a bit
> >
> > Thanks,
> > Jacek
[RESULT][VOTE] Release dtest-api 0.0.11
> The vote will be open for 24 hours. Everyone who has tested the build
> is invited to vote. Votes by PMC members are considered binding. A
> vote passes if there are at least three binding +1s.

The vote passes, eventually, with six +1s (four binding), and no vetoes. Artifacts have been published.
Re: [VOTE] Release dtest-api 0.0.11
+1

On Fri, 29 Oct 2021 at 23:44, Mick Semb Wever wrote:

> Proposing the test build of in-jvm dtest API 0.0.11 for release.
>
> Repository:
> https://gitbox.apache.org/repos/asf?p=cassandra-in-jvm-dtest-api.git;a=shortlog;h=refs/tags/0.0.11
>
> Candidate SHA:
> https://github.com/apache/cassandra-in-jvm-dtest-api/commit/cbe7e89dc166cf4f2f94a11c7b3e867494f62ac017050
> tagged with 0.0.11
>
> Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1250/org/apache/cassandra/dtest-api/0.0.11/
>
> Key signature: A4C465FEA0C552561A392A61E91335D77E3E87CB
>
> Changes since last release:
> * CASSANDRA-17064: Option to start nodes with blank gossip state
> * CASSANDRA-17050: Fix Upgrade tests throwing
>   UnsupportedOperationException on initialise
>
> The vote will be open for 24 hours. Everyone who has tested the build
> is invited to vote. Votes by PMC members are considered binding. A
> vote passes if there are at least three binding +1s.