Regarding deprecation, while I support the deprecation and removal from the Cassandra codebase, I do think we should communicate that with the wider community (user thread?) so people aren't surprised - especially since it's already four months after the 4.1.0 release. That would hopefully also encourage those interested in continuing support to extract it out into a separate library.
> On Mar 16, 2023, at 4:19 PM, Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote:
>
> I think we already decided it in this thread.
>
> I was specifically asking this question:
>
> Deprecation would mean that the code has to be there whole 5.0 so we can remove it for real in 6.0?
>
> To which the response was:
>
> I think if we reach consensus here that decides it. I too vote to deprecate in 4.1.x. This means we would remove it in 5.0.
>
> Then bunch of +1s followed and agreed with that explicitly.
>
> I do not plan to maintain nor extract that, personally.
>
> ________________________________________
> From: David Capwell <dcapw...@apple.com>
> Sent: Thursday, March 16, 2023 22:13
> To: dev@cassandra.apache.org
> Subject: Re: Role of Hadoop code in Cassandra 5.0
>
> Isn’t our deprecation rules that if we deprecate in 4.0.0 we can remove in 5.x, but 4.x needs to wait for 6.x? I am cool deprecating this and willing to pull into another repo if people (not me) are willing to maintain it (else just delete).
>
> On Mar 10, 2023, at 1:13 AM, Jacek Lewandowski <lewandowski.ja...@gmail.com> wrote:
>
> I've experimentally added https://issues.apache.org/jira/browse/CASSANDRA-16984 to https://issues.apache.org/jira/browse/CASSANDRA-18306 (post 4.0 cleanup)
>
> - - -- --- ----- -------- -------------
> Jacek Lewandowski
>
> On Fri, Mar 10, 2023 at 09:56, Berenguer Blasi <berenguerbl...@gmail.com> wrote:
>
> +1 deprecate + removal
>
> On 10/3/23 1:41, Jeremy Hanna wrote:
> It was mainly to integrate with Hadoop - I used it from 0.6 to 1.2 in production prior to starting at DataStax and at that time I was stitching together Cloudera's distribution of Hadoop with Cassandra. Back then there were others that used it as well. As far as I know, usage dropped off when the Spark Cassandra Connector got pretty mature. It enabled people to take an off the shelf Hadoop distribution and run the Hadoop processes on the same nodes or external to the Cassandra cluster and get topology information to do things like Hadoop splits and things like that through the Hadoop interfaces. I think the version lag is an indication that it hasn't been used recently. Also, like others have said, the Spark Cassandra Connector is really what people should be using at this point imo. That or depending on the use case, Apple's bulk reader: https://github.com/jberragan/spark-cassandra-bulkreader that is mentioned on https://issues.apache.org/jira/browse/CASSANDRA-16222.
>
> On Mar 9, 2023, at 12:00 PM, Rahul Xavier Singh <rahul.xavier.si...@gmail.com> wrote:
>
> What is the hadoop code for? For interacting from Hadoop via CQL, or Thrift if it's that old, or directly looking at SSTables? Been using C* since 2 and have never used it.
>
> Agree to deprecate in next possible 4.1.x version and remove in 5.0
>
> Rahul Singh
>
> On Thu, Mar 9, 2023 at 12:53 PM Brandon Williams <dri...@gmail.com> wrote:
> I think if we reach consensus here that decides it. I too vote to deprecate in 4.1.x. This means we would remove it in 5.0.
>
> Kind Regards,
> Brandon
>
> On Thu, Mar 9, 2023 at 11:32 AM Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:
>>
>> Deprecation sounds good to me, but I am not completely sure in which version we can do it. If it is possible to add a deprecation warning in the 4.x series or at least 4.1.x - I vote for that.
>>
>> On Thu, 9 Mar 2023 at 12:14, Jacek Lewandowski <lewandowski.ja...@gmail.com> wrote:
>>>
>>> Is it possible to deprecate it in the 4.1.x patch release? :)
>>>
>>> Jacek Lewandowski
>>>
>>> On Thu, Mar 9, 2023 at 18:11, Brandon Williams <dri...@gmail.com> wrote:
>>>>
>>>> This is my feeling too, but I think we should accomplish this by deprecating it first. I don't expect anything will change after the deprecation period.
>>>>
>>>> Kind Regards,
>>>> Brandon
>>>>
>>>> On Thu, Mar 9, 2023 at 11:09 AM Jacek Lewandowski <lewandowski.ja...@gmail.com> wrote:
>>>>>
>>>>> I vote for removing it entirely.
>>>>>
>>>>> thanks
>>>>> Jacek Lewandowski
>>>>>
>>>>> On Thu, Mar 9, 2023 at 18:07, Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote:
>>>>>>
>>>>>> Derek,
>>>>>>
>>>>>> I have couple more points ... I do not think that extracting it to a separate repository is "win". That code is on Hadoop 1.0.3. We would be spending a lot of work on extracting it just to extract 10 years old code with occasional updates (in my humble opinion just to make it compilable again if the code around changes). What good is in that? We would have one more place to take care of ... Now we at least have it all in one place.
>>>>>>
>>>>>> I believe we have four options:
>>>>>>
>>>>>> 1) leave it there so it will be like this is for next years with questionable and diminishing usage
>>>>>> 2) update it to Hadoop 3.3 (I wonder who is going to do that)
>>>>>> 3) 2) and extract it to a separate repository but if we do 2) we can just leave it there
>>>>>> 4) remove it
>>>>>>
>>>>>> ________________________________________
>>>>>> From: Derek Chen-Becker <de...@chen-becker.org>
>>>>>> Sent: Thursday, March 9, 2023 15:55
>>>>>> To: dev@cassandra.apache.org
>>>>>> Subject: Re: Role of Hadoop code in Cassandra 5.0
>>>>>>
>>>>>> I think the question isn't "Who ... is still using that?" but more "are we actually going to support it?" If we're on a version that old it would appear that we've basically abandoned it, although there do appear to have been refactoring (for other things) commits in the last couple of years. I would be in favor of removal from 5.0, but at the very least, could it be moved into a separate repo/package so that it's not pulling a relatively large dependency subtree from Hadoop into our main codebase?
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Derek
>>>>>>
>>>>>> On Thu, Mar 9, 2023 at 6:44 AM Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote:
>>>>>> Hi list,
>>>>>>
>>>>>> I stumbled upon Hadoop package again. I think there was some discussion about the relevancy of Hadoop code some time ago but I would like to ask this again.
>>>>>>
>>>>>> Do you think Hadoop code (1) is still relevant in 5.0? Who in the industry is still using that?
>>>>>>
>>>>>> We might drop a lot of code and some Hadoop dependencies too (3) (even their scope is "provided"). The version of Hadoop we build upon is 1.0.3 which was released 10 years ago. This code does not have any tests nor documentation on the website.
>>>>>>
>>>>>> There seems to be issues like this (2) and it seems like the solution is to, basically, use Spark Cassandra connector instead which I would say is quite reasonable.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> (1) https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/hadoop
>>>>>> (2) https://lists.apache.org/thread/jdy5hdc2l7l29h04dqol5ylroqos1y2p
>>>>>> (3) https://github.com/apache/cassandra/blob/trunk/.build/parent-pom-template.xml#L507-L589
>>>>>>
>>>>>> --
>>>>>> +---------------------------------------------------------------+
>>>>>> | Derek Chen-Becker                                             |
>>>>>> | GPG Key available at https://keybase.io/dchenbecker and       |
>>>>>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>>>>>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC   |
>>>>>> +---------------------------------------------------------------+