Regarding deprecation, while I support the deprecation and removal from the 
Cassandra codebase, I do think we should communicate that with the wider 
community (user thread?) so people aren't surprised - especially since it's 
already four months after the 4.1.0 release.  That would hopefully also 
encourage those interested in continuing support to extract it out into a 
separate library.

> On Mar 16, 2023, at 4:19 PM, Miklosovic, Stefan 
> <stefan.mikloso...@netapp.com> wrote:
> 
> I think we already decided it in this thread.
> 
> I was specifically asking this question:
> 
> Deprecation would mean that the code has to stay for the whole of 5.0 so 
> that we can remove it for real in 6.0?
> 
> To which the response was:
> 
> I think if we reach consensus here that decides it. I too vote to
> deprecate in 4.1.x.  This means we would remove it in 5.0.
> 
> Then a bunch of +1s followed, explicitly agreeing with that.
> 
> Personally, I do not plan to maintain or extract it.
> 
> ________________________________________
> From: David Capwell <dcapw...@apple.com>
> Sent: Thursday, March 16, 2023 22:13
> To: dev@cassandra.apache.org
> Subject: Re: Role of Hadoop code in Cassandra 5.0
> 
> Aren’t our deprecation rules that if we deprecate in 4.0.0 we can remove in 
> 5.x, but anything deprecated later in 4.x needs to wait for 6.x?  I am cool 
> with deprecating this and willing to pull it into another repo if people (not 
> me) are willing to maintain it (else just delete).
> 
> On Mar 10, 2023, at 1:13 AM, Jacek Lewandowski <lewandowski.ja...@gmail.com> 
> wrote:
> 
> I've experimentally added 
> https://issues.apache.org/jira/browse/CASSANDRA-16984 to 
> https://issues.apache.org/jira/browse/CASSANDRA-18306 (post 4.0 cleanup)
> 
> - - -- --- ----- -------- -------------
> Jacek Lewandowski
> 
> 
> On Fri, 10 Mar 2023 at 09:56, Berenguer Blasi <berenguerbl...@gmail.com> 
> wrote:
> 
> +1 deprecate + removal
> 
> On 10/3/23 1:41, Jeremy Hanna wrote:
> It was mainly to integrate with Hadoop - I used it from 0.6 to 1.2 in 
> production prior to starting at DataStax and at that time I was stitching 
> together Cloudera's distribution of Hadoop with Cassandra.  Back then there 
> were others that used it as well.  As far as I know, usage dropped off when 
> the Spark Cassandra Connector got pretty mature.  It enabled people to take 
> an off-the-shelf Hadoop distribution, run the Hadoop processes on the same 
> nodes as the Cassandra cluster or external to it, and get topology 
> information for things like Hadoop splits through the Hadoop interfaces.  
> I think the version lag is an indication that it hasn't been used recently.  
> Also, like others have said, the Spark Cassandra Connector is really what 
> people should be using at this point imo.  That or depending on the use case, 
> Apple's bulk reader: https://github.com/jberragan/spark-cassandra-bulkreader 
> that is mentioned on https://issues.apache.org/jira/browse/CASSANDRA-16222.
> 
> On Mar 9, 2023, at 12:00 PM, Rahul Xavier Singh <rahul.xavier.si...@gmail.com> 
> wrote:
> 
> What is the hadoop code for? For interacting from Hadoop via CQL, or Thrift 
> if it's that old, or directly looking at SSTables? Been using C* since 2 and 
> have never used it.
> 
> Agree to deprecate in next possible 4.1.x version and remove in 5.0
> 
> Rahul Singh
> 
> 
> On Thu, Mar 9, 2023 at 12:53 PM Brandon Williams <dri...@gmail.com> wrote:
> I think if we reach consensus here that decides it. I too vote to
> deprecate in 4.1.x.  This means we would remove it in 5.0.
> 
> Kind Regards,
> Brandon
> 
> On Thu, Mar 9, 2023 at 11:32 AM Ekaterina Dimitrova
> <e.dimitr...@gmail.com> wrote:
>> 
>> Deprecation sounds good to me, but I am not completely sure in which version 
>> we can do it. If it is possible to add a deprecation warning in the 4.x 
>> series or at least 4.1.x - I vote for that.
>> 
>> On Thu, 9 Mar 2023 at 12:14, Jacek Lewandowski <lewandowski.ja...@gmail.com> 
>> wrote:
>>> 
>>> Is it possible to deprecate it in the 4.1.x patch release? :)
>>> 
>>> 
>>> - - -- --- ----- -------- -------------
>>> Jacek Lewandowski
>>> 
>>> 
>>> On Thu, 9 Mar 2023 at 18:11, Brandon Williams <dri...@gmail.com> wrote:
>>>> 
>>>> This is my feeling too, but I think we should accomplish this by
>>>> deprecating it first.  I don't expect anything will change after the
>>>> deprecation period.
>>>> 
>>>> Kind Regards,
>>>> Brandon
>>>> 
>>>> On Thu, Mar 9, 2023 at 11:09 AM Jacek Lewandowski
>>>> <lewandowski.ja...@gmail.com> wrote:
>>>>> 
>>>>> I vote for removing it entirely.
>>>>> 
>>>>> thanks
>>>>> - - -- --- ----- -------- -------------
>>>>> Jacek Lewandowski
>>>>> 
>>>>> 
>>>>> On Thu, 9 Mar 2023 at 18:07, Miklosovic, Stefan 
>>>>> <stefan.mikloso...@netapp.com> wrote:
>>>>>> 
>>>>>> Derek,
>>>>>> 
>>>>>> I have a couple more points ... I do not think that extracting it to a 
>>>>>> separate repository is a "win". That code is on Hadoop 1.0.3. We would 
>>>>>> be spending a lot of work on extracting it just to extract 10-year-old 
>>>>>> code with occasional updates (in my humble opinion just to make it 
>>>>>> compilable again if the code around it changes). What good is that? We 
>>>>>> would have one more place to take care of ... Now we at least have it 
>>>>>> all in one place.
>>>>>> 
>>>>>> I believe we have four options:
>>>>>> 
>>>>>> 1) leave it there, so it stays as it is for years to come, with 
>>>>>> questionable and diminishing usage
>>>>>> 2) update it to Hadoop 3.3 (I wonder who is going to do that)
>>>>>> 3) do 2) and also extract it to a separate repository, although if we 
>>>>>> do 2) we can just leave it there
>>>>>> 4) remove it
>>>>>> 
>>>>>> ________________________________________
>>>>>> From: Derek Chen-Becker <de...@chen-becker.org>
>>>>>> Sent: Thursday, March 9, 2023 15:55
>>>>>> To: dev@cassandra.apache.org
>>>>>> Subject: Re: Role of Hadoop code in Cassandra 5.0
>>>>>> 
>>>>>> I think the question isn't "Who ... is still using that?" but more "are 
>>>>>> we actually going to support it?" If we're on a version that old it 
>>>>>> would appear that we've basically abandoned it, although there do appear 
>>>>>> to have been refactoring commits (for other things) in the last couple 
>>>>>> of years. I would be in favor of removal from 5.0, but at the very 
>>>>>> least, could it be moved into a separate repo/package so that it's not 
>>>>>> pulling a relatively large dependency subtree from Hadoop into our main 
>>>>>> codebase?
>>>>>> 
>>>>>> Cheers,
>>>>>> 
>>>>>> Derek
>>>>>> 
>>>>>> On Thu, Mar 9, 2023 at 6:44 AM Miklosovic, Stefan 
>>>>>> <stefan.mikloso...@netapp.com> wrote:
>>>>>> Hi list,
>>>>>> 
>>>>>> I stumbled upon the Hadoop package again. I think there was some 
>>>>>> discussion about the relevancy of the Hadoop code some time ago but I 
>>>>>> would like to ask this again.
>>>>>> 
>>>>>> Do you think Hadoop code (1) is still relevant in 5.0? Who in the 
>>>>>> industry is still using that?
>>>>>> 
>>>>>> We might drop a lot of code and some Hadoop dependencies too (3) (even 
>>>>>> though their scope is "provided"). The version of Hadoop we build upon 
>>>>>> is 1.0.3, which was released 10 years ago. This code has neither tests 
>>>>>> nor documentation on the website.
>>>>>> 
>>>>>> There seem to be issues like this (2), and it seems the solution is, 
>>>>>> basically, to use the Spark Cassandra Connector instead, which I would 
>>>>>> say is quite reasonable.
>>>>>> 
>>>>>> Regards
>>>>>> 
>>>>>> (1) 
>>>>>> https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/hadoop
>>>>>> (2) https://lists.apache.org/thread/jdy5hdc2l7l29h04dqol5ylroqos1y2p
>>>>>> (3) 
>>>>>> https://github.com/apache/cassandra/blob/trunk/.build/parent-pom-template.xml#L507-L589
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> +---------------------------------------------------------------+
>>>>>> | Derek Chen-Becker                                             |
>>>>>> | GPG Key available at https://keybase.io/dchenbecker and       |
>>>>>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>>>>>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>>>>>> +---------------------------------------------------------------+
