+1 nb. As has been said, I too see these tools (bulk analytics and SCC) as complementary. SCC also does some nice things to support Spark Streaming that I don't think are addressed by the bulk analytics subproject today.
Regarding dsbulk, I think that's another thread but it's something we're looking at as well. It has a lower barrier to entry for sure, but it doesn't plug into the full Spark ecosystem for those that need it.

> On Jun 24, 2024, at 3:40 PM, Abe Ratnofsky <a...@aber.io> wrote:
>
> Likewise - another vote in favor of bringing in this subproject.
>
> Any thoughts on bringing in dsbulk as well? dsbulk has a lower barrier to
> entry than Spark Cassandra Connector, addresses a real need for users, and
> appears to be at a similar place in its project lifecycle.
>
> Abe
>
>> On Jun 24, 2024, at 4:36 PM, Francisco Guerrero <fran...@apache.org> wrote:
>>
>> Yeah, having the connector will enhance the Cassandra ecosystem. I'm looking
>> forward to this contribution.
>>
>> On 2024/06/24 17:28:48 "C. Scott Andreas" wrote:
>>> Supportive of accepting a donation of the Spark Cassandra Connector under
>>> the project's umbrella as well - I think that would be very welcome and
>>> appreciated. Spark Cassandra Connector and the Analytics library are also
>>> suited to slightly different usage patterns. SCC can be a good fit for
>>> Spark jobs that operate with a high degree of selectivity; vs. larger bulk
>>> scoops. – Scott
>>>
>>> On Jun 24, 2024, at 1:29 AM, Jon Haddad <j...@jonhaddad.com> wrote:
>>> I also think it would be a great contribution, especially since the bulk
>>> analytics library can’t be used by the majority of teams, since it’s hard
>>> coded to only work with single token clusters.
>>>
>>> On Mon, Jun 24, 2024 at 9:51 AM Dinesh Joshi < djo...@apache.org > wrote:
>>> This would be a great contribution to have for the Analytics subproject. The
>>> current bulk functionality in the Analytics subproject complements the
>>> spark-cassandra-connector so I see it as a good fit for donation.
>>>
>>> On Mon, Jun 24, 2024 at 12:32 AM Mick Semb Wever < m...@apache.org > wrote:
>>> What are folks thoughts on accepting a donation of the
>>> spark-cassandra-connector project into the Analytics subproject ? A number
>>> of folks have requested this, stating that they cannot contribute to the
>>> project while it is under DataStax. The project has largely been in
>>> maintenance mode the past few years. Under ASF I believe that it will
>>> attract more attention and contributions, and offline discussions I have had
>>> indicate that the spark-cassandra-connector remains an important complement
>>> to the bulk analytics component.