[jira] [Comment Edited] (KAFKA-14524) Modularize `core` monolith

2023-01-04 Thread Ismael Juma (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654415#comment-17654415
 ] 

Ismael Juma edited comment on KAFKA-14524 at 1/4/23 11:47 AM:
--

[~mdedetrich-aiven] Please check KAFKA-14470, where this is currently in 
progress for the storage layer and where a lot of progress was made in a 
short period of time. Introducing Scala (along with the Scala version suffixes) 
to each of the new modules will make things more complicated than doing both 
changes at once.

I think it makes sense to complete that module and then evaluate whether the 
concerns that have been raised here are real or theoretical.


was (Author: ijuma):
[~mdedetrich-aiven] Please check KAFKA-14470, where this is currently in 
progress for the storage layer and where a lot of progress was made in a 
short period of time. Introducing Scala (along with the Scala version suffixes) 
to each of the new modules will make things more complicated than doing both 
changes at once.

> Modularize `core` monolith
> --------------------------
>
> Key: KAFKA-14524
> URL: https://issues.apache.org/jira/browse/KAFKA-14524
> Project: Kafka
> Issue Type: Improvement
> Reporter: Ismael Juma
> Priority: Major
>
> The `core` module has grown too large and it's time to split it into multiple 
> modules. A much slimmer `core` module will remain in the end.
> Evidence of `core` growing too large is that it takes 1m10s to compile the 
> main code and tests and it takes hours to run all the tests sequentially.
> As part of this effort, we should rewrite the Scala code in Java to reduce 
> developer friction, reduce compilation time and simplify deployment (i.e. we 
> can remove the Scala version suffix from the module name). Scala may have a 
> number of advantages over Java 8 (minimum version we support now) and Java 11 
> (minimum version we will support in Kafka 4.0), but a mixture of Scala and 
> Java (as we have now) is more complex than just Java.
> Another benefit is that code dependencies will be strictly enforced, which 
> will hopefully help ensure better abstractions.
> This pattern was started with the `tools` (but not completed), `metadata` and 
> `raft` modules and we have (when this ticket was filed) a couple more in 
> progress: `group-coordinator` and `storage`.
> This is an umbrella ticket and it will link to each ticket related to this 
> goal.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (KAFKA-14524) Modularize `core` monolith

2023-01-04 Thread Matthew de Detrich (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654413#comment-17654413
 ] 

Matthew de Detrich edited comment on KAFKA-14524 at 1/4/23 11:45 AM:
-

I believe the main intention of this ticket is to modularize the Kafka core 
(which I don't think anyone disagrees with), but the Scala removal has been 
tacked on without a good justification for why these orthogonal concerns are 
being handled together. Combining two such massive tasks seems to make the 
scope of this too big (splitting out the core without migrating away from Scala 
carries far less risk). I'm also wondering whether the move away from Scala 
would qualify as a change requiring consensus, mainly so that the community's 
opinion on this topic is heard. For example, Flink, which decided to make a 
similar change, consulted the community via discussion threads (see 
[https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h] and 
[https://lists.apache.org/thread/j1fggrbh7hl4pfqcqqq7p527kvdgk35s] as examples).

May I suggest that instead of doing both the modularization and the 
Scala-to-Java migration at once, we just do the modularization and do the 
migration later, once the community has been consulted?


was (Author: mdedetrich-aiven):
I believe the main intention of this ticket is to modularize the Kafka core 
(which I don't think anyone disagrees with), but the Scala removal has been 
tacked on without a good justification for why these orthogonal concerns are 
being handled together. Combining two such massive tasks seems to make the 
scope of this too big (splitting out the core without migrating away from Scala 
carries far less risk). I'm also wondering whether the move away from Scala 
would qualify as a change requiring consensus, mainly so that the community's 
opinion on this topic is heard. For example, Flink, which decided to make a 
similar change, consulted the community via discussion threads (see 
[https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h] and 
[https://lists.apache.org/thread/j1fggrbh7hl4pfqcqqq7p527kvdgk35s] as an 
example).

May I suggest that instead of doing both the modularization and the 
Scala-to-Java migration at once, we just do the modularization and do the 
migration later, once the community has been consulted?



[jira] [Comment Edited] (KAFKA-14524) Modularize `core` monolith

2023-01-04 Thread Matthew de Detrich (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654413#comment-17654413
 ] 

Matthew de Detrich edited comment on KAFKA-14524 at 1/4/23 11:44 AM:
-

I believe the main intention of this ticket is to modularize the Kafka core 
(which I don't think anyone disagrees with), but the Scala removal has been 
tacked on without a good justification for why these orthogonal concerns are 
being handled together. Combining two such massive tasks seems to make the 
scope of this too big (splitting out the core without migrating away from Scala 
carries far less risk). I'm also wondering whether the move away from Scala 
would qualify as a change requiring consensus, mainly so that the community's 
opinion on this topic is heard. For example, Flink, which decided to make a 
similar change, consulted the community via discussion threads (see 
[https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h] and 
[https://lists.apache.org/thread/j1fggrbh7hl4pfqcqqq7p527kvdgk35s] as an 
example).

May I suggest that instead of doing both the modularization and the 
Scala-to-Java migration at once, we just do the modularization and do the 
migration later, once the community has been consulted?


was (Author: mdedetrich-aiven):
I believe the main intention of this ticket is to modularize the Kafka core 
(which I don't think anyone disagrees with), but the Scala removal has been 
tacked on without a good justification for why these orthogonal concerns are 
being handled together. Combining two such massive tasks seems to make the 
scope of this too big (splitting out the core without migrating away from Scala 
carries far less risk). I'm also wondering whether the move away from Scala 
would qualify as a change requiring consensus, mainly so that the community's 
opinion on this topic is heard. For example, Flink, which decided to make a 
similar change, consulted the community via discussion threads (see 
https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h as an example).

May I suggest that instead of doing both the modularization and the 
Scala-to-Java migration at once, we just do the modularization and do the 
migration later, once the community has been consulted?



[jira] [Comment Edited] (KAFKA-14524) Modularize `core` monolith

2023-01-04 Thread Ismael Juma (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654358#comment-17654358
 ] 

Ismael Juma edited comment on KAFKA-14524 at 1/4/23 9:08 AM:
-

I disagree that the risk is high. We have already done this once before (both 
the protocol layer and the record layer were converted to Java when the Java 
clients were introduced) and the evidence shows that the project can make this 
type of change with minimal negative impact (our extensive tests help a lot).

I am deeply familiar with the low-level details of both Java and Scala, and I 
don't foresee any issue with the bytecode translation concern you raised.


was (Author: ijuma):
I disagree that the risk is high. We have already done this once before (both 
the protocol layer and the record layer were converted to Java when the Java 
clients were introduced) and the evidence shows that the project can make this 
type of change with minimal negative impact (our extensive tests help a lot).

 
