[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691817#comment-15691817
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/24/16 1:03 AM:


The link you reference re virtual calls.

This is much more about monomorphic call or polymorphic calls. Making a class 
that implements an interface final, where the method invocation is by interface 
methods, does not change this.

This is more to do with the number of class's loaded/invoked that implement the 
interface. 

So in case of single implementation being used and loaded your jvm you have a 
monomorphic case for the interface, the JVM will inline this (final or not).

If you happen to have two implementations being used and loaded the jvm will 
still be able to inline but will create a branch case, the second loaded 
implementation will be slower if invoked due to the branch.

If you have more than two implementations loaded the JVM will on loading these 
do on stack replacement of the previously loaded inlined, and move to using 
virtual(jump) tables.


You'll see this occur if you turn on -XX:+PrintCompilation

A classical implementation/test and write up showing this is:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note taking the code in the blog, and running it with or without final 
implementations makes no difference.

Also i've taken this test from the above blog, for your final and non final 
cases (i've attached to this jira), if you note I've uploaded two versions, one 
with the final being declared and loaded by the JVM first and vice versa. As 
you note in both the implementation loaded first due to the inlined branch will 
be more performant.

On checking your original test case we noted that the FinalByteArraySerializer 
version runs first (due to alphabetic ordering that test are run in) , as such 
it would be always the first in the inline branch benefitting from this, this 
would explain why it seems always final was negligible faster when running your 
benchmark test case.



was (Author: michael.andre.pearce):
The link you reference re virtual calls.

This is much more about monomorphic call or polymorphic calls. Making a class 
that implements an interface final, where the method invocation is by interface 
methods, does not change this.

This is more to do with the number of class's loaded/invoked that implement the 
interface. 

So in case of single implementation being used and loaded your jvm you have a 
monomorphic case for the interface, the JVM will inline this (final or not).

If you happen to have two implementations being used and loaded the jvm will 
still be able to inline but will create a branch case, the second loaded 
implementation will be slower if invoked due to the branch.

If you have more than two implementations loaded the JVM will on loading these 
do on stack replacement of the previously loaded inlined, and move to using 
virtual tables.


You'll see this occur if you turn on -XX:+PrintCompilation

A classical implementation/test and write up showing this is:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note taking the code in the blog, and running it with or without final 
implementations makes no difference.

Also i've taken this test from the above blog, for your final and non final 
cases (i've attached to this jira), if you note I've uploaded two versions, one 
with the final being declared and loaded by the JVM first and vice versa. As 
you note in both the implementation loaded first due to the inlined branch will 
be more performant.

On checking your original test case we noted that the FinalByteArraySerializer 
version runs first (due to alphabetic ordering that test are run in) , as such 
it would be always the first in the inline branch benefitting from this, this 
would explain why it seems always final was negligible faster when running your 
benchmark test case.


> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
> Attachments: FinalTest.java, FinalTestReversed.java
>
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691817#comment-15691817
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/24/16 12:59 AM:
-

The link you reference re virtual calls.

This is much more about monomorphic call or polymorphic calls. Making a class 
that implements an interface final, where the method invocation is by interface 
methods, does not change this.

This is more to do with the number of class's loaded/invoked that implement the 
interface. 

So in case of single implementation being used and loaded your jvm you have a 
monomorphic case for the interface, the JVM will inline this (final or not).

If you happen to have two implementations being used and loaded the jvm will 
still be able to inline but will create a branch case, the second loaded 
implementation will be slower if invoked due to the branch.

If you have more than two implementations loaded the JVM will on loading these 
do on stack replacement of the previously loaded inlined, and move to using 
virtual tables.


You'll see this occur if you turn on -XX:+PrintCompilation

A classical implementation/test and write up showing this is:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note taking the code in the blog, and running it with or without final 
implementations makes no difference.

Also i've taken this test from the above blog, for your final and non final 
cases (i've attached to this jira), if you note I've uploaded two versions, one 
with the final being declared and loaded by the JVM first and vice versa. As 
you note in both the implementation loaded first due to the inlined branch will 
be more performant.

On checking your original test case we noted that the FinalByteArraySerializer 
version runs first (due to alphabetic ordering that test are run in) , as such 
it would be always the first in the inline branch benefitting from this, this 
would explain why it seems always final was negligible faster when running your 
benchmark test case.



was (Author: michael.andre.pearce):
The link you reference re virtual calls.

This is much more about monomorphic call or polymorphic calls. Making a class 
that implements an interface final, where the method invocation is by interface 
methods, does not change this.

This is more to do with the number of class's loaded/invoked that implement the 
interface. 

So in case of single implementation being used and loaded your jvm you have a 
monomorphic case for the interface, the JVM will inline this (final or not).

If you happen to have two implementations being used and loaded the jvm will 
still be able to inline but will create a branch case, the second loaded 
implementation will be slower if invoked due to the branch.

If you have more than two implementations loaded the JVM will on loading these 
do on stack replacement of the previously loaded inlined, and move to using 
virtual tables.


You'll see this occur if you turn on -XX:+PrintCompilation

A classical implementation and write up showing this is:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note taking the code, and running it with or without final 
implementations makes no difference.

Also i've taken this classical test, for your final and non final cases 
(attached to this jira), if you note I've loaded two versions, one with the 
final being declared and loaded by the JVM first and vice versa. As you note in 
both the implementation loaded first due to the inlined branch will be more 
performant.

On checking your original test case we noted that the FinalByteArraySerializer 
version runs first (due to alphabetic ordering that test are run in) , as such 
it would be always the first in the inline branch benefitting from this, this 
would explain why it seems always final was negligible faster when running your 
benchmark test case.


> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
> Attachments: FinalTest.java, FinalTestReversed.java
>
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691817#comment-15691817
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/24/16 12:57 AM:
-

The link you reference re virtual calls.

This is much more about monomorphic call or polymorphic calls. Making a class 
that implements an interface final, where the method invocation is by interface 
methods, does not change this.

This is more to do with the number of class's loaded/invoked that implement the 
interface. 

So in case of single implementation being used and loaded your jvm you have a 
monomorphic case for the interface, the JVM will inline this (final or not).

If you happen to have two implementations being used and loaded the jvm will 
still be able to inline but will create a branch case, the second loaded 
implementation will be slower if invoked due to the branch.

If you have more than two implementations loaded the JVM will on loading these 
do on stack replacement of the previously loaded inlined, and move to using 
virtual tables.


You'll see this occur if you turn on -XX:+PrintCompilation

A classical implementation and write up showing this is:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note taking the code, and running it with or without final 
implementations makes no difference.

Also i've taken this classical test, for your final and non final cases 
(attached to this jira), if you note I've loaded two versions, one with the 
final being declared and loaded by the JVM first and vice versa. As you note in 
both the implementation loaded first due to the inlined branch will be more 
performant.

On checking your original test case we noted that the FinalByteArraySerializer 
version runs first (due to alphabetic ordering that test are run in) , as such 
it would be always the first in the inline branch benefitting from this, this 
would explain why it seems always final was negligible faster when running your 
benchmark test case.



was (Author: michael.andre.pearce):
The link you reference re virtual calls.

This is much more about monomorphic call or polymorphic calls. Making a class 
that implements an interface final, where the method invocation is by interface 
methods, does not change this.

This is more to do with the number of class's loaded that implement the 
interface. 

So in case of single implementation being used and loaded your jvm you have a 
monomorphic case for the interface, the JVM will inline this (final or not).

If you happen to have two implementations being used and loaded the jvm will 
still be able to inline but will create a branch case, the second loaded 
implementation will be slower if invoked due to the branch.

If you have more than two implementations loaded the JVM will on loading these 
do on stack replacement of the previously loaded inlined, and move to using 
virtual tables.


You'll see this occur if you turn on -XX:+PrintCompilation

A classical implementation and write up showing this is:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note taking the code, and running it with or without final 
implementations makes no difference.

Also i've taken this classical test, for your final and non final cases 
(attached to this jira), if you note I've loaded two versions, one with the 
final being declared and loaded by the JVM first and vice versa. As you note in 
both the implementation loaded first due to the inlined branch will be more 
performant.

On checking your original test case we noted that the FinalByteArraySerializer 
version runs first (due to alphabetic ordering that test are run in) , as such 
it would be always the first in the inline branch benefitting from this, this 
would explain why it seems always final was negligible faster when running your 
benchmark test case.


> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
> Attachments: FinalTest.java, FinalTestReversed.java
>
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691817#comment-15691817
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/24/16 12:55 AM:
-

The link you reference re virtual calls.

This is much more about monomorphic call or polymorphic calls. Making a class 
that implements an interface final, where the method invocation is by interface 
methods, does not change this.

This is more to do with the number of class's loaded that implement the 
interface. 

So in case of single implementation being used and loaded your jvm you have a 
monomorphic case for the interface, the JVM will inline this (final or not).

If you happen to have two implementations being used and loaded the jvm will 
still be able to inline but will create a branch case, the second loaded 
implementation will be slower if invoked due to the branch.

If you have more than two implementations loaded the JVM will on loading these 
do on stack replacement of the previously loaded inlined, and move to using 
virtual tables.


You'll see this occur if you turn on -XX:+PrintCompilation

A classical implementation and write up showing this is:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note taking the code, and running it with or without final 
implementations makes no difference.

Also i've taken this classical test, for your final and non final cases 
(attached to this jira), if you note I've loaded two versions, one with the 
final being declared and loaded by the JVM first and vice versa. As you note in 
both the implementation loaded first due to the inlined branch will be more 
performant.

On checking your original test case we noted that the FinalByteArraySerializer 
version runs first (due to alphabetic ordering that test are run in) , as such 
it would be always the first in the inline branch benefitting from this, this 
would explain why it seems always final was negligible faster when running your 
benchmark test case.



was (Author: michael.andre.pearce):
The link you reference re virtual calls.

This is much more about monomorphic call or polymorphic calls. Making a class 
that implements an interface final, where the method invocation is by interface 
methods, does not change this.

This is more to do with the number of class's loaded that implement the 
interface. 

So in case of single implementation being used and loaded your jvm you have a 
monomorphic case for the interface, the JVM will inline this (final or not).

If you happen to have two implementations being used and loaded the jvm will 
still be able to inline but will create a branch case, the second loaded 
implementation will be slower if invoked due to the branch.

If you have more than two implementations loaded the JVM will on loading these 
do on stack replacement of the previously loaded inlined, and move to using 
virtual tables.


You'll see this occur if you turn on -XX:+PrintCompilation

A classical implementation and write up showing this is:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note taking the code, and running it with or without final 
implementations makes no difference.

Also i've taken this classical test, for your final and non final cases 
(attached to this jira), if you note I've loaded two versions, one with the 
final being declared and loaded by the JVM first and vice versa. As you not in 
both the implementation loaded first due to the inlined branch will be more 
performant.

On checking your original test case we noted that the FinalByteArraySerializer 
version runs first (due to alphabetic ordering that test are run in) , as such 
it would be always the first in the inline branch benefitting from this, this 
would explain why it seems always final was negligible faster when running your 
benchmark test case.


> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
> Attachments: FinalTest.java, FinalTestReversed.java
>
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691817#comment-15691817
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/24/16 12:54 AM:
-

The link you reference re virtual calls.

This is much more about monomorphic call or polymorphic calls. Making a class 
that implements an interface final, where the method invocation is by interface 
methods, does not change this.

This is more to do with the number of class's loaded that implement the 
interface. 

So in case of single implementation being used and loaded your jvm you have a 
monomorphic case for the interface, the JVM will inline this (final or not).

If you happen to have two implementations being used and loaded the jvm will 
still be able to inline but will create a branch case, the second loaded 
implementation will be slower if invoked due to the branch.

If you have more than two implementations loaded the JVM will on loading these 
do on stack replacement of the previously loaded inlined, and move to using 
virtual tables.


You'll see this occur if you turn on -XX:+PrintCompilation

A classical implementation and write up showing this is:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note taking the code, and running it with or without final 
implementations makes no difference.

Also i've taken this classical test, for your final and non final cases 
(attached to this jira), if you note I've loaded two versions, one with the 
final being declared and loaded by the JVM first and vice versa. As you not in 
both the implementation loaded first due to the inlined branch will be more 
performant.

On checking your original test case we noted that the FinalByteArraySerializer 
version runs first (due to alphabetic ordering that test are run in) , as such 
it would be always the first in the inline branch benefitting from this, this 
would explain why it seems always final was negligible faster when running your 
benchmark test case.



was (Author: michael.andre.pearce):
The link you reference re virtual calls.

This is much more about monomorphic call or polymorphic calls. Making a class 
that implements an interface final, where the method invocation is by interface 
methods, does not change this.

This is more to do with the number of class's loaded that implement the 
interface. 

So in case of single implementation being used and loaded your jvm you have a 
monomorphic case for the interface, the JVM will inline this (final or not).

If you happen to have two implementations being used and loaded the jvm will 
still be able to inline but will create a branch case, the second loaded 
implementation will be slower if invoked due to the branch.

If you have more than two implementations loaded the JVM will on loading these 
do on stack replacement of the previously loaded inlined, and more to using 
visual tables.


You'll see this occur if you turn on -XX:+PrintCompilation

A classical implementation and write up showing this is:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note taking the code, and running it with or without final 
implementations makes no difference.

Also i've taken this classical test, for your final and non final cases 
(attached to this jira), if you note I've loaded two versions, one with the 
final being declared and loaded by the JVM first and vice versa. As you not in 
both the implementation loaded first due to the inlined branch will be more 
performant.

On checking your original test case we noted that the FinalByteArraySerializer 
version runs first (due to alphabetic ordering that test are run in) , as such 
it would be always the first in the inline branch benefitting from this, this 
would explain why it seems always final was negligible faster when running your 
benchmark test case.


> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
> Attachments: FinalTest.java, FinalTestReversed.java
>
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691736#comment-15691736
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/24/16 12:12 AM:
-

Just reading the link you reference.

Virtual calls and jump tables is monomorphic vs polymorphic calls.

If only one implementation(this implementation being final or not) of an 
interface is loaded then you will get a monomorphic implementation and it can 
be fully inlined. 

If you load another, your method is inlined but with a branch, once you load 
further hotspot will do on stack replacement and this is when jump tables are 
needed.

You can see this occurring with on stack replacement with 
-XX:+PrintCompilation, as classes are loaded.

A classical test case showing this:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note that making the class's final or not in the test, makes no 
difference to the outcome.

In case of serialisers/deserialisers this is our case as we have an interface 
that is implemented, as such making these implementations final doesn't help, 
as all reference by calling classes are invoking methods on the interface, as 
such its performance is about how many implementations are loaded.


was (Author: michael.andre.pearce):
Just reading the link you reference.

Virtual calls and jump tables is monomorphic vs polymorphic calls.

If only one implementation(this implementation being final or not) of an 
interface is loaded then you will get a monomorphic implementation and it can 
be fully inlined. 

If you load another, your method is inlined but with a branch, once you load 
further hotspot will get on stack replacement and this is when jump tables are 
needed.

You can see this occurring with on stack replacement with 
-XX:+PrintCompilation, as classes are loaded.

A classical test case showing this:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note that making the class's final or not in the test, makes no 
difference to the outcome.

In case of serialisers/deserialisers this is our case as we have an interface 
that is implemented, as such making these implementations final doesn't help, 
as all reference by calling classes are invoking methods on the interface, as 
such its performance is about how many implementations are loaded.

> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691736#comment-15691736
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/24/16 12:12 AM:
-

Just reading the link you reference.

Virtual calls and jump tables is monomorphic vs polymorphic calls.

If only one implementation(this implementation being final or not) of an 
interface is loaded then you will get a monomorphic implementation and it can 
be fully inlined. 

If you load another, your method is inlined but with a branch, once you load 
further hotspot will get on stack replacement and this is when jump tables are 
needed.

You can see this occurring with on stack replacement with 
-XX:+PrintCompilation, as classes are loaded.

A classical test case showing this:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note that making the class's final or not in the test, makes no 
difference to the outcome.

In case of serialisers/deserialisers this is our case as we have an interface 
that is implemented, as such making these implementations final doesn't help, 
as all reference by calling classes are invoking methods on the interface, as 
such its performance is about how many implementations are loaded.


was (Author: michael.andre.pearce):
Just reading the link you reference.

Virtual calls and jump tables is monomorphic vs polymorphic calls.

If only one implementation(this implementation being final or not) of an 
interface is loaded then you will get a monomorphic implementation and it can 
be fully inlined. 

If you load another, your method is inlined but with a branch, once you load 
further hotspot will get on stack replacement and this is when jump tables are 
needed.

You can see this occurring with on stack replacement with 
-XX:+PrintCompilation, as classes are loaded.

A classical test case showing this:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note that making the class's final or not in the test, makes no 
difference to the outcome.

In case of serialisers/deserialisers this is our case as we have an interface 
that is implemented, as such making these implementations final doesn't do much.

> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691571#comment-15691571
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/23/16 11:30 PM:
-

Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm are 
running RHEL7.

The above stats copied are from running in particular on a laptop (but are 
inline with what we've seen also in our server environment) just easier to copy 
from as our server environments are protected.

MacBook Pro (Retina, 15-inch, Mid 2015)
2.2 GHz Intel Core i7
16 GB 1600 MHz DDR3


This is what we have come to expect, essentially making a class final doesn't 
give as much a perfomance boost as people come to think, as the modern jvm 
compilers like hotspot (there is another commercial one which I'm sure we all 
know of ;) but as its propriety/commercial shall avoid it for this discussion 
), they really do a lot of magic under the hood for us.

Please read:
http://www.oracle.com/technetwork/java/whitepaper-135217.html#impact
http://www.oracle.com/technetwork/java/whitepaper-135217.html#optimizations

Whilst this article (by  Brian Goetz - his credentials kinda speak for them 
selves) from way back when is very dated and now not relevant in regards to 
precise jvm internals as a lot has moved on. 
http://www.ibm.com/developerworks/java/library/j-jtp1029/index.html

There is one core take away i think is important, and is more important when 
making a decision to use final or not.

"
final classes and methods can be a significant inconvenience when programming 
-- they limit your options for reusing existing code and extending the 
functionality of existing classes. While sometimes a class is made final for a 
good reason, such as to enforce immutability, the benefits of using final 
should outweigh the inconvenience. Performance enhancement is almost always a 
bad reason to compromise good object-oriented design principles, and when the 
performance enhancement is small or nonexistent, this is a bad trade-off indeed.
"

This is a change to API level that becomes more restrictive, then i think 
really this should need a KIP, and no doubt some further tests and arguments on 
a KIP discussion for the pros and con's. 

Its worth noting also very recently the ProducerRecord and ConsumerRecord, for 
extensibility reasons were made non-final, if you take the current master.


was (Author: michael.andre.pearce):
Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on 

[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691571#comment-15691571
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/23/16 11:28 PM:
-

Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm are 
running RHEL7.

The above stats copied are from running in particular on a laptop (but are 
inline with what we've seen also in our server environment) just easier to copy 
from as our server environments are protected.

MacBook Pro (Retina, 15-inch, Mid 2015)
2.2 GHz Intel Core i7
16 GB 1600 MHz DDR3


This is what we have come to expect, essentially making a class final doesn't 
give as much a perfomance boost as people come to think, as the modern jvm 
compilers like hotspot (there is another commercial one which I'm sure we all 
know of ;) but as its propriety/commercial shall avoid it for this discussion 
), they really do a lot of magic under the hood for us.

Please read:
http://www.oracle.com/technetwork/java/whitepaper-135217.html#impact

Whilst this article (by  Brian Goetz - his credentials kinda speak for them 
selves) from way back when is very dated and now not relevant in regards to 
precise jvm internals as a lot has moved on. 
http://www.ibm.com/developerworks/java/library/j-jtp1029/index.html

There is one core take away i think is important, and is more important when 
making a decision to use final or not.

"
final classes and methods can be a significant inconvenience when programming 
-- they limit your options for reusing existing code and extending the 
functionality of existing classes. While sometimes a class is made final for a 
good reason, such as to enforce immutability, the benefits of using final 
should outweigh the inconvenience. Performance enhancement is almost always a 
bad reason to compromise good object-oriented design principles, and when the 
performance enhancement is small or nonexistent, this is a bad trade-off indeed.
"

This is a change to API level that becomes more restrictive, then i think 
really this should need a KIP, and no doubt some further tests and arguments on 
a KIP discussion for the pros and con's. 

Its worth noting also very recently the ProducerRecord and ConsumerRecord, for 
extensibility reasons were made non-final, if you take the current master.


was (Author: michael.andre.pearce):
Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand 

[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691571#comment-15691571
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/23/16 11:23 PM:
-

Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm are 
running RHEL7.

The above stats copied are from running in particular on a laptop (but are 
inline with what we've seen also in our server environment) just easier to copy 
from as our server environments are protected.

MacBook Pro (Retina, 15-inch, Mid 2015)
2.2 GHz Intel Core i7
16 GB 1600 MHz DDR3


This is what we have come to expect, essentially making a class final doesn't 
give as much a perfomance boost as people come to think, as the modern jvm 
compilers like hotspot (there is another commercial one which I'm sure we all 
know of ;) but as its propriety/commercial shall avoid it for this discussion 
), they really do a lot of magic under the hood for us.

Whilst this article (by  Brian Goetz - his credentials kinda speak for them 
selves) from way back when is very dated and now not relevant in regards to 
precise jvm internals as a lot has moved on. 
http://www.ibm.com/developerworks/java/library/j-jtp1029/index.html

There is one core take away i think is important, and is more important when 
making a decision to use final or not.

"
final classes and methods can be a significant inconvenience when programming 
-- they limit your options for reusing existing code and extending the 
functionality of existing classes. While sometimes a class is made final for a 
good reason, such as to enforce immutability, the benefits of using final 
should outweigh the inconvenience. Performance enhancement is almost always a 
bad reason to compromise good object-oriented design principles, and when the 
performance enhancement is small or nonexistent, this is a bad trade-off indeed.
"

This is a change to API level that becomes more restrictive, then i think 
really this should need a KIP, and no doubt some further tests and arguments on 
a KIP discussion for the pros and con's. 

Its worth noting also very recently the ProducerRecord and ConsumerRecord, for 
extensibility reasons were made non-final, if you take the current master.


was (Author: michael.andre.pearce):
Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm 

[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691571#comment-15691571
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/23/16 11:22 PM:
-

Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm are 
running RHEL7.

The above stats copied are from running in particular on a laptop (but are 
inline with what we've seen also in our server environment) just easier to copy 
from as our server environments are protected.

MacBook Pro (Retina, 15-inch, Mid 2015)
2.2 GHz Intel Core i7
16 GB 1600 MHz DDR3


This is what we have come to expect, essentially making a class final doesn't 
give as much a perfomance boost as people come to think, as the modern jvm 
compilers like hotspot (there is another commercial one which I'm sure we all 
know of ;) but as its propriety/commercial shall avoid it for this discussion 
), they really do a lot of magic under the hood for us.

Whilst this article (by  Brian Goetz - his credentials kinda speak for them 
selves) from way back when is very dated and now not relevant in regards to 
precise jvm internals as a lot has moved on. 
http://www.ibm.com/developerworks/java/library/j-jtp1029/index.html

There is one core take away i think is important, and is more important when 
making a decision to use final or not.

"
final classes and methods can be a significant inconvenience when programming 
-- they limit your options for reusing existing code and extending the 
functionality of existing classes. While sometimes a class is made final for a 
good reason, such as to enforce immutability, the benefits of using final 
should outweigh the inconvenience. Performance enhancement is almost always a 
bad reason to compromise good object-oriented design principles, and when the 
performance enhancement is small or nonexistent, this is a bad trade-off indeed.
"

This is a change to API level, then i think really this should need a KIP, and 
no doubt some further tests and arguments on a KIP discussion for the pros and 
con's. 

Its worth noting also very recently the ProducerRecord and ConsumerRecord, for 
extensibility reasons were made non-final, if you take the current master.


was (Author: michael.andre.pearce):
Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm are 
running RHEL7.

The above 

[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691571#comment-15691571
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/23/16 11:12 PM:
-

Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm are 
running RHEL7.

The above stats copied are from running in particular on a laptop (but are 
inline with what we've seen also in our server environment) just easier to copy 
from as our server environments are protected.

MacBook Pro (Retina, 15-inch, Mid 2015)
2.2 GHz Intel Core i7
16 GB 1600 MHz DDR3


This is what we have come to expect, essentially making a class final doesn't 
give as much a perfomance boost as people come to think, as the modern jvm 
compilers like hotspot (there is another commercial one which I'm sure we all 
know of ;) but as its propriety/commercial shall avoid it for this discussion 
), they really do a lot of magic under the hood for us.

Whilst this article (by  Brian Goetz - his credentials kinda speak for them 
selves) from way back when is very dated and now not relevant in regards to 
precise jvm internals as a lot has moved on. 
http://www.ibm.com/developerworks/java/library/j-jtp1029/index.html

There is one core take away i think is important, and is more important when 
making a decision to use final or not.

"
final classes and methods can be a significant inconvenience when programming 
-- they limit your options for reusing existing code and extending the 
functionality of existing classes. While sometimes a class is made final for a 
good reason, such as to enforce immutability, the benefits of using final 
should outweigh the inconvenience. Performance enhancement is almost always a 
bad reason to compromise good object-oriented design principles, and when the 
performance enhancement is small or nonexistent, this is a bad trade-off indeed.
"

As stated, this is a change to API level, then i think really this should need 
a KIP, and no doubt some further tests and arguments on a KIP discussion for 
the pros and con's.

Its worth noting also very recently the ProducerRecord and ConsumerRecord, for 
extensibility reasons were made non-final, if you take the current master.


was (Author: michael.andre.pearce):
Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm are 
running RHEL7.


[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691571#comment-15691571
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/23/16 11:01 PM:
-

Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm are 
running RHEL7.

The above stats copied are from running in particular on a laptop (but are 
inline with what we've seen also in our server environment) just easier to copy 
from as our server environments are protected.

MacBook Pro (Retina, 15-inch, Mid 2015)
2.2 GHz Intel Core i7
16 GB 1600 MHz DDR3


This is what we have come to expect, essentially making a class final doesn't 
give as much a perfomance boost as people come to think, as the modern jvm 
compilers like hotspot (there is another commercial one which I'm sure we all 
know of ;) but as its propriety/commercial shall avoid it for this discussion 
), they really do a lot of magic under the hood for us.

Whilst this article from way back when is very dated and now not relevant in 
regards to precise jvm internals as a lot has moved on. 
http://www.ibm.com/developerworks/java/library/j-jtp1029/index.html

There is one core take away i think is important, and is more important when 
making a decision to use final or not.

"
final classes and methods can be a significant inconvenience when programming 
-- they limit your options for reusing existing code and extending the 
functionality of existing classes. While sometimes a class is made final for a 
good reason, such as to enforce immutability, the benefits of using final 
should outweigh the inconvenience. Performance enhancement is almost always a 
bad reason to compromise good object-oriented design principles, and when the 
performance enhancement is small or nonexistent, this is a bad trade-off indeed.
"

As stated, this is a change to API level, then i think really this should need 
a KIP, and no doubt some further tests and arguments on a KIP discussion for 
the pros and con's.

Its worth noting also very recently the ProducerRecord and ConsumerRecord, for 
extensibility reasons were made non-final, if you take the current master.


was (Author: michael.andre.pearce):
Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm are 
running RHEL7.

The above stats copied are from running in particular on a laptop 

[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691571#comment-15691571
 ] 

Michael Andre Pearce (IG) edited comment on KAFKA-4424 at 11/23/16 10:57 PM:
-

Hi Matthias,

Thanks for putting together some initial tests.

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm are 
running RHEL7.

The above stats copied are from running in particular on a laptop (but are 
inline with what we've seen also in our server environment) just easier to copy 
from as our server environments are protected.

MacBook Pro (Retina, 15-inch, Mid 2015)
2.2 GHz Intel Core i7
16 GB 1600 MHz DDR3


This is what we have come to expect, essentially making a class final doesn't 
give as much a speed boost as people come to think, as the modern jvm compilers 
like hotspot (there is another commercial one which I'm sure we all know of ;) 
but as its propriety/commercial shall avoid it for this discussion ), they 
really do a lot of magic under the hood for us.

Whilst this article from way back when is very dated and now not relevant in 
regards to precise jvm internals as a lot has moved on. 
http://www.ibm.com/developerworks/java/library/j-jtp1029/index.html

There is one core take away i think is important, and is more important when 
making a decision to use final or not.

"
final classes and methods can be a significant inconvenience when programming 
-- they limit your options for reusing existing code and extending the 
functionality of existing classes. While sometimes a class is made final for a 
good reason, such as to enforce immutability, the benefits of using final 
should outweigh the inconvenience. Performance enhancement is almost always a 
bad reason to compromise good object-oriented design principles, and when the 
performance enhancement is small or nonexistent, this is a bad trade-off indeed.
"

As stated, this is a change to API level, then i think really this should need 
a KIP, and no doubt some further tests and arguments on a KIP discussion for 
the pros and con's.

Its worth noting also very recently the ProducerRecord and ConsumerRecord, for 
extensibility reasons were made non-final, if you take the current master.


was (Author: michael.andre.pearce):
Hi Matthias,

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm are 
running RHEL7.

The above stats copied are from running in particular on a laptop (but are 
inline with what we've seen also in our 

[jira] [Comment Edited] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Matthias Bechtold (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690654#comment-15690654
 ] 

Matthias Bechtold edited comment on KAFKA-4424 at 11/23/16 4:50 PM:


Thank you for your feedback, I can totally see your point.

So I did some tests concerning the *producer* throughput, using JMH:

https://github.com/MatthiasBechtold/kafka-throughput-test

I think performance somewhat varies based on test envrionment/method.

Anyways I experienced a small performance gain after 500 seconds warmup 
(between 0.26% - 5.5%) , which is of course somewhat greater in warmup phase 
(this also helps, doesn't it?).

This is not an end-to-end test as proposed, but I did not figure out how to 
quickly implement such test. This test should also provide at least some 
evidence towards possible performance gains.


was (Author: matthiasbechtold):
Thank you for your feedback, I can totally see your point.

So I did some tests concerning the *producer* throughput, using JMH:

https://github.com/MatthiasBechtold/kafka-throughput-test

I think performance somewhat varies based on test envrionment/method.

Anyways I experienced a small performance gain (between 0,26% - 5,5%) , which 
is of course somewhat greater in warmup phase (this also helps, doesn't it?).

This is not an end-to-end test as proposed, but I did not figure out how to 
quickly implement such test. This test should also provide at least some 
evidence towards possible performance gains.

> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)