GitHub user nickwallen reopened a pull request:

    https://github.com/apache/metron/pull/1012

    METRON-1551 Profiler Should Not Use Java Serialization

    When running the Profiler in a topology where serialization occurs, the 
following error happens.  This can occur when the number of workers is greater 
than 1.
    
    The topology should not be using Java serialization for serializing tuple 
values as this will negatively impact performance. 
    
    ```
    2018-05-09 10:48:35.136 o.a.s.d.executor [ERROR] 
    java.lang.RuntimeException: java.lang.RuntimeException: 
java.io.NotSerializableException: 
org.apache.metron.common.configuration.profiler.ProfileResult
     at 
org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:485)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:451)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.disruptor$consume_loop_STAR_$fn__7183.invoke(disruptor.clj:83) 
~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at org.apache.storm.util$async_loop$fn__553.invoke(util.clj:484) 
[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
    Caused by: java.lang.RuntimeException: java.io.NotSerializableException: 
org.apache.metron.common.configuration.profiler.ProfileResult
     at 
org.apache.storm.serialization.SerializableSerializer.write(SerializableSerializer.java:41)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) 
~[kryo-3.0.3.jar:?]
     at 
com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100)
 ~[kryo-3.0.3.jar:?]
     at 
com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
 ~[kryo-3.0.3.jar:?]
     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534) 
~[kryo-3.0.3.jar:?]
     at 
org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:44)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:44)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.daemon.worker$mk_transfer_fn$transfer_fn__7805.invoke(worker.clj:193)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.daemon.executor$start_batch_transfer__GT_worker_handler_BANG_$fn__7430.invoke(executor.clj:309)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.disruptor$clojure_handler$reify__7166.onEvent(disruptor.clj:40)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     ... 6 more
    Caused by: java.io.NotSerializableException: 
org.apache.metron.common.configuration.profiler.ProfileResult
     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) 
~[?:1.8.0_162]
     at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548) 
~[?:1.8.0_162]
     at 
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509) 
~[?:1.8.0_162]
     at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432) 
~[?:1.8.0_162]
     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178) 
~[?:1.8.0_162]
     at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348) 
~[?:1.8.0_162]
     at 
org.apache.storm.serialization.SerializableSerializer.write(SerializableSerializer.java:38)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) 
~[kryo-3.0.3.jar:?]
     at 
com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100)
 ~[kryo-3.0.3.jar:?]
     at 
com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
 ~[kryo-3.0.3.jar:?]
     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534) 
~[kryo-3.0.3.jar:?]
     at 
org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:44)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:44)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.daemon.worker$mk_transfer_fn$transfer_fn__7805.invoke(worker.clj:193)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.daemon.executor$start_batch_transfer__GT_worker_handler_BANG_$fn__7430.invoke(executor.clj:309)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.disruptor$clojure_handler$reify__7166.onEvent(disruptor.clj:40)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     at 
org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472)
 ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
     ... 6 more
    ```
    
    ### Changes
    
    * Defined a configuration property `topology.kryo.register` for Storm that 
specifies the classes which should use Kryo serialization.  Storm requires that 
you specifically tell it which classes should undergo Kryo serialization.  
    
    * If an advanced user creates a profile that results in a user-defined type 
(a class not already in `topology.kryo.register`) this property allows them to 
register the class for Kryo serialization.  This could happen if a user is 
using a 3rd party library containing Stellar functions.
    
    * Updated the existing integration tests to force them to serialize tuples 
and to fail if Java serialization is used.
    
    * Created another integration test that creates a profile that uses the 
STATS library.  This ensures that the STATS library can be Kryo serialized.
    
    * Updated the README to reflect this change.
    
    * Updated all Profiler classes to be Java serializable in case a user 
chooses to use Java serialization in their Storm topology.  This is not 
recommended, but there is no reason to block a user from doing so.
    
    ### Manual Testing
    
    1. Launch the development environment.
    
    1. Alter the Profiler properties. 
    
        Set the Profiler duration to 1 minute.
        ```
        profiler.period.duration=1
        profiler.period.duration.units=MINUTES
        ```
    
        Force the topology to fail if Kryo serialization is not used.
        ```
        topology.fall.back.on.java.serialization=false
        ```
    
        Force the topology to serialize values, even with a single worker.
        ```
        topology.testing.always.try.serialize=true
        ```
    
    1. Restart the Profiler.
    
        ```
        source /etc/default/metron
        storm kill profiler
        $METRON_HOME/bin/start_profiler_topology.sh
        ```
    
    1. Create a Profile that uses the STATS library.
    
        ```
        $METRON_HOME/bin/stellar -z $ZOOKEEPER
        ```
    
        ```
        [Stellar]>>> conf := SHELL_EDIT()
        {
          "profiles": [
            {
              "profile": "profile-with-stats",
              "foreach": "'global'",
              "init":   { "stats": "STATS_INIT()" },
              "update": { "stats": "STATS_ADD(stats, 1)" },
              "result": "stats"
            }
          ],
          "timestampField": "timestamp"
        }
        [Stellar]>>> CONFIG_PUT("PROFILER",conf)
        ```
    
    1. Wait until a flush event occurs, then attempt to retrieve the profile 
values.  If values can be retrieved, the change has been successful.
    
        ```
        [Stellar]>>> stats := PROFILE_GET("profile-with-stats", "global", 
PROFILE_FIXED(30, "DAYS"))
        [org.apache.metron.statistics.OnlineStatisticsProvider@c8bab446, 
org.apache.metron.statistics.OnlineStatisticsProvider@2990520a, 
org.apache.metron.statistics.OnlineStatisticsProvider@f6affafd, 
org.apache.metron.statistics.OnlineStatisticsProvider@de0d511, 
org.apache.metron.statistics.OnlineStatisticsProvider@9247daa4, 
org.apache.metron.statistics.OnlineStatisticsProvider@ddef62a6, 
org.apache.metron.statistics.OnlineStatisticsProvider@2e6874a6, 
org.apache.metron.statistics.OnlineStatisticsProvider@8568c433, 
org.apache.metron.statistics.OnlineStatisticsProvider@dc65d27e, 
org.apache.metron.statistics.OnlineStatisticsProvider@28b2e728]
        ```
        ```
        [Stellar]>>> STATS_MEAN(STATS_MERGE(stats))
        1.0
        ```
    
    
    
     
    
    
    
    ## Pull Request Checklist
    
    - [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
    - [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
    - [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?
    - [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
    - [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
    - [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
    - [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
    - [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
    - [x] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nickwallen/metron METRON-1551

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/1012.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1012
    
----
commit 387cd2de46b6a2392d1fe7a537fc8107b3bf4a59
Author: Nick Allen <nick@...>
Date:   2018-05-11T19:22:00Z

    METRON-1551 Profiler Should Not Use Java Serialization

commit 2898e30dc5537bdfe850462fcb98a32f176acd47
Author: Nick Allen <nick@...>
Date:   2018-05-12T18:31:44Z

    Fixed a broken hyperlink in the README

commit 788b86166e867d99c3e499b04a3f2480a05b3ddf
Author: Nick Allen <nick@...>
Date:   2018-05-14T16:28:39Z

    Made all Profiler classes serializable in case a user chooses Java 
serialization in their topology.  Test cases added to validate

commit db9446e1cbf73cd1e853c2d7af65169882ca9044
Author: Nick Allen <nick@...>
Date:   2018-05-14T17:43:47Z

    Added Java serializable check to test classes and also added 'Serializable' 
where applicable

commit 13f9a2dad60b7e9b6cd8a31302d926312cc8c66f
Author: Nick Allen <nick@...>
Date:   2018-05-14T18:39:30Z

    Added extra details in error message

----


---

Reply via email to