[GitHub] [incubator-druid] mihai-cazacu-adswizz edited a comment on issue #7597: [materialized view] The generated specification is too big
mihai-cazacu-adswizz edited a comment on issue #7597: [materialized view] The generated specification is too big
URL: https://github.com/apache/incubator-druid/issues/7597#issuecomment-514067508

I have this error again. One interval contains `3,130 segments` and the generated specification is huge.

```
/opt/druid/current/log/druid.log:2019-07-22T13:16:23,702 ERROR [MaterializedViewSupervisor-mv_some-ds] org.apache.druid.java.util.emitter.core.HttpPostEmitter - Event too large to emit (340,420,078 > 1,047,552): {"feed":"alerts","timestamp":"2019-07-22T13:16:22.627Z","service":"druid/overlord","host":"ec2-34-240-16-186.eu-west-1.compute.amazonaws.com:8090","version":"0.13.0-incubating","severity":"component-failure","description":"uncaught exception in MaterializedViewSupervisor-mv_some-ds.","data":{"class":"org.apache.druid.indexing.materializedview.MaterializedViewSupervisor","exceptionType":"java.lang.RuntimeException","exceptionMessage":"java.lang.RuntimeException: org.skife.jdbi.v2.exceptions.CallbackFailedException: org.skife.jdbi.v2.exceptions.UnableToExecuteStatementException: com.mysql.jdbc.PacketTooBigException: Packet for query is too large (29397252 > 16384000). You can change this value on the server by setting the max_allowed_packet' variable. [statement:\"INSERT INTO druid_tasks (id, created_date, datasource, payload, active, status_payload) VALUES (:id, :created_date, :datasource, :payload, :active, :status_payload)\", located:\"INSERT INTO druid_tasks (id, created_date, ...
```

I am going to increase the value of `maxZnodeBytes` to `32MB` but I don't think that this is the best solution. Why don't you compress the spec or maybe, even better, store it in DB (instead of sending it to ZK)?

Thank you!

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
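The compression idea raised in the comment on issue #7597 above can be tried out with a rough sketch like the one below. It is not Druid code: the class name, the `buildSyntheticSpec` helper, and the JSON layout are all hypothetical stand-ins for a spec that lists 3,130 near-identical segment descriptors. Only `java.util.zip.GZIPOutputStream` from the JDK is used. The point is simply that a highly repetitive payload tends to shrink a lot under gzip; the real ratio for an actual task spec would need to be measured.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

final class SpecCompressionSketch {

    // Builds a synthetic, repetitive JSON document.
    // ASSUMPTION: this layout is invented for illustration and is NOT the real Druid task spec format.
    static String buildSyntheticSpec(int segmentCount) {
        StringBuilder sb = new StringBuilder("{\"segments\":[");
        for (int i = 0; i < segmentCount; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"dataSource\":\"some-ds\",")
              .append("\"interval\":\"2019-02-21T00:00:00.000Z/2019-02-22T00:00:00.000Z\",")
              .append("\"partitionNum\":").append(i).append('}');
        }
        return sb.append("]}").toString();
    }

    // Compresses a byte array with gzip and returns the compressed bytes.
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream flushes the remaining data and writes the gzip trailer.
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = buildSyntheticSpec(3130).getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("gzipped bytes: " + packed.length);
    }
}
```

Whether compression alone would be enough depends on how the real payload is built; storing the spec in the metadata DB, as the comment also proposes, would avoid the ZooKeeper size limit altogether.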
[GitHub] [incubator-druid] mihai-cazacu-adswizz edited a comment on issue #7597: [materialized view] The generated specification is too big
mihai-cazacu-adswizz edited a comment on issue #7597: [materialized view] The generated specification is too big
URL: https://github.com/apache/incubator-druid/issues/7597#issuecomment-490081669

I have increased the `druid.indexer.runner.maxZnodeBytes` value to `2.5MB` and everything worked fine until the Supervisor reached an interval with many segments (the [created JSON](https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/overlord/RemoteTaskRunner.java#L863) is about `1.5MB`).

From that point on, the `Waiting Tasks - Tasks waiting on locks` section in the Overlord filled up with dozens of MV tasks (for the same data source: `index_materialized_view_test_2019-05-07...`).

The error:

```
ERROR [LeaderSelector[/druid/druid-prod/overlord/_OVERLORD]] org.apache.druid.curator.discovery.CuratorDruidLeaderSelector - listener becomeLeader() failed. Unable to become leader: {class=org.apache.druid.curator.discovery.CuratorDruidLeaderSelector, exceptionType=class org.apache.druid.java.util.common.ISE, exceptionMessage=Could not reacquire lock on interval[2019-02-21T00:00:00.000Z/2019-02-22T00:00:00.000Z] version[2019-05-07T11:24:03.543Z] for task: index_materialized_view_test_2019-05-07T11:16:05.026Z}
org.apache.druid.java.util.common.ISE: Could not reacquire lock on interval[2019-02-21T00:00:00.000Z/2019-02-22T00:00:00.000Z] version[2019-05-07T11:24:03.543Z] for task: index_materialized_view_test_2019-05-07T11:16:05.026Z
	at org.apache.druid.indexing.overlord.TaskLockbox.syncFromStorage(TaskLockbox.java:171) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.druid.indexing.overlord.TaskMaster$1.becomeLeader(TaskMaster.java:109) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.druid.curator.discovery.CuratorDruidLeaderSelector$1.isLeader(CuratorDruidLeaderSelector.java:98) [druid-server-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:665) [curator-recipes-4.0.0.jar:4.0.0]
	at org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:661) [curator-recipes-4.0.0.jar:4.0.0]
	at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93) [curator-framework-4.0.0.jar:4.0.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
```

Because of this error, the Overlord stopped responding. Also, all of those waiting tasks have the same payload. If I don't suspend the Supervisor, the number of waiting tasks keeps growing.
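The size comparison described in the comment above (a serialized task of roughly `1.5MB` against a `2.5MB` limit) can be made concrete with a small sketch. Everything here is hypothetical: the class, the `fitsInZnode` helper, and the constant are illustrative names, not part of Druid, and the conversion assumes 1 MB = 1024 * 1024 bytes. It only shows how one might measure the UTF-8 size of a serialized payload before handing it to ZooKeeper.

```java
import java.nio.charset.StandardCharsets;

final class ZnodeSizeGuardSketch {

    // ASSUMPTION: 2.5 MB interpreted as 2.5 * 1024 * 1024 bytes (2,621,440).
    static final long MAX_ZNODE_BYTES = (long) (2.5 * 1024 * 1024);

    // Returns true when the UTF-8 encoding of the payload is within the byte limit.
    // The check uses encoded bytes, not String.length(), because non-ASCII
    // characters take more than one byte in UTF-8.
    static boolean fitsInZnode(String taskJson, long maxBytes) {
        return taskJson.getBytes(StandardCharsets.UTF_8).length <= maxBytes;
    }

    public static void main(String[] args) {
        String smallPayload = "{\"type\":\"example\"}"; // placeholder, not a real task spec
        System.out.println("limit in bytes: " + MAX_ZNODE_BYTES);
        System.out.println("fits: " + fitsInZnode(smallPayload, MAX_ZNODE_BYTES));
    }
}
```

A check like this would only report that a payload is too large; it does not explain why the Supervisor keeps submitting tasks with the same payload, which is the behaviour reported above.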