Can you share your entire log file? Why did your operator get killed the
first time? The error mentioned above seems to occur at recovery time.

-Priyanka

On Tue, Jun 6, 2017 at 12:44 AM, Guilherme Hott <[email protected]>
wrote:

> Hi, I have this error and I don't know why it's happening. The operator
> that is failing processes a tuple, does a dedup check, saves it into
> HBase if it's new or an update, and emits it to the stream. Because of
> this failure, only a few tuples are processed.
>
> 2017-06-04 06:43:45,265 [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #6] INFO  impl.ContainerManagementProtocolProxy newProxy - Opening proxy : localhost:8052
>
> 2017-06-04 06:43:47,573 [IPC Server handler 0 on 38848] INFO  stram.StreamingContainerParent log - child msg: [container_1496564390452_0002_01_000010] Entering heartbeat loop.. context: PTContainer[id=1(container_1496564390452_0002_01_000010),state=ALLOCATED,operators=[PTOperator[id=6,name=ConsoleNew], PTOperator[id=7,name=ConsoleBad], PTOperator[id=1,name=cloutApiBanksInput], PTOperator[id=5,name=banksDeduplicator], PTOperator[id=10,name=ConsoleNewJDBC], PTOperator[id=11,name=ConsoleErrorJDBC], PTOperator[id=9,name=cloutApiBanksOutput], PTOperator[id=8,name=banksOpLoadObject], PTOperator[id=3,name=Deduper], PTOperator[id=4,name=cloutApiBanksInput.outputPort#unifier], PTOperator[id=2,name=cloutApiBanksInput]]]
>
> 2017-06-04 06:43:48,587 [IPC Server handler 1 on 38848] INFO  stram.StreamingContainerManager processHeartbeat - Container container_1496564390452_0002_01_000010 buffer server: datatorrent-sandbox:35304
>
> 2017-06-04 06:43:56,262 [IPC Server handler 16 on 38848] INFO  stram.StreamingContainerParent log - child msg: Stopped running due to an exception. java.lang.NullPointerException
>     at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
>     at org.apache.apex.malhar.lib.wal.FSWindowDataManager.retrieve(FSWindowDataManager.java:487)
>     at org.apache.apex.malhar.lib.wal.FSWindowDataManager.retrieve(FSWindowDataManager.java:448)
>     at com.datatorrent.lib.db.jdbc.AbstractJdbcPollInputOperator.replay(AbstractJdbcPollInputOperator.java:316)
>     at com.datatorrent.lib.db.jdbc.AbstractJdbcPollInputOperator.beginWindow(AbstractJdbcPollInputOperator.java:203)
>     at com.datatorrent.stram.engine.InputNode.run(InputNode.java:122)
>     at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1441)
>  context: PTContainer[id=1(container_1496564390452_0002_01_000010),state=ACTIVE,operators=[PTOperator[id=6,name=ConsoleNew], PTOperator[id=7,name=ConsoleBad], PTOperator[id=1,name=cloutApiBanksInput], PTOperator[id=5,name=banksDeduplicator], PTOperator[id=10,name=ConsoleNewJDBC], PTOperator[id=11,name=ConsoleErrorJDBC], PTOperator[id=9,name=cloutApiBanksOutput], PTOperator[id=8,name=banksOpLoadObject], PTOperator[id=3,name=Deduper], PTOperator[id=4,name=cloutApiBanksInput.outputPort#unifier], PTOperator[id=2,name=cloutApiBanksInput]]]
>
> 2017-06-04 06:43:56,683 [IPC Server handler 5 on 38848] WARN  stram.StreamingContainerManager processOperatorFailure - Operator failure: PTOperator[id=2,name=cloutApiBanksInput] count: 6
>
> 2017-06-04 06:43:56,683 [IPC Server handler 5 on 38848] ERROR stram.StreamingContainerManager processOperatorFailure - Initiating container restart after operator failure PTOperator[id=2,name=cloutApiBanksInput]
>
> 2017-06-04 06:43:57,292 [main] INFO  stram.StreamingAppMasterService sendContainerAskToRM - Requested stop container container_1496564390452_0002_01_000010
>
> 2017-06-04 06:43:57,292 [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #7] INFO  impl.NMClientAsyncImpl run - Processing Event EventType: STOP_CONTAINER for Container container_1496564390452_0002_01_000010
>
> 2017-06-04 06:43:57,294 [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #7] INFO  impl.ContainerManagementProtocolProxy newProxy - Opening proxy : localhost:8052
>
> 2017-06-04 06:43:59,301 [main] INFO  stram.StreamingAppMasterService execute - Completed containerId=container_1496564390452_0002_01_000010, state=COMPLETE, exitStatus=-105, diagnostics=Container killed by the ApplicationMaster.
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
>
> 2017-06-04 06:43:59,301 [main] INFO  stram.StreamingContainerManager scheduleContainerRestart - Initiating recovery for container_1496564390452_0002_01_000010@localhost:8052
>
> 2017-06-04 06:43:59,302 [main] INFO  stram.StreamingContainerManager scheduleContainerRestart - Affected operators [PTOperator[id=6,name=ConsoleNew], PTOperator[id=7,name=ConsoleBad], PTOperator[id=1,name=cloutApiBanksInput], PTOperator[id=4,name=cloutApiBanksInput.outputPort#unifier], PTOperator[id=3,name=Deduper], PTOperator[id=5,name=banksDeduplicator], PTOperator[id=8,name=banksOpLoadObject], PTOperator[id=9,name=cloutApiBanksOutput], PTOperator[id=11,name=ConsoleErrorJDBC], PTOperator[id=10,name=ConsoleNewJDBC], PTOperator[id=2,name=cloutApiBanksInput]]
>
> 2017-06-04 06:44:00,334 [main] INFO  stram.ResourceRequestHandler getHost - Strict anti-affinity = [] for container with operators PTOperator[id=6,name=ConsoleNew],PTOperator[id=7,name=ConsoleBad],PTOperator[id=1,name=cloutApiBanksInput],PTOperator[id=5,name=banksDeduplicator],PTOperator[id=10,name=ConsoleNewJDBC],PTOperator[id=11,name=ConsoleErrorJDBC],PTOperator[id=9,name=cloutApiBanksOutput],PTOperator[id=8,name=banksOpLoadObject],PTOperator[id=3,name=Deduper],PTOperator[id=4,name=cloutApiBanksInput.outputPort#unifier],PTOperator[id=2,name=cloutApiBanksInput]
>
> 2017-06-04 06:44:00,334 [main] INFO  stram.ResourceRequestHandler getHost - Found host null
>
> 2017-06-04 06:44:01,341 [main] INFO  stram.StreamingAppMasterService execute - Got new container., containerId=container_1496564390452_0002_01_000011, containerNode=localhost:8052, containerNodeURI=localhost:8042, containerResourceMemory6144, priority9
>
> 2017-06-04 06:44:01,341 [main] INFO  stram.StreamingContainerManager assignContainer - Removing container agent container_1496564390452_0002_01_000010
>
> 2017-06-04 06:44:01,342 [main] INFO  stram.LaunchContainerRunnable run - Setting up container launch context for containerid=container_1496564390452_0002_01_000011
>
> --
> *Guilherme Hott*
> *Software Engineer*
> Skype: guilhermehott
> @guilhermehott
> https://www.linkedin.com/in/guilhermehott
>
>
