Jacob, please ignore my previous message. It looks like this is caused by a 
configuration issue on our end.

Thanks for your input in resolving our issues.


Jeremiah Adams
Software Engineer
www.helixeducation.com
Blog | Twitter | Facebook | LinkedIn

________________________________________
From: Jeremiah Adams
Sent: Wednesday, August 16, 2017 12:27 PM
To: dev@samza.apache.org
Subject: Re: Issue with TopicExistsException in 0.13.0

Jacob,

We are getting closer. The job now attempts to run before falling over. The 
logs show it going through VerifiableProperties, connecting to the brokers, 
and then quietly restarting. We see it restart several times before giving up. 
Here are the logs from around that point. You can see "Connected" followed by 
"Disconnecting".

2017-08-16 18:18:39 VerifiableProperties [INFO] Verifying properties
2017-08-16 18:18:39 VerifiableProperties [INFO] Property auto.offset.reset is 
overridden to smallest
2017-08-16 18:18:39 VerifiableProperties [INFO] Property client.id is 
overridden to samza_admin-inquiry_submission-1
2017-08-16 18:18:39 VerifiableProperties [INFO] Property group.id is overridden 
to undefined-samza-consumer-group-b07af039-edeb-44f6-8fb5-545dc68c49ca
2017-08-16 18:18:39 VerifiableProperties [INFO] Property socket.timeout.ms is 
overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Property zookeeper.connect is 
overridden to zookeeper-0.devhelix.com:2181
2017-08-16 18:18:39 VerifiableProperties [INFO] Property 
zookeeper.connection.timeout.ms is overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Property 
zookeeper.session.timeout.ms is overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Verifying properties
2017-08-16 18:18:39 VerifiableProperties [INFO] Property auto.offset.reset is 
overridden to smallest
2017-08-16 18:18:39 VerifiableProperties [INFO] Property client.id is 
overridden to samza_admin-inquiry_submission-1
2017-08-16 18:18:39 VerifiableProperties [INFO] Property group.id is overridden 
to undefined-samza-consumer-group-7c5a69e9-58f7-45a9-b5c8-972f8c5ef088
2017-08-16 18:18:39 VerifiableProperties [INFO] Property socket.timeout.ms is 
overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Property zookeeper.connect is 
overridden to zookeeper-0.devhelix.com:2181
2017-08-16 18:18:39 VerifiableProperties [INFO] Property 
zookeeper.connection.timeout.ms is overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Property 
zookeeper.session.timeout.ms is overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Verifying properties
2017-08-16 18:18:39 VerifiableProperties [INFO] Property client.id is 
overridden to samza_admin-inquiry_submission-1
2017-08-16 18:18:39 VerifiableProperties [INFO] Property metadata.broker.list 
is overridden to 10.201.10.222:9092,10.201.9.163:9092
2017-08-16 18:18:39 VerifiableProperties [INFO] Property request.timeout.ms is 
overridden to 60000
2017-08-16 18:18:39 ClientUtils$ [INFO] Fetching metadata from broker 
BrokerEndPoint(1,10.201.9.163,9092) with correlation id 0 for 1 topic(s) 
Set(porter-inquirySubmission)
2017-08-16 18:18:39 SyncProducer [INFO] Connected to 10.201.9.163:9092 for 
producing
2017-08-16 18:18:39 SyncProducer [INFO] Disconnecting from 10.201.9.163:9092
2017-08-16 18:18:39 VerifiableProperties [INFO] Verifying properties
2017-08-16 18:18:39 VerifiableProperties [INFO] Property auto.offset.reset is 
overridden to smallest
2017-08-16 18:18:39 VerifiableProperties [INFO] Property client.id is 
overridden to samza_consumer-inquiry_submission-1
2017-08-16 18:18:39 VerifiableProperties [INFO] Property group.id is overridden 
to undefined-samza-consumer-group-0787ba05-ff14-43ca-a2a5-5873fcc04068
2017-08-16 18:18:39 VerifiableProperties [INFO] Property socket.timeout.ms is 
overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Property zookeeper.connect is 
overridden to zookeeper-0.devhelix.com:2181
2017-08-16 18:18:39 VerifiableProperties [INFO] Property 
zookeeper.connection.timeout.ms is overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Property 
zookeeper.session.timeout.ms is overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Verifying properties
2017-08-16 18:18:39 VerifiableProperties [INFO] Property auto.offset.reset is 
overridden to smallest
2017-08-16 18:18:39 VerifiableProperties [INFO] Property client.id is 
overridden to samza_admin-inquiry_submission-1
2017-08-16 18:18:39 VerifiableProperties [INFO] Property group.id is overridden 
to undefined-samza-consumer-group-a4e57ce6-b2a8-42ae-ad2f-5d4a79212add
2017-08-16 18:18:39 VerifiableProperties [INFO] Property socket.timeout.ms is 
overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Property zookeeper.connect is 
overridden to zookeeper-0.devhelix.com:2181
2017-08-16 18:18:39 VerifiableProperties [INFO] Property 
zookeeper.connection.timeout.ms is overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Property 
zookeeper.session.timeout.ms is overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Verifying properties
2017-08-16 18:18:39 VerifiableProperties [INFO] Property auto.offset.reset is 
overridden to smallest
2017-08-16 18:18:39 VerifiableProperties [INFO] Property client.id is 
overridden to samza_checkpoint_manager-inquiry_submission-1
2017-08-16 18:18:39 VerifiableProperties [INFO] Property group.id is overridden 
to undefined-samza-consumer-group-6acd09e8-0979-4835-a76b-ab045b641536
2017-08-16 18:18:39 VerifiableProperties [INFO] Property socket.timeout.ms is 
overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Property zookeeper.connect is 
overridden to zookeeper-0.devhelix.com:2181
2017-08-16 18:18:39 VerifiableProperties [INFO] Property 
zookeeper.connection.timeout.ms is overridden to 60000
2017-08-16 18:18:39 VerifiableProperties [INFO] Property 
zookeeper.session.timeout.ms is overridden to 60000
2017-08-16 18:18:40 ZkEventThread [INFO] Starting ZkClient event thread.
2017-08-16 18:18:40 ZkClient [INFO] Waiting for keeper state SyncConnected
2017-08-16 18:18:40 ZkClient [INFO] zookeeper state changed (SyncConnected)
2017-08-16 18:18:40 ZkEventThread [INFO] Terminate ZkClient event thread.
2017-08-16 18:18:40 VerifiableProperties [INFO] Verifying properties
2017-08-16 18:18:40 VerifiableProperties [INFO] Property client.id is 
overridden to samza_checkpoint_manager-inquiry_submission-1
2017-08-16 18:18:40 VerifiableProperties [INFO] Property metadata.broker.list 
is overridden to 10.201.10.222:9092,10.201.9.163:9092
2017-08-16 18:18:40 VerifiableProperties [INFO] Property request.timeout.ms is 
overridden to 60000
2017-08-16 18:18:40 ClientUtils$ [INFO] Fetching metadata from broker 
BrokerEndPoint(0,10.201.10.222,9092) with correlation id 0 for 1 topic(s) 
Set(__samza_checkpoint_ver_1_for_inquiry-submission_1)
2017-08-16 18:18:40 SyncProducer [INFO] Connected to 10.201.10.222:9092 for 
producing
2017-08-16 18:18:40 SyncProducer [INFO] Disconnecting from 10.201.10.222:9092


Jeremiah Adams
Software Engineer
www.helixeducation.com

________________________________________
From: Jacob Maes <jacob.m...@gmail.com>
Sent: Tuesday, August 15, 2017 9:23 AM
To: dev@samza.apache.org
Subject: Re: Issue with TopicExistsException in 0.13.0

Hey Jeremiah,

That error would suggest that the version of samza-yarn is older than
0.13.0. The run-am.sh script was renamed to run-jc.sh here:
https://url.serverdata.net/?aZyQRg2CGut2qgyHrdHxA3r2wRZBhFBnHgQFe8bv7-emQZ8AG5EBciytL6M4q2Xswnnv80L-_LuC33mCD7RpCtuZlacb_ur_gIW7F6MA2IKayYdDY3_6lnmDg8jmr15B_nux56EaUHnS1v2G6dEbkZLpanQVaMLTs0TSK0aPRFk0~

Is it possible some samza jars are getting cached during deployment?

-Jake
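
One way to check for a stale package is to look inside the deployed job tarball. The following is a minimal sketch, not part of Samza: it assumes the job package is a tar archive with top-level bin/ and lib/ directories, and the function name inspect_package is hypothetical. It reports which of the two launcher scripts named above is present and which samza-yarn jars are bundled.

```python
import sys
import tarfile

# Launcher script names taken from this thread: run-am.sh was renamed
# to run-jc.sh for 0.13.0.
LAUNCHERS = ("run-am.sh", "run-jc.sh")


def inspect_package(package_path):
    """Return (launcher scripts under bin/, samza-yarn jars under lib/).

    Assumes a tar archive (optionally compressed) with bin/ and lib/
    at the top level; that layout is an assumption, not a guarantee.
    """
    launchers, yarn_jars = set(), set()
    with tarfile.open(package_path, "r:*") as tar:
        for name in tar.getnames():
            # Normalise entries such as "./bin/run-jc.sh".
            parts = [p for p in name.split("/") if p not in ("", ".")]
            if len(parts) != 2:
                continue
            folder, filename = parts
            if folder == "bin" and filename in LAUNCHERS:
                launchers.add(filename)
            elif (folder == "lib" and filename.startswith("samza-yarn")
                  and filename.endswith(".jar")):
                yarn_jars.add(filename)
    return sorted(launchers), sorted(yarn_jars)


if __name__ == "__main__":
    # Usage: python inspect_package.py path/to/job-package.tar.gz
    for path in sys.argv[1:]:
        print(path, inspect_package(path))
```

If the package contains run-jc.sh while the container still asks for run-am.sh, that would be consistent with an older samza-yarn jar being picked up somewhere in the deployment.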

On Tue, Aug 15, 2017 at 7:31 AM, Jeremiah Adams <jad...@helixeducation.com>
wrote:

> Thanks Jacob, the job is getting a bit further now, but I am seeing a
> different issue.
>
> The job fails and never moves into 'running'. The job looks to be
> launching correctly:
>
> [10.201.11.64] out: 13:48:59.450 [IPC Client (2052489518) connection to
> porter-samza-1.porter.int/127.0.0.1:8032 from centos] DEBUG
> org.apache.hadoop.ipc.Client - IPC Client (2052489518) connection to
> porter-samza-1.porter.int/127.0.0.1:8032 from centos got value #3
> [10.201.11.64] out: 13:48:59.451 [main] DEBUG 
> org.apache.hadoop.ipc.ProtobufRpcEngine
> - Call: getApplicationReport took 2ms
> [10.201.11.64] out: 13:48:59.452 [main] INFO
> org.apache.samza.job.JobRunner - job started successfully - Running
> [10.201.11.64] out: 13:48:59.452 [main] INFO
> org.apache.samza.job.JobRunner - exiting
>
>
> When I dig into the userlogs, the job never moves on from the starting
> container; stderr contains:
>
> [centos@porter-yarn-slave-1 container_1502753192195_0007_02_000001]$ more stderr
> /bin/bash: /tmp/hadoop-centos/nm-local-dir/usercache/centos/appcache/application_1502753192195_0007/container_1502753192195_0007_02_000001/__package/bin/run-am.sh: No such file or directory
>
> When I poke at the directory structure, both appcache/ and filecache/ are
> empty:
>
> [centos@porter-yarn-slave-1 container_1502753192195_0007_02_000001]$ ls
> /tmp/hadoop-centos/nm-local-dir/usercache/centos/appcache/
> [centos@porter-yarn-slave-1 container_1502753192195_0007_02_000001]$
>
>
>
>
> Jeremiah Adams
> Software Engineer
> https://url.serverdata.net/?ahfhEufaAWbezBrUFPG98ZJcterGfIerU3ZwsA3Gv_C0~
>
> ________________________________________
> From: Jacob Maes <jacob.m...@gmail.com>
> Sent: Monday, August 14, 2017 3:12 PM
> To: dev@samza.apache.org
> Subject: Re: Issue with TopicExistsException in 0.13.0
>
> Correction: the exception seems to have moved between kafka versions
> 0.10.0.1 and 0.10.1.1.
>
> Here's the patch that changed both the kafka version and the import
> statement for TopicExistsException:
> https://url.serverdata.net/?aZyQRg2CGut2qgyHrdHxA3r2wRZBhF
> BnHgQFe8bv7-emnODgdhciwPkVKB_BE-ZnZmhwA18Q7rimVruRFx5g0vsvC9cG
> t2jrAYfAucx0goYepLp8ZyfPAPxCv0Xh9CQVXTrqVMnByrbWTNcczkXashg2
> zljIWFPYiRKbG_5H2BvM~
>
> So, you'll want to be using kafka 0.10.1.1.
>
> On Mon, Aug 14, 2017 at 2:00 PM, Jacob Maes <jacob.m...@gmail.com> wrote:
>
> > Hey Jeremiah,
> >
> > It looks like the TopicExistsException should be handled by the system
> > admin and not rethrown:
> > https://url.serverdata.net/?aZyQRg2CGut2qgyHrdHxA3r2wRZBhFBnHgQFe8bv7-
> > eli7bCaPPi9BUx7SPWnrBZJsWvG7fAvAkJZWsHy8YrwNKbg0eJOFg9N9UDBA
> > B2ODwZOGu2TuRvoZ9NyWbJmDt_g
> > b84b20ffd2/samza-kafka/src/main/scala/org/apache/samza/
> > system/kafka/KafkaSystemAdmin.scala#L442
> >
> > I have a theory about what's happening here. I think the
> > TopicExistsException was moved from the kafka.common package in kafka 0.8.2
> > https://url.serverdata.net/?aGYQUT2PfoZ_Oed64B3A9noxqDhLnbYFqBHw3jimnO
> > 5vi3F8i7RsxdGks87OLmlvVSbRBbvJOT8rWW0hz_3vOmg~~
> > common/TopicExistsException.html
> >
> > to the org.apache.kafka.common.errors package in kafka 0.10
> > https://url.serverdata.net/?atT2ehXMhI-BK13fx1xs1ts_Kf81VsaPrd-
> > NHf6sUGn2ecNA4kUI3dYoA0607M-H1sV2xtByyu3eJSKvz3Cecre4DPAtt
> > j3Qs9n_BrkW6lDT8Xt-ACWGgEYMDI0JoIyzV
> > TopicExistsException.html
> >
> > And Samza 0.13 expects the latter.
> >
> > Can you double check that your job is actually using kafka 0.10.1.1,
> > perhaps by inspecting the jars?
> >
> > -Jake
> >
> > On Mon, Aug 14, 2017 at 11:55 AM, Jeremiah Adams <
> > jad...@helixeducation.com> wrote:
> >
> >> I am having an issue with topic creation after updating dependencies. I
> >> bumped samza dependencies from scala 2.10 v 0.10.1 to scala 2.11 0.13.0
> >> and org.apache.kafka dependency from kafka_2.10 0.8.1 to kafka_2.11
> >> 0.10.1.1.
> >> I am seeing an error that the topic already exists and the job gets stuck
> >> in a loop with logs like below. The job will not move into 'accepted' state
> >> in yarn and never consumes the topics it should be consuming. The zk, yarn
> >> and kafka nodes are newly deployed. I'm at a loss, any ideas?
> >>
> >>
> >> [10.201.9.105] out: 17:18:49.347 [main] DEBUG org.apache.samza.system.kafka.KafkaSystemAdmin - Exception detail:
> >> [10.201.9.105] out: kafka.common.TopicExistsException: Topic "__samza_coordinator_inquiry-submission_1" already exists.
> >> [10.201.9.105] out: at kafka.admin.AdminUtils$.createOrUpdateTopicPartitionAssignmentPathInZK(AdminUtils.scala:420)
> >> [10.201.9.105] out: at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:404)
> >> [10.201.9.105] out: at org.apache.samza.system.kafka.KafkaSystemAdmin$$anonfun$createStream$1.apply(KafkaSystemAdmin.scala:425)
> >> [10.201.9.105] out: at org.apache.samza.system.kafka.KafkaSystemAdmin$$anonfun$createStream$1.apply(KafkaSystemAdmin.scala:422)
> >> [10.201.9.105] out: at org.apache.samza.util.ExponentialSleepStrategy.run(ExponentialSleepStrategy.scala:82)
> >> [10.201.9.105] out: at org.apache.samza.system.kafka.KafkaSystemAdmin.createStream(KafkaSystemAdmin.scala:421)
> >> [10.201.9.105] out: at org.apache.samza.system.kafka.KafkaSystemAdmin.createCoordinatorStream(KafkaSystemAdmin.scala:336)
> >> [10.201.9.105] out: at org.apache.samza.job.JobRunner.run(JobRunner.scala:88)
> >> [10.201.9.105] out: at org.apache.samza.job.JobRunner$.doOperation(JobRunner.scala:52)
> >> [10.201.9.105] out: at org.apache.samza.job.JobRunner$.main(JobRunner.scala:47)
> >> [10.201.9.105] out: at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> >> [10.201.9.105] out: 17:18:49.347 [main-SendThread(ip-10-201-9-243.us-west-2.compute.internal:2181)] DEBUG org.apache.zookeeper.ClientCnxn - An exception was thrown while closing send thread for session 0x25de16b1f500013 : Unable to read additional data from server sessionid 0x25de16b1f500013, likely server has closed socket
> >> [10.201.9.105] out: 17:18:49.349 [main-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down?
> >>
> >>
> >>
> >> Jeremiah Adams
> >> Software Engineer
> >>
> >
> >
>
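
Jake's suggestion above to inspect the jars can be sketched roughly as follows. This is an illustrative script, not something from the Samza or Kafka projects: the function names and the "kafka*.jar" file pattern are assumptions. It relies only on the two class names that appear in this thread, kafka.common.TopicExistsException (from the stack trace) and org.apache.kafka.common.errors.TopicExistsException (the one Samza 0.13 is said to expect), and reports which Kafka jars in a directory contain each.

```python
import sys
import zipfile
from pathlib import Path

# Class file paths corresponding to the two exception classes named in the thread.
OLD_CLASS = "kafka/common/TopicExistsException.class"
NEW_CLASS = "org/apache/kafka/common/errors/TopicExistsException.class"


def topic_exists_classes(jar_path):
    """Report which TopicExistsException class files a single jar contains."""
    with zipfile.ZipFile(jar_path) as jar:
        names = set(jar.namelist())
    return {"old": OLD_CLASS in names, "new": NEW_CLASS in names}


def scan_lib_dir(lib_dir):
    """Map each jar whose name starts with 'kafka' to the classes it provides.

    The 'kafka*.jar' pattern is an assumption about how the jars are named.
    """
    report = {}
    for jar_path in sorted(Path(lib_dir).glob("kafka*.jar")):
        report[jar_path.name] = topic_exists_classes(jar_path)
    return report


if __name__ == "__main__":
    # Usage: python scan_kafka_jars.py path/to/package/lib
    for lib_dir in sys.argv[1:]:
        for jar_name, classes in scan_lib_dir(lib_dir).items():
            print(jar_name, classes)
```

The jar file names usually carry the version as well, so the listing doubles as a quick check for a leftover kafka_2.10 0.8.x jar. A lib directory in which no jar provides the errors-package class would be consistent with the theory in the thread, though it does not prove it.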
