Re: Problem reading from KafkaConsoleProducer topic using Apex..

Sushil Apex Fri, 05 May 2017 05:12:45 -0700

Thanks @vikram, the application is now running on actual YARN.

The container log has nothing.


Grepping for the application id in the node manager logs gives the following:

2017-05-05 17:26:49,822 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
Creating a new application reference for app application_1493983487923_0001
2017-05-05 17:26:49,840 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Application application_1493983487923_0001 transitioned from NEW to INITING
2017-05-05 17:26:49,895 INFO
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=cloudera
IP=127.0.0.1    OPERATION=Start Container Request
TARGET=ContainerManageImpl      RESULT=SUCCESS
 APPID=application_1493983487923_0001
 CONTAINERID=container_1493983487923_0001_01_000001
2017-05-05 17:26:52,333 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Adding container_1493983487923_0001_01_000001 to application
application_1493983487923_0001
2017-05-05 17:26:52,339 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Application application_1493983487923_0001 transitioned from INITING to
RUNNING
2017-05-05 17:26:52,360 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got
event CONTAINER_INIT for appId application_1493983487923_0001
2017-05-05 17:26:52,471 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying
from /yarn/nm/nmPrivate/container_1493983487923_0001_01_000001.tokens to
/yarn/nm/usercache/cloudera/appcache/application_1493983487923_0001/container_1493983487923_0001_01_000001.tokens
2017-05-05 17:26:52,471 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Localizer CWD set to
/yarn/nm/usercache/cloudera/appcache/application_1493983487923_0001 =
file:/yarn/nm/usercache/cloudera/appcache/application_1493983487923_0001
2017-05-05 17:26:57,697 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
launchContainer: [bash,
/yarn/nm/usercache/cloudera/appcache/application_1493983487923_0001/container_1493983487923_0001_01_000001/default_container_executor.sh]
2017-05-05 17:27:22,592 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Adding container_1493983487923_0001_01_000002 to application
application_1493983487923_0001
2017-05-05 17:27:22,592 INFO
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=cloudera
IP=127.0.0.1    OPERATION=Start Container Request
TARGET=ContainerManageImpl      RESULT=SUCCESS
 APPID=application_1493983487923_0001
 CONTAINERID=container_1493983487923_0001_01_000002
2017-05-05 17:27:22,597 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got
event CONTAINER_INIT for appId application_1493983487923_0001
2017-05-05 17:27:22,743 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
launchContainer: [bash,
/yarn/nm/usercache/cloudera/appcache/application_1493983487923_0001/container_1493983487923_0001_01_000002/default_container_executor.sh]



ResourceManager logs:

2017-05-05 17:26:45,749 INFO
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=cloudera
IP=127.0.0.1    OPERATION=Submit Application Request
 TARGET=ClientRMService  RESULT=SUCCESS
 APPID=application_1493983487923_0001
2017-05-05 17:26:45,749 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing
application with id application_1493983487923_0001
2017-05-05 17:26:46,290 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1493983487923_0001 State change from NEW to NEW_SAVING
2017-05-05 17:26:46,290 INFO
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore:
Storing info for app: application_1493983487923_0001
2017-05-05 17:26:46,565 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1493983487923_0001 State change from NEW_SAVING to SUBMITTED
2017-05-05 17:26:46,603 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Accepted application application_1493983487923_0001 from user: cloudera, in
queue: default, currently num of applications: 1
2017-05-05 17:26:46,609 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1493983487923_0001 State change from SUBMITTED to ACCEPTED
2017-05-05 17:26:47,569 INFO
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=cloudera
OPERATION=AM Allocated Container        TARGET=SchedulerApp
RESULT=SUCCESS  APPID=application_1493983487923_0001
 CONTAINERID=container_1493983487923_0001_01_000001
2017-05-05 17:26:47,789 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
Storing attempt: AppId: application_1493983487923_0001 AttemptId:
appattempt_1493983487923_0001_000001 MasterContainer: Container:
[ContainerId: container_1493983487923_0001_01_000001, NodeId:
quickstart.cloudera:8041, NodeHttpAddress: quickstart.cloudera:8042,
Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind:
ContainerToken, service: 127.0.0.1:8041 }, ]
2017-05-05 17:26:48,253 INFO
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
Command to launch container container_1493983487923_0001_01_000001 :
${JAVA_HOME}/bin/java -Djava.io.tmpdir=$PWD/tmp -Xmx768m
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dt-heap-1.bin
-Dhadoop.root.logger=INFO,RFA -Dhadoop.log.dir=<LOG_DIR>
-Ddt.attr.APPLICATION_PATH=hdfs://quickstart.cloudera:8020/user/cloudera/datatorrent/apps/application_1493983487923_0001
com.datatorrent.stram.StreamingAppMaster 1><LOG_DIR>/AppMaster.stdout
2><LOG_DIR>/AppMaster.stderr
2017-05-05 17:27:18,526 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1493983487923_0001 State change from ACCEPTED to RUNNING
2017-05-05 17:27:18,526 INFO
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=cloudera
IP=127.0.0.1    OPERATION=Register App Master
TARGET=ApplicationMasterService RESULT=SUCCESS
 APPID=application_1493983487923_0001
 APPATTEMPTID=appattempt_1493983487923_0001_000001
2017-05-05 17:27:20,197 INFO
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=cloudera
OPERATION=AM Allocated Container        TARGET=SchedulerApp
RESULT=SUCCESS  APPID=application_1493983487923_0001
 CONTAINERID=container_1493983487923_0001_01_000002
2017-05-05 17:27:20,203 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt:
Making reservation: node=quickstart.cloudera
app_id=application_1493983487923_0001
2017-05-05 17:27:20,205 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode:
Reserved container container_1493983487923_0001_01_000003 on node host:
quickstart.cloudera:8041 #containers=2 available=1024 used=2048 for
application application_1493983487923_0001
2017-05-05 17:27:52,911 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt:
Application application_1493983487923_0001 unreserved  on node host:
quickstart.cloudera:8041 #containers=2 available=1024 used=2048, currently
has 0 at priority 1; currentReservation <memory:0, vCores:0>
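
For anyone reading the scheduler lines above later, here is a minimal sketch of how the memory fields can be pulled out of them. It assumes the `available=` and `used=` values are in MB, as they appear to be in the FSSchedulerNode lines. The class `SchedulerLineMemory` and its method names are hypothetical, written only for illustration; they are not part of Apex or YARN.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SchedulerLineMemory {
    // Field patterns as they appear in the scheduler log lines above.
    private static final Pattern AVAILABLE = Pattern.compile("available=(\\d+)");
    private static final Pattern USED = Pattern.compile("used=(\\d+)");

    // Returns the first integer captured by the pattern; throws if the field is absent.
    public static int field(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("field not found: " + pattern.pattern());
        }
        return Integer.parseInt(m.group(1));
    }

    public static int availableMb(String line) {
        return field(AVAILABLE, line);
    }

    public static int usedMb(String line) {
        return field(USED, line);
    }

    // Memory the node offers to YARN, assuming available + used covers all of it.
    public static int totalMb(String line) {
        return availableMb(line) + usedMb(line);
    }

    // True if a request of the given size fits in what the node reports as free.
    public static boolean fits(String line, int requestMb) {
        return requestMb <= availableMb(line);
    }

    public static void main(String[] args) {
        String line = "Reserved container container_1493983487923_0001_01_000003 on node host: "
                + "quickstart.cloudera:8041 #containers=2 available=1024 used=2048 for "
                + "application application_1493983487923_0001";
        System.out.println("total MB: " + totalMb(line));
        System.out.println("2048 MB request fits: " + fits(line, 2048));
    }
}
```

For the reservation line above this gives 1024 MB free out of 3072 MB. A reservation usually means the request did not fit in what the node had free, so container 3 may have asked for more than 1024 MB; that is an inference from the log, not something the log states.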




On Fri, May 5, 2017 at 5:11 PM, Vikram Patil <[email protected]> wrote:

> Actually you are launching app in local mode.
>
> Instead of
> launch -local /home/cloudera/myapexapp-1.0-SNAPSHOT.apa
>
> launch as:
> launch  /home/cloudera/myapexapp-1.0-SNAPSHOT.apa
>
> Then it should launch an application using Yarn.
>
> Thanks & Regards,
> Vikram
>
> On Fri, May 5, 2017 at 5:04 PM, Sushil Apex <[email protected]>
> wrote:
>
>> @tushar
>>
>> I used apex-cli in one terminal and launched my application with the
>> command below. In another terminal I checked for the application id, but
>> it reports zero active applications. I must be missing something here.
>>
>>
>>
>> apex> launch -local /home/cloudera/myapexapp-1.0-SNAPSHOT.apa
>>   1. Kafka2HDFS2
>>   2. MyFirstApplication
>> Choose application: 1
>>
>>
>>
>> Apex CLI 3.5.0 06.12.2016 @ 22:11:51 PST rev: 6de8828 branch:
>> 6de8828e4f3d5734d0a6f9c1be0aa7057cb60ac8
>> apex> list-apps *
>> {"apps": []}
>> 0 active, total 0 applications.
>>
>>
>> @chaitanya
>> hdfs and yarn are running, but the container folder is blank
>>
>> [cloudera@quickstart bin]$ cd /var/log/hadoop-yarn/container
>> [cloudera@quickstart container]$ ls
>> [cloudera@quickstart container]$
>>
>>
>>
>> On Fri, May 5, 2017 at 4:41 PM, Tushar Gosavi <[email protected]>
>> wrote:
>>
>>> If log aggregation is enabled, you can get the logs using the yarn logs
>>> command. You can get the application id using apex-cli.
>>>
>>> - Tushar.
>>>
>>>
>>> On Fri, May 5, 2017 at 4:33 PM, Sushil Apex <[email protected]>
>>> wrote:
>>>
>>>> To add here
>>>> I am using Cloudera VM for hdfs and running Kafka from apache with
>>>> single machine zookeeper
>>>>
>>>> On Fri, May 5, 2017 at 4:32 PM, Sushil Apex <[email protected]>
>>>> wrote:
>>>>
>>>>> @vikram, I am using the Apex CLI.
>>>>> I am not sure where its logs are generated. I checked /var/log/hadoop-yarn
>>>>> but did not find any updated logs for Apex.
>>>>>
>>>>> I have changed the application name, but that didn't help.
>>>>>
>>>>> On Fri, May 5, 2017 at 4:03 PM, vikram patil <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Can you share your Apex logs?  Have you launched an application using
>>>>>> apex cli?
>>>>>>
>>>>>> Also, can you try once by changing name of your application and
>>>>>> relaunch again. Please kill or shut down your existing running
>>>>>> application before this.
>>>>>>
>>>>>> -Vikram
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, May 5, 2017 at 3:56 PM, Sushil Apex <[email protected]>
>>>>>> wrote:
>>>>>> > Tried with EARLIEST option also, no luck :(
>>>>>> >
>>>>>> > On Fri, May 5, 2017 at 3:49 PM, Sushil Apex <[email protected]>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> @Vikram the folder /tmp/FromKafka is already created.
>>>>>> >> @Chaitanya yes, I am pushing messages to the Kafka topic using consoleProducer:
>>>>>> >>  bin/kafka-console-producer.sh --broker-list localhost:9092 --topic kafka2hdfs
>>>>>> >>
>>>>>> >> @Ajay
>>>>>> >> will try with the initial offset to EARLIEST.
>>>>>> >>
>>>>>> >> Thank you
>>>>>> >> Sushil
>>>>>> >>
>>>>>> >> On Fri, May 5, 2017 at 2:54 PM, AJAY GUPTA <[email protected]>
>>>>>> wrote:
>>>>>> >>>
>>>>>> >>> Also, you may be required to specify the initial offset to EARLIEST.
>>>>>> >>>
>>>>>> >>> Ajay
>>>>>> >>>
>>>>>> >>> On Fri, May 5, 2017 at 2:35 PM, vikram patil <
>>>>>> [email protected]>
>>>>>> >>> wrote:
>>>>>> >>>>
>>>>>> >>>> Hi Sushil,
>>>>>> >>>>
>>>>>> >>>> Have you provided configuration specifying the HDFS directory and
>>>>>> >>>> file for the application?
>>>>>> >>>> You may have to create /tmp/fromKafka directory in hdfs.
>>>>>> >>>> Thanks & Regards,
>>>>>> >>>> Vikram
>>>>>> >>>>
>>>>>> >>>> On Fri, May 5, 2017 at 2:30 PM, Sushil Apex <
>>>>>> [email protected]>
>>>>>> >>>> wrote:
>>>>>> >>>> >
>>>>>> >>>> > I am using Apex 3.5.0 and kafka 0.9
>>>>>> >>>> > malhar library used is 3.6.0
>>>>>> >>>> >
>>>>>> >>>> > I am following the example from
>>>>>> >>>> > https://github.com/DataTorrent/examples/blob/master/tutorials/kafka/src/main/java/com/example/myapexapp/KafkaApp.java
>>>>>> >>>> >
>>>>>> >>>> > I am putting messages in the Kafka topic using the console producer
>>>>>> >>>> > provided by Kafka, but I am not able to read these messages in the
>>>>>> >>>> > Apex DAG (created based on the link above).
>>>>>> >>>> >
>>>>>> >>>> > I am running the apex apa file through apex-cli
>>>>>> >>>> >
>>>>>> >>>> > Apex CLI 3.5.0 06.12.2016 @ 22:11:51 PST rev: 6de8828 branch:
>>>>>> >>>> > 6de8828e4f3d5734d0a6f9c1be0aa7057cb60ac8
>>>>>> >>>> > apex> launch --local /home/cloudera/myapexapp-1.0-SNAPSHOT.apa
>>>>>> >>>> >   1. Kafka2HDFS
>>>>>> >>>> >   2. MyFirstApplication
>>>>>> >>>> > Choose application: 1
>>>>>> >>>> >
>>>>>> >>>> > and nothing happens after this, I can see the messages put in
>>>>>> >>>> > consoleConsumer in kafka logs
>>>>>> >>>> >
>>>>>> >>>> > Properties used are
>>>>>> >>>> >
>>>>>> >>>> > <property>
>>>>>> >>>> >   <name>dt.operator.kafkaIn.prop.topics</name>
>>>>>> >>>> >   <value>kafka2hdfs</value>
>>>>>> >>>> > </property>
>>>>>> >>>> >
>>>>>> >>>> > <property>
>>>>>> >>>> >   <name>dt.operator.kafkaIn.prop.consumer.zookeeper</name>
>>>>>> >>>> >   <value>localhost:2181</value>
>>>>>> >>>> > </property>
>>>>>> >>>> > <property>
>>>>>> >>>> >   <name>dt.operator.kafkaIn.prop.clusters</name>
>>>>>> >>>> >   <value>localhost:9092</value>
>>>>>> >>>> > </property>
>>>>>> >>>> > <property>
>>>>>> >>>> >   <name>dt.operator.kafkaIn.prop.initialPartitionCount</name>
>>>>>> >>>> >   <value>1</value>
>>>>>> >>>> > </property>
>>>>>> >>>> >
>>>>>> >>>> >
>>>>>> >>>> > Application Code
>>>>>> >>>> >
>>>>>> >>>> > KafkaSinglePortByteArrayInputOperator in
>>>>>> >>>> >                 = dag.addOperator("kafkaIn", new
>>>>>> >>>> > KafkaSinglePortByteArrayInputOperator());
>>>>>> >>>> >
>>>>>> >>>> >         LineOutputOperator out = dag.addOperator("fileOut", new
>>>>>> >>>> > LineOutputOperator());
>>>>>> >>>> >
>>>>>> >>>> >         dag.addStream("dataf", in.outputPort, out.input);
>>>>>> >>>> >
>>>>>> >>>> >
>>>>>> >>>> > I am not able to understand what config I am missing here?
>>>>>> >>>
>>>>>> >>>
>>>>>> >>
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
