Then it looks like something is wrong with the Python process. Are you running it
in yarn-cluster mode or yarn-client mode?
Try to add the following line to for yarn-client mode or for yarn-cluster mode

Then try again; this time you will get more log information. I suspect the
Python process fails to start.
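
To test that suspicion outside Zeppelin, one option is to launch the interpreter binary by hand on the node where the driver runs. The sketch below is only an illustration: `python_starts` is a hypothetical helper (not part of Zeppelin or Spark), it needs Python 3 to run, and the path in the usage note is the `pythonExec` value from the driver log quoted below, which may differ on your cluster.

```python
import subprocess


def python_starts(python_exec, timeout=30):
    """Return (ok, output) after asking python_exec to print its version.

    Hypothetical helper for illustration; not a Zeppelin or Spark API.
    """
    try:
        result = subprocess.run(
            [python_exec, "-c", "import sys; print(sys.version)"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A missing binary, a permission problem, or a hang all land here.
        return False, str(exc)
    # A non-zero exit code means the interpreter started but failed.
    return result.returncode == 0, result.stdout.strip()
```

For example, `python_starts("/home/mansop/anaconda2/bin/python")` should return `True` plus a version string if the binary is usable by the user running the driver; a `False` result comes with the error text, which is usually a good starting point.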

On Fri, Oct 4, 2019 at 9:09 AM, Manuel Sopena Ballesteros wrote:

> Sorry for the late response.
> Yes, I have successfully run a few simple Scala snippets using the %spark
> interpreter in Zeppelin.
> What should I do next?
> Manuel
> *From:* Jeff Zhang
> *Sent:* Tuesday, October 1, 2019 5:44 PM
> *To:* users
> *Subject:* Re: thrift.transport.TTransportException
> It looks like you are using PySpark. Could you try starting just the Scala Spark
> interpreter via `%spark`? First, let's figure out whether the problem is related
> to PySpark.
> On Tue, Oct 1, 2019 at 3:29 PM, Manuel Sopena Ballesteros wrote:
> Dear Zeppelin community,
> I would like to ask for advice regarding an error I am getting with Thrift.
> I am getting quite a lot of these errors while running my notebooks:
> org.apache.thrift.transport.TTransportException at
> at org.apache.thrift.transport.TTransport.readAll( at
> org.apache.thrift.protocol.TBinaryProtocol.readAll(
> at
> org.apache.thrift.protocol.TBinaryProtocol.readI32(
> at
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(
> at org.apache.thrift.TServiceClient.receiveBase( at
> org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(
> at
> org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(
> at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreter$
> at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreter$
> at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.callRemoteFunction(
> at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(
> at org.apache.zeppelin.notebook.Paragraph.jobRun( at
> at
> org.apache.zeppelin.scheduler.RemoteScheduler$
> at java.util.concurrent.Executors$
> at at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(
> at
> java.util.concurrent.ThreadPoolExecutor$
> at
> And these are the Spark driver application logs:
> …
> ===============================================================================
> YARN executor launch context:
>   env:
>     CLASSPATH ->
> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>$HADOOP_CONF_DIR<CPS>/usr/hdp/*<CPS>/usr/hdp/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__
> hdfs://gl-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1568954689585_0052
>     SPARK_USER -> mansop
> /usr/hdp/current/spark2-client/python/lib/<CPS>{{PWD}}/<CPS>{{PWD}}/
>   command:
> LD_LIBRARY_PATH="/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:$LD_LIBRARY_PATH"
> \
>       {{JAVA_HOME}}/bin/java \
>       -server \
>       -Xmx1024m \
>       '-XX:+UseNUMA' \
>{{PWD}}/tmp \
>       '-Dspark.history.ui.port=18081' \
><LOG_DIR> \
>       -XX:OnOutOfMemoryError='kill %p' \
>       org.apache.spark.executor.CoarseGrainedExecutorBackend \
>       --driver-url \
>       spark://coarsegrainedschedu...@r640-1-12-mlx.mlx:35602 \
>       --executor-id \
>       <executorId> \
>       --hostname \
>       <hostname> \
>       --cores \
>       1 \
>       --app-id \
>       application_1568954689585_0052 \
>       --user-class-path \
>       file:$PWD/__app__.jar \
>       1><LOG_DIR>/stdout \
>       2><LOG_DIR>/stderr
>   resources:
>     __app__.jar -> resource { scheme: "hdfs" host: "gl-hdp-ctrl01-mlx.mlx"
> port: 8020 file:
> "/user/mansop/.sparkStaging/application_1568954689585_0052/spark-interpreter-"
> } size: 20433040 timestamp: 1569804142906 type: FILE visibility: PRIVATE
>     __spark_conf__ -> resource { scheme: "hdfs" host:
> "gl-hdp-ctrl01-mlx.mlx" port: 8020 file:
> "/user/mansop/.sparkStaging/application_1568954689585_0052/"
> } size: 277725 timestamp: 1569804143239 type: ARCHIVE visibility: PRIVATE
>     sparkr -> resource { scheme: "hdfs" host: "gl-hdp-ctrl01-mlx.mlx"
> port: 8020 file:
> "/user/mansop/.sparkStaging/application_1568954689585_0052/" }
> size: 688255 timestamp: 1569804142991 type: ARCHIVE visibility: PRIVATE
> -> resource { scheme: "hdfs" host:
> "gl-hdp-ctrl01-mlx.mlx" port: 8020 file:
> "/user/mansop/.sparkStaging/application_1568954689585_0052/"
> } size: 1018 timestamp: 1569804142955 type: FILE visibility: PRIVATE
> -> resource { scheme: "hdfs" host: "gl-hdp-ctrl01-mlx.mlx"
> port: 8020 file:
> "/user/mansop/.sparkStaging/application_1568954689585_0052/" }
> size: 550570 timestamp: 1569804143018 type: FILE visibility: PRIVATE
>     __spark_libs__ -> resource { scheme: "hdfs" host:
> "gl-hdp-ctrl01-mlx.mlx" port: 8020 file:
> "/hdp/apps/" } size:
> 280293050 timestamp: 1568938921259 type: ARCHIVE visibility: PUBLIC
> -> resource { scheme: "hdfs" host:
> "gl-hdp-ctrl01-mlx.mlx" port: 8020 file:
> "/user/mansop/.sparkStaging/application_1568954689585_0052/"
> } size: 42437 timestamp: 1569804143043 type: FILE visibility: PRIVATE
>     __hive_libs__ -> resource { scheme: "hdfs" host:
> "gl-hdp-ctrl01-mlx.mlx" port: 8020 file:
> "/hdp/apps/" } size:
> 43807162 timestamp: 1568938925069 type: ARCHIVE visibility: PUBLIC
> ===============================================================================
> INFO [2019-09-30 10:42:37,303] ({main}[newProxyInstance]:133)
> - Connecting to ResourceManager at gl-hdp-ctrl03-mlx.mlx/
> INFO [2019-09-30 10:42:37,324] ({main} Logging.scala[logInfo]:54) -
> Registering the ApplicationMaster
> INFO [2019-09-30 10:42:37,454] ({main}
>[getConfResourceAsInputStream]:2756) - found resource
> resource-types.xml at file:/etc/hadoop/
> INFO [2019-09-30 10:42:37,470] ({main} Logging.scala[logInfo]:54) - Will
> request 2 executor container(s), each with 1 core(s) and 1408 MB memory
> (including 384 MB of overhead)
> INFO [2019-09-30 10:42:37,474] ({dispatcher-event-loop-14}
> Logging.scala[logInfo]:54) - ApplicationMaster registered as
> NettyRpcEndpointRef(spark://yar...@r640-1-12-mlx.mlx:35602)
> INFO [2019-09-30 10:42:37,485] ({main} Logging.scala[logInfo]:54) -
> Submitted 2 unlocalized container requests.
> INFO [2019-09-30 10:42:37,518] ({main} Logging.scala[logInfo]:54) -
> Started progress reporter thread with (heartbeat : 3000, initial allocation
> : 200) intervals
> INFO [2019-09-30 10:42:37,619] ({Reporter} Logging.scala[logInfo]:54) -
> Launching container container_e01_1568954689585_0052_01_000002 on host
> r640-1-12-mlx.mlx for executor with ID 1
> INFO [2019-09-30 10:42:37,621] ({Reporter} Logging.scala[logInfo]:54) -
> Launching container container_e01_1568954689585_0052_01_000003 on host
> r640-1-13-mlx.mlx for executor with ID 2
> INFO [2019-09-30 10:42:37,623] ({Reporter} Logging.scala[logInfo]:54) -
> Received 2 containers from YARN, launching executors on 2 of them.
> INFO [2019-09-30 10:42:39,481] ({dispatcher-event-loop-51}
> Logging.scala[logInfo]:54) - Registered executor
> NettyRpcEndpointRef(spark-client://Executor) ( with ID 1
> INFO [2019-09-30 10:42:39,553] ({dispatcher-event-loop-62}
> Logging.scala[logInfo]:54) - Registering block manager
> r640-1-12-mlx.mlx:33043 with 408.9 MB RAM, BlockManagerId(1,
> r640-1-12-mlx.mlx, 33043, None)
> INFO [2019-09-30 10:42:40,003] ({dispatcher-event-loop-9}
> Logging.scala[logInfo]:54) - Registered executor
> NettyRpcEndpointRef(spark-client://Executor) ( with ID 2
> INFO [2019-09-30 10:42:40,023] ({pool-6-thread-2}
> Logging.scala[logInfo]:54) - SchedulerBackend is ready for scheduling
> beginning after reached minRegisteredResourcesRatio: 0.8
> INFO [2019-09-30 10:42:40,025] ({pool-6-thread-2}
> Logging.scala[logInfo]:54) - YarnClusterScheduler.postStartHook done
> INFO [2019-09-30 10:42:40,072] ({dispatcher-event-loop-11}
> Logging.scala[logInfo]:54) - Registering block manager
> r640-1-13-mlx.mlx:34105 with 408.9 MB RAM, BlockManagerId(2,
> r640-1-13-mlx.mlx, 34105, None)
> INFO [2019-09-30 10:42:41,779] ({pool-6-thread-2}
>[loadShims]:54) - Initializing shims for Spark 2.x
> INFO [2019-09-30 10:42:41,840] ({pool-6-thread-2}
>[createGatewayServer]:44) - Launching GatewayServer at
> INFO [2019-09-30 10:42:41,852] ({pool-6-thread-2}
>[createGatewayServerAndStartScript]:265) -
> pythonExec: /home/mansop/anaconda2/bin/python
> INFO [2019-09-30 10:42:41,862] ({pool-6-thread-2}
>[setupPySparkEnv]:236) - PYTHONPATH:
> /usr/hdp/current/spark2-client/python/lib/
> ERROR [2019-09-30 10:43:09,061] ({SIGTERM handler}
> SignalUtils.scala[apply$mcZ$sp]:43) - RECEIVED SIGNAL TERM
> INFO [2019-09-30 10:43:09,068] ({shutdown-hook-0}
> Logging.scala[logInfo]:54) - Invoking stop() from shutdown hook
> INFO [2019-09-30 10:43:09,082] ({shutdown-hook-0}
>[doStop]:318) - Stopped Spark@505439b3
> {HTTP/1.1,[http/1.1]}{}
> INFO [2019-09-30 10:43:09,085] ({shutdown-hook-0}
> Logging.scala[logInfo]:54) - Stopped Spark web UI at
> http://r640-1-12-mlx.mlx:42446
> INFO [2019-09-30 10:43:09,140] ({dispatcher-event-loop-52}
> Logging.scala[logInfo]:54) - Driver requested a total number of 0
> executor(s).
> INFO [2019-09-30 10:43:09,142] ({shutdown-hook-0}
> Logging.scala[logInfo]:54) - Shutting down all executors
> INFO [2019-09-30 10:43:09,144] ({dispatcher-event-loop-51}
> Logging.scala[logInfo]:54) - Asking each executor to shut down
> INFO [2019-09-30 10:43:09,151] ({shutdown-hook-0}
> Logging.scala[logInfo]:54) - Stopping SchedulerExtensionServices
> (serviceOption=None,
> services=List(),
> started=false)
> ERROR [2019-09-30 10:43:09,155] ({Reporter} Logging.scala[logError]:91) -
> Exception from Reporter thread.
> org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException:
> Application attempt appattempt_1568954689585_0052_000001 doesn't exist in
> ApplicationMasterService cache.
>                at
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(
>                at
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(
>                at
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(
>                at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
>                at org.apache.hadoop.ipc.RPC$
>                at org.apache.hadoop.ipc.Server$
>                at org.apache.hadoop.ipc.Server$
>                at
> Method)
>                at
>                at
>                at
> org.apache.hadoop.ipc.Server$
>                at
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>                at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(
>                at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
>                at
> java.lang.reflect.Constructor.newInstance(
>                at
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(
>                at
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateYarnException(
>                at
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(
>                at
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(
>                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
>                at
> sun.reflect.NativeMethodAccessorImpl.invoke(
>                at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(
>                at java.lang.reflect.Method.invoke(
>                at
>                at
>                at
>                at
>                at
>                at com.sun.proxy.$Proxy21.allocate(Unknown Source)
>                at
> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(
>                at
> org.apache.spark.deploy.yarn.YarnAllocator.allocateResources(YarnAllocator.scala:268)
>                at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$
> Caused by:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException):
> Application attempt appattempt_1568954689585_0052_000001 doesn't exist in
> ApplicationMasterService cache.
>                at
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(
>                at
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(
>                at
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(
>                at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
>                at org.apache.hadoop.ipc.RPC$
>                at org.apache.hadoop.ipc.Server$
>                at org.apache.hadoop.ipc.Server$
>                at
> Method)
>                at
>                at
>                at
> org.apache.hadoop.ipc.Server$
>                at
> org.apache.hadoop.ipc.Client.getRpcResponse(
>                at
>                at
>                at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>                at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>                at com.sun.proxy.$Proxy20.allocate(Unknown Source)
>                at
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(
>                ... 13 more
> INFO [2019-09-30 10:43:09,164] ({Reporter} Logging.scala[logInfo]:54) -
> Final app status: FAILED, exitCode: 12, (reason: Application attempt
> appattempt_1568954689585_0052_000001 doesn't exist in
> ApplicationMasterService cache.
>                at
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(
>                at
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(
>                at
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(
>                at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
>                at org.apache.hadoop.ipc.RPC$
>                at org.apache.hadoop.ipc.Server$
>                at org.apache.hadoop.ipc.Server$
>                at
> Method)
>                at
>                at
>                at
> org.apache.hadoop.ipc.Server$
> )
> INFO [2019-09-30 10:43:09,166] ({dispatcher-event-loop-54}
> Logging.scala[logInfo]:54) - MapOutputTrackerMasterEndpoint stopped!
> INFO [2019-09-30 10:43:09,236] ({shutdown-hook-0}
> Logging.scala[logInfo]:54) - MemoryStore cleared
> INFO [2019-09-30 10:43:09,237] ({shutdown-hook-0}
> Logging.scala[logInfo]:54) - BlockManager stopped
> INFO [2019-09-30 10:43:09,237] ({shutdown-hook-0}
> Logging.scala[logInfo]:54) - BlockManagerMaster stopped
> INFO [2019-09-30 10:43:09,241] ({dispatcher-event-loop-73}
> Logging.scala[logInfo]:54) - OutputCommitCoordinator stopped!
> INFO [2019-09-30 10:43:09,252] ({shutdown-hook-0}
> Logging.scala[logInfo]:54) - Successfully stopped SparkContext
> INFO [2019-09-30 10:43:09,253] ({shutdown-hook-0}
> Logging.scala[logInfo]:54) - Shutdown hook called
> INFO [2019-09-30 10:43:09,254] ({shutdown-hook-0}
> Logging.scala[logInfo]:54) - Deleting directory
> /d1/hadoop/yarn/local/usercache/mansop/appcache/application_1568954689585_0052/spark-ba80cda3-812a-4cf0-b1f6-6e9eb52952b2
> INFO [2019-09-30 10:43:09,254] ({shutdown-hook-0}
> Logging.scala[logInfo]:54) - Deleting directory
> /d0/hadoop/yarn/local/usercache/mansop/appcache/application_1568954689585_0052/spark-43078781-8f1c-4cd6-a8da-e81b32892cf8
> INFO [2019-09-30 10:43:09,255] ({shutdown-hook-0}
> Logging.scala[logInfo]:54) - Deleting directory
> /d0/hadoop/yarn/local/usercache/mansop/appcache/application_1568954689585_0052/spark-43078781-8f1c-4cd6-a8da-e81b32892cf8/pyspark-9138f7ad-3f15-42c6-9bf3-e3e72d5d4086
> How can I continue troubleshooting in order to find out what this error
> means?
> Thank you very much
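
One low-effort way to start is to look at what the driver was doing just before the first ERROR entry. In the log above, the last entry before the SIGTERM is the PYTHONPATH setup at 10:42:41,862, and the SIGTERM arrives at 10:43:09,061, roughly 27 seconds later with nothing logged in between. The sketch below measures that gap for any saved copy of the log. It assumes the entry header format shown above (`LEVEL [date time,millis]`); `gap_before_first_error` is a hypothetical helper name, not a Zeppelin or Spark tool.

```python
import re
from datetime import datetime

# Matches the entry header used in the log above, with or without a
# leading "> " quote marker, e.g. "ERROR [2019-09-30 10:43:09,061] ..."
ENTRY = re.compile(
    r"^(?:>\s?)?\s*(INFO|WARN|ERROR)\s+"
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]"
)


def gap_before_first_error(lines):
    """Return seconds between the first ERROR entry and the entry before it.

    Returns None if there is no ERROR entry or nothing precedes it.
    """
    previous = None
    for line in lines:
        match = ENTRY.match(line)
        if not match:
            continue  # continuation lines and stack frames have no header
        level, stamp = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
        if level == "ERROR":
            if previous is None:
                return None
            return (when - previous).total_seconds()
        previous = when
    return None
```

A long silent gap right after the PySpark setup lines would be consistent with the Python process never reporting back, but it does not prove it; the extra logging suggested at the top of this message is still needed to confirm.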
> Please consider the environment before printing this email. This message
> and any attachments are intended for the addressee named and may contain
> legally privileged/confidential/copyright information. If you are not the
> intended recipient, you should not read, use, disclose, copy or distribute
> this communication. If you have received this message in error please
> notify us at once by return email and then delete both messages. We accept
> no liability for the distribution of viruses or similar in electronic
> communications. This notice should not be removed.
> --
> Best Regards
> Jeff Zhang

Best Regards

Jeff Zhang
