Hi Dan,

Here's my environment:
CDH Version: 5.5.0-1.cdh5.5.0.p0.8
Kudu Version: 0.7.1-1.kudu0.7.1.p0.36

Steps to reproduce:

1. Create the Kudu table:

  CREATE TABLE t1 (
    id bigint
  )
  TBLPROPERTIES(
    'storage_handler' = 'com.cloudera.kudu.hive.KuduStorageHandler',
    'kudu.table_name' = 't1',
    'kudu.master_addresses' = 'master1:7051,master2:7051',
    'kudu.key_columns' = 'id',
    'kudu.num_tablet_replicas' = '5'
  );

2. Insert some values:

  insert into t1 values (1),(2),(3),(4),(5);

3. Start the spark-shell and run a query:

  $ spark-shell --jars lib/interface-annotations-0.7.1.jar,lib/kudu-client-0.7.1.jar,lib/kudu-mapreduce-0.7.1.jar,lib/kudu-spark-0.7.1.jar

  scala> sqlContext.read
           .format("org.kududb.spark")
           .options(Map("kudu.table" -> "t1", "kudu.master" -> "master1:7051,master2:7051"))
           .load()
           .registerTempTable("t1")

  scala> sqlContext.sql("select id from t1").count

4. Exit the spark-shell with Ctrl-D.

While the spark-shell is shutting down, the last thing it prints is:

  16/03/15 11:48:17 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.

and then the process hangs forever until Ctrl-C is pressed. I shouldn't have to clean up the sqlContext manually, right? The stack dump is attached.

On Tue, Mar 15, 2016 at 9:56 AM, Dan Burkert <d...@cloudera.com> wrote:

> Hi Darren,
>
> I think the thread dump would be helpful. We have a very similar test in
> the repository, and we haven't had any problems with it. What environment
> are you running the job in?
>
> - Dan
>
> On Mon, Mar 14, 2016 at 8:20 AM, Darren Hoo <darren....@gmail.com> wrote:
>
>> I use sqlContext to register the kudu table:
>>
>>   sqlContext.read
>>     .format("org.kududb.spark")
>>     .options(Map("kudu.table" -> table, "kudu.master" -> kuduMaster))
>>     .load()
>>     .registerTempTable(table)
>>
>> then do some querying and processing:
>>
>>   sqlContext.sql("...")
>>
>> but after sc.stop() is called, the spark driver never exits:
>>
>> 16/03/14 22:54:51 INFO DAGScheduler: Stopping DAGScheduler
>> 16/03/14 22:54:51 INFO YarnClientSchedulerBackend: Shutting down all executors
>> 16/03/14 22:54:51 INFO YarnClientSchedulerBackend: Interrupting monitor thread
>> 16/03/14 22:54:51 INFO YarnClientSchedulerBackend: Asking each executor to shut down
>> 16/03/14 22:54:51 INFO YarnClientSchedulerBackend: Stopped
>> 16/03/14 22:54:51 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
>> 16/03/14 22:54:51 INFO MemoryStore: MemoryStore cleared
>> 16/03/14 22:54:51 INFO BlockManager: BlockManager stopped
>> 16/03/14 22:54:51 INFO BlockManagerMaster: BlockManagerMaster stopped
>> 16/03/14 22:54:51 INFO SparkContext: Successfully stopped SparkContext
>> 16/03/14 22:54:51 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
>> 16/03/14 22:54:51 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
>> 16/03/14 22:54:51 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
>> 16/03/14 22:54:51 INFO Remoting: Remoting shut down
>> 16/03/14 22:54:51 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
>>
>> and then it's stuck there forever.
>>
>> PS: I use Spark on YARN in client mode; the problem only occurs when I
>> use kudu-spark.
>>
>> I have the thread dump, about 7 KB after gzipping; I can post it here if
>> asked.
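P.S. While waiting on the analysis, here's what I plan to try next in the shell to narrow this down. This is only a sketch of my own, not anything from the Kudu docs: it lists whatever non-daemon threads are still alive after sc.stop() (those are what keep the driver JVM from exiting), and then forces the JVM down with sys.exit(0) as a blunt workaround.

  // After sc.stop(), list the non-daemon threads that are still alive;
  // these are what prevent the driver JVM from exiting on its own.
  import scala.collection.JavaConverters._
  Thread.getAllStackTraces.keySet.asScala
    .filter(t => t.isAlive && !t.isDaemon)
    .foreach(t => println(s"non-daemon thread still alive: ${t.getName}"))

  // Blunt workaround, not a fix: force the JVM to exit. Shutdown hooks
  // still run, but the JVM no longer waits on any lingering threads.
  sys.exit(0)

If the leftover threads turn out to belong to the Kudu client, I suspect the real fix is for kudu-spark to shut its client down when the SparkContext stops.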
[Attachment: stacks.txt.gz]