I was testing out the spark thrift jdbc server by running a simple query in
the beeline client. The spark itself is running on a yarn cluster.

However, when I run a query in beeline -> I see no running jobs in the
spark UI(completely empty) and the yarn UI seem to indicate that the
submitted query is being run as a map reduce job. This is probably also
being indicated from the spark logs but I am not completely sure:

2014-11-11 00:19:00,492 INFO  ql.Context
(Context.java:getMRScratchDir(267)) - New scratch dir is
hdfs://xxxxxxxx:9000/tmp/hive-ubuntu/hive_2014-11-11_00-19-00_367_3847629323646885865-1

2014-11-11 00:19:00,877 INFO  ql.Context
(Context.java:getMRScratchDir(267)) - New scratch dir is
hdfs://xxxxxxxx:9000/tmp/hive-ubuntu/hive_2014-11-11_00-19-00_367_3847629323646885865-2

2014-11-11 00:19:04,152 INFO  ql.Context
(Context.java:getMRScratchDir(267)) - New scratch dir is
hdfs://xxxxxxxx:9000/tmp/hive-ubuntu/hive_2014-11-11_00-19-00_367_3847629323646885865-2

2014-11-11 00:19:04,425 INFO  Configuration.deprecation
(Configuration.java:warnOnceIfDeprecated(1009)) - mapred.submit.replication
is deprecated. Instead, use mapreduce.client.submit.file.replication

2014-11-11 00:19:04,516 INFO  client.RMProxy
(RMProxy.java:createRMProxy(92)) - Connecting to ResourceManager
at xxxxxxxx:8032

2014-11-11 00:19:04,607 INFO  client.RMProxy
(RMProxy.java:createRMProxy(92)) - Connecting to ResourceManager
at xxxxxxxx:8032

2014-11-11 00:19:04,639 WARN  mapreduce.JobSubmitter
(JobSubmitter.java:copyAndConfigureFiles(150)) - Hadoop command-line option
parsing not performed. Implement the Tool interface and execute your
application with ToolRunner to remedy this

2014-11-11 00:00:08,806 INFO  input.FileInputFormat
(FileInputFormat.java:listStatus(287)) - Total input paths to process :
14912

2014-11-11 00:00:08,864 INFO  lzo.GPLNativeCodeLoader
(GPLNativeCodeLoader.java:<clinit>(34)) - Loaded native gpl library

2014-11-11 00:00:08,866 INFO  lzo.LzoCodec (LzoCodec.java:<clinit>(76)) -
Successfully loaded & initialized native-lzo library [hadoop-lzo rev
8e266e052e423af592871e2dfe09d54c03f6a0e8]

2014-11-11 00:00:09,873 INFO  input.CombineFileInputFormat
(CombineFileInputFormat.java:createSplits(413)) - DEBUG: Terminated node
allocation with : CompletedNodes: 1, size left: 194541317

2014-11-11 00:00:10,017 INFO  mapreduce.JobSubmitter
(JobSubmitter.java:submitJobInternal(396)) - number of splits:615

2014-11-11 00:00:10,095 INFO  mapreduce.JobSubmitter
(JobSubmitter.java:printTokens(479)) - Submitting tokens for job:
job_1414084656759_0115

2014-11-11 00:00:10,241 INFO  impl.YarnClientImpl
(YarnClientImpl.java:submitApplication(167)) - Submitted application
application_1414084656759_0115


It seems like the query is being run as a hive query instead of spark
query. The same query works fine when run from spark-sql cli.

Reply via email to