Adding to it the job status at the UI:
Stage Id               : 1
Description            : select ename from employeetest (kill <http://impetus-d951centos:4040/stages/stage/kill?id=1&terminate=true>) collect at SparkPlan.scala:84 <http://impetus-d951centos:4040/stages/stage?id=1&attempt=0> +details
Submitted              : 2016/01/29 04:20:06
Duration               : 3.0 min
Tasks: Succeeded/Total : 0/2
Input / Output / Shuffle Read / Shuffle Write : (empty)

Getting the below stack trace on the Spark UI:

org.apache.spark.rdd.RDD.collect(RDD.scala:813)
org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:84)
org.apache.spark.sql.DataFrame.collect(DataFrame.scala:887)
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:178)
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:218)
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:233)
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:344)
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:744)

Regards
Sanjiv Singh
Mob : +091 9990-447-339

On Thu, Jan 28, 2016 at 9:57 PM, @Sanjiv Singh <sanjiv.is...@gmail.com> wrote:
> Any help on this.
>
> Regards
> Sanjiv Singh
> Mob : +091 9990-447-339
>
> On Wed, Jan 27, 2016 at 10:25 PM, @Sanjiv Singh <sanjiv.is...@gmail.com> wrote:
>
>> Hi Ted,
>> It's a typo.
>>
>> Regards
>> Sanjiv Singh
>> Mob : +091 9990-447-339
>>
>> On Wed, Jan 27, 2016 at 9:13 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> In the last snippet, temptable is shown by the 'show tables' command,
>>> yet you queried tampTable.
>>>
>>> I believe this was just a typo :-)
>>>
>>> On Wed, Jan 27, 2016 at 7:07 AM, @Sanjiv Singh <sanjiv.is...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I have configured Spark to query a Hive table.
>>>>
>>>> I run the Thrift JDBC/ODBC server using the below command:
>>>>
>>>> *cd $SPARK_HOME*
>>>> *./sbin/start-thriftserver.sh --master spark://myhost:7077 --hiveconf hive.server2.thrift.bind.host=myhost --hiveconf hive.server2.thrift.port=9999*
>>>>
>>>> I am also able to connect through beeline:
>>>>
>>>> *beeline>* !connect jdbc:hive2://192.168.145.20:9999
>>>> Enter username for jdbc:hive2://192.168.145.20:9999: root
>>>> Enter password for jdbc:hive2://192.168.145.20:9999: impetus
>>>> *beeline>*
>>>>
>>>> It does not give query results on the Hive table through the Spark JDBC server, but it works with Spark's HiveContext. See the complete scenario explained below.
>>>>
>>>> Help me understand why Spark SQL over JDBC is not giving a result.
>>>>
>>>> Below are the version details:
>>>>
>>>> *Hive version : 1.2.1*
>>>> *Hadoop version : 2.6.0*
>>>> *Spark version : 1.3.1*
>>>>
>>>> Let me know if you need other details.
>>>>
>>>> *Created a Hive table, inserted some records, and queried it:*
>>>>
>>>> *beeline> !connect jdbc:hive2://myhost:10000*
>>>> Enter username for jdbc:hive2://myhost:10000: root
>>>> Enter password for jdbc:hive2://myhost:10000: ******
>>>> *beeline> create table tampTable(id int, name string) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES('transactional'='true');*
>>>> *beeline> insert into table tampTable values (1,'row1'),(2,'row2'),(3,'row3');*
>>>> *beeline> select name from tampTable;*
>>>> name
>>>> ---------
>>>> row1
>>>> row3
>>>> row2
>>>>
>>>> *Query through the Spark HiveContext:*
>>>>
>>>> SparkConf sparkConf = new SparkConf().setAppName("JavaSparkSQL");
>>>> SparkContext sc = new SparkContext(sparkConf);
>>>> HiveContext hiveContext = new HiveContext(sc);
>>>> DataFrame teenagers = hiveContext.sql("SELECT name FROM tampTable");
>>>> List<String> teenagerNames = teenagers.toJavaRDD().map(new Function<Row, String>() {
>>>>     @Override
>>>>     public String call(Row row) {
>>>>         return "Name: " + row.getString(0);
>>>>     }
>>>> }).collect();
>>>> for (String name : teenagerNames) {
>>>>     System.out.println(name);
>>>> }
>>>> teenagers.toJavaRDD().saveAsTextFile("/tmp1"); // was teenagers2, which is not defined anywhere
>>>> sc.stop();
>>>>
>>>> This works perfectly and gives all names from the table *tempTable*.
>>>>
>>>> *Query through the Spark SQL JDBC server:*
>>>>
>>>> *beeline> !connect jdbc:hive2://myhost:9999*
>>>> Enter username for jdbc:hive2://myhost:9999: root
>>>> Enter password for jdbc:hive2://myhost:9999: ******
>>>> *beeline> show tables;*
>>>> *temptable*
>>>> *..other tables*
>>>> beeline> *SELECT name FROM tampTable;*
>>>>
>>>> I can list the tables through "show tables", but when I run the query it either hangs or returns nothing.
>>>>
>>>> Regards
>>>> Sanjiv Singh
>>>> Mob : +091 9990-447-339
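For anyone reproducing the JDBC path programmatically, the beeline session above can also be driven from plain Java with the Hive JDBC driver. This is only a minimal sketch under stated assumptions: the class name `ThriftJdbcQuery` is made up for illustration, the host, port, and credentials are the ones mentioned in the thread, and the standard Hive driver class `org.apache.hive.jdbc.HiveDriver` (from the hive-jdbc jar) is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ThriftJdbcQuery {

    // Builds the same hive2 JDBC URL that beeline connects to in the thread.
    static String url(String host, int port) {
        return "jdbc:hive2://" + host + ":" + port;
    }

    public static void main(String[] args) {
        // Host/port taken from the thread's thriftserver config; adjust for your cluster.
        String jdbcUrl = url("myhost", 9999);
        System.out.println("connecting to " + jdbcUrl);
        try {
            // The Hive JDBC driver (hive-jdbc and its dependencies) must be on the classpath.
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(jdbcUrl, "root", "impetus");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT name FROM tampTable")) {
                while (rs.next()) {
                    System.out.println("Name: " + rs.getString(1));
                }
            }
        } catch (ClassNotFoundException e) {
            System.out.println("hive-jdbc driver not on classpath: " + e.getMessage());
        } catch (SQLException e) {
            System.out.println("query failed: " + e.getMessage());
        }
    }
}
```

If this client shows the same symptom as beeline (hangs or returns nothing while `show tables` works), that would confirm the problem is on the Thrift server side rather than in any one client.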