[ https://issues.apache.org/jira/browse/SPARK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504409#comment-14504409 ]
pin_zhang commented on SPARK-6923:
----------------------------------

Hi Michael,

We run a Spark app on Spark 1.3 and use the CLIService in HiveServer2 to get the table schema. The call stack for getting the schema is below:

HiveMetaStore$HMSHandler.get_fields(String, String) line: 2873
HiveMetaStore$HMSHandler.get_schema(String, String) line: 2946
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
Method.invoke(Object, Object...) line: 606
RetryingHMSHandler.invoke(Object, Method, Object[]) line: 105
$Proxy9.get_schema(String, String) line: not available
HiveMetaStoreClient.getSchema(String, String) line: 1269
GetColumnsOperation.run() line: 139
HiveSessionImplwithUGI(HiveSessionImpl).getColumns(String, String, String, String) line: 359
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
Method.invoke(Object, Object...) line: 606
HiveSessionProxy.invoke(Method, Object[]) line: 79
HiveSessionProxy.access$000(HiveSessionProxy, Method, Object[]) line: 37
HiveSessionProxy$1.run() line: 64
AccessController.doPrivileged(PrivilegedExceptionAction<T>, AccessControlContext) line: not available [native method]
Subject.doAs(Subject, PrivilegedExceptionAction<T>) line: 415
UserGroupInformation.doAs(PrivilegedExceptionAction<T>) line: 1548
Hadoop23Shims(HadoopShimsSecure).doAs(UserGroupInformation, PrivilegedExceptionAction<T>) line: 493
HiveSessionProxy.invoke(Object, Method, Object[]) line: 60
$Proxy17.getColumns(String, String, String, String) line: not available
SparkSQLCLIService(CLIService).getColumns(SessionHandle, String, String, String, String) line: 309
ThriftBinaryCLIService(ThriftCLIService).GetColumns(TGetColumnsReq) line: 433
TCLIService$Processor$GetColumns<I>.getResult(I, GetColumns_args) line: 1433
TCLIService$Processor$GetColumns<I>.getResult(Object, TBase) line: 1418
TCLIService$Processor$GetColumns<I>(ProcessFunction<I,T>).process(int, TProtocol, TProtocol, I) line: 39
TSetIpAddressProcessor<I>(TBaseProcessor<I>).process(TProtocol, TProtocol) line: 39
TSetIpAddressProcessor<I>.process(TProtocol, TProtocol) line: 55
TThreadPoolServer$WorkerProcess.run() line: 206
ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) line: 1145
ThreadPoolExecutor$Worker.run() line: 615
Thread.run() line: 745

Shouldn't this method return the same table schema as the hctx.table("tableName").schema you mentioned?
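
To make the mismatch concrete from the client side, here is a minimal sketch of the two JDBC paths involved. The GetColumns call in the stack above is what serves DatabaseMetaData.getColumns, while the query path reads ResultSetMetaData. The class and method names (SchemaCheck, columnsFromCatalog, columnsFromQuery, missingFromCatalog) are made up for illustration, and obtaining the Connection (driver, URL, credentials) is left to the caller since it depends on the deployment. The exact labels a driver returns for the query path may also differ by configuration.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark or Hive.
class SchemaCheck {

    // Column names as reported by the catalog (DatabaseMetaData.getColumns).
    static List<String> columnsFromCatalog(Connection conn, String table) throws SQLException {
        List<String> names = new ArrayList<>();
        try (ResultSet rs = conn.getMetaData().getColumns(null, null, table, "%")) {
            while (rs.next()) {
                names.add(rs.getString("COLUMN_NAME"));
            }
        }
        return names;
    }

    // Column labels as reported by the metadata of an actual query result.
    // The table name is concatenated as-is, so pass trusted input only.
    static List<String> columnsFromQuery(Connection conn, String table) throws SQLException {
        List<String> names = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("select * from " + table)) {
            ResultSetMetaData md = rs.getMetaData();
            for (int i = 1; i <= md.getColumnCount(); i++) { // JDBC columns are 1-based
                names.add(md.getColumnLabel(i));
            }
        }
        return names;
    }

    // Columns visible through the query path but absent from the catalog path.
    static List<String> missingFromCatalog(List<String> catalogCols, List<String> queryCols) {
        List<String> missing = new ArrayList<>(queryCols);
        missing.removeAll(catalogCols);
        return missing;
    }
}
```

With the values from this issue, the catalog path yields [col] and the query path yields [id, age], so missingFromCatalog returns [id, age]. An empty list would mean both paths agree.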

> Get invalid hive table columns after save DataFrame to hive table
> -----------------------------------------------------------------
>
>                 Key: SPARK-6923
>                 URL: https://issues.apache.org/jira/browse/SPARK-6923
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0
>            Reporter: pin_zhang
>
> HiveContext hctx = new HiveContext(sc);
> List<String> sample = new ArrayList<String>();
> sample.add( "{\"id\": \"id_1\", \"age\":1}" );
> RDD<String> sampleRDD = new JavaSparkContext(sc).parallelize(sample).rdd();
> DataFrame df = hctx.jsonRDD(sampleRDD);
> String table="test";
> df.saveAsTable(table, "json",SaveMode.Overwrite);
> Table t = hctx.catalog().client().getTable(table);
> System.out.println( t.getCols());
> --------------------------------------------------------------
> When the code above is used to save a DataFrame to a Hive table,
> getting the table columns returns a single column named 'col':
> [FieldSchema(name:col, type:array<string>, comment:from deserializer)]
> The expected field schema is id, age.
> As a result, the JDBC API cannot retrieve the table columns via ResultSet
> DatabaseMetaData.getColumns(String catalog, String schemaPattern, String
> tableNamePattern, String columnNamePattern),
> but the ResultSet metadata for the query "select * from test" contains the
> fields id and age.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)