Re: Spark - HiveContext - Unstructured Json

Cheng Lian Tue, 21 Oct 2014 18:12:34 -0700

You can resort to |SQLContext.jsonFile(path: String, samplingRatio: Double)| and set |samplingRatio| to 1.0, so that all the columns can be inferred.
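
To illustrate why the sampling ratio matters, here is a minimal plain-Python sketch. It is not Spark's implementation. It models schema inference as the union of top-level keys over the sampled rows. The function name |infer_columns| and the sample data are made up for illustration.

```python
import json
import random


def infer_columns(json_lines, sampling_ratio=1.0, seed=0):
    """Return the sorted union of top-level keys seen in the sampled rows."""
    rng = random.Random(seed)
    columns = set()
    for line in json_lines:
        # Skip a row with probability (1 - sampling_ratio).
        if rng.random() >= sampling_ratio:
            continue
        columns.update(json.loads(line).keys())
    return sorted(columns)


# Hypothetical input: rows whose columns vary from row to row.
rows = [
    '{"id": 1, "name": "a"}',
    '{"id": 2, "name": "b", "email": "b@example.com"}',
    '{"id": 3, "tags": ["x"]}',
]

# With a ratio of 1.0 every row is inspected, so no column is missed.
all_columns = infer_columns(rows, sampling_ratio=1.0)
print(all_columns)
```

With a ratio below 1.0, a column that appears only in unsampled rows would be absent from the inferred schema.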

You can also use |SQLContext.applySchema| to specify your own schema (which is a |StructType|).
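
As a rough plain-Python sketch of what supplying your own schema implies for the data (this is not the Spark API): each JSON record is projected onto a fixed, ordered list of columns, missing fields become None, and undeclared fields are dropped. |SCHEMA| and |to_row| are hypothetical names.

```python
import json

# Hypothetical schema: ordered (column name, Python type) pairs.
SCHEMA = [("id", int), ("name", str), ("email", str)]


def to_row(json_line, schema=SCHEMA):
    """Project one JSON record onto the fixed schema as a positional tuple."""
    record = json.loads(json_line)
    row = []
    for name, typ in schema:
        value = record.get(name)
        # Missing or null fields stay None; present ones are coerced.
        row.append(typ(value) if value is not None else None)
    return tuple(row)


fixed_rows = [to_row(line) for line in [
    '{"id": 1, "name": "a"}',
    '{"id": "2", "name": "b", "email": "b@example.com", "extra": true}',
]]
print(fixed_rows)
```

Every row then has the same shape, so the table definition never needs to change, at the cost of ignoring columns that were not declared up front.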


On 10/22/14 5:56 AM, Harivardan Jayaraman wrote:

Hi,
I have unstructured JSON as my input, which may have extra columns from row to row. I want to store these JSON rows using HiveContext so that they can be accessed from the JDBC Thrift Server. I notice there are primarily only two methods available on the SchemaRDD for storing data: saveAsTable and insertInto. One defines the schema while the other can be used to insert into the table, but there is no way to alter the table and add columns to it.
How do I do this?
One option that I thought of is to write native "CREATE TABLE..." and "ALTER TABLE.." statements, but that does not seem feasible because at every step I would need to query Hive to determine the current schema and decide whether I should add columns to it or not.
Any thoughts? Has anyone been able to do this?
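
For reference, the compare-and-alter step described above could be sketched roughly as follows in plain Python. It assumes the current column list has already been fetched from Hive by some other means. The helper names, the table name |events| and the blanket STRING column type are all hypothetical.

```python
def missing_columns(current_columns, row_keys):
    """Return keys in the row that the table does not have yet (case-insensitive)."""
    known = {c.lower() for c in current_columns}
    return sorted(k for k in set(row_keys) if k.lower() not in known)


def alter_table_statement(table, new_columns, column_type="STRING"):
    """Build a Hive ALTER TABLE ... ADD COLUMNS statement, or None if nothing to add."""
    if not new_columns:
        return None
    cols = ", ".join("{} {}".format(c, column_type) for c in new_columns)
    return "ALTER TABLE {} ADD COLUMNS ({})".format(table, cols)


# Hypothetical current schema and incoming row keys.
current = ["id", "name"]
incoming = ["id", "email", "tags"]

to_add = missing_columns(current, incoming)
statement = alter_table_statement("events", to_add)
print(statement)
```

The sketch makes the cost visible: the comparison has to run against an up-to-date column list for every batch whose keys might differ.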
