Hi all,
I want to read a JSON file and query it with SQL.
The data structure should be:
bid: string (nullable = true)
code: string (nullable = true)
and the JSON file data should look like:
     {"bid":"MzI4MTI5MzcyNw==","code":"罗甸网警"}
     {"bid":"MzI3MzQ5Nzc2Nw==","code":"西早君"}
But in fact my JSON file data is:
    {"bizs":[{"bid":"MzI4MTI5MzcyNw==","code":"罗甸网警"},{"bid":"MzI3MzQ5Nzc2Nw==","code":"西早君"}]}
    {"bizs":[{"bid":"MzI4MTI5Mzcy00==","code":"罗甸网警"},{"bid":"MzI3MzQ5Nzc201==","code":"西早君"}]}

I load it with Spark, and the schema looks like this:
root
 |-- bizs: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- bid: string (nullable = true)
 |    |    |-- code: string (nullable = true)

I can select the columns with: df.select("bizs.id","bizs.name")
but each column value is an array:
+--------------------+--------------------+
|                  id|                code|
+--------------------+--------------------+
|[4938200, 4938201...|[罗甸网警, 室内设计师杨焰红, ...|
|[4938300, 4938301...|[SDCS十全九美, 旅梦长大, ...|
|[4938400, 4938401...|[日重重工液压行走回转, 氧老家,...|
|[4938500, 4938501...|[PABXSLZ, 陈少燕, 笑蜜...|
|[4938600, 4938601...|[税海微云, 西域美农云家店, 福...|
+--------------------+--------------------+

What I want is to read the columns as normal rows, one row per array element. How can I do that?
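
To make the target shape concrete, here is a minimal sketch of the flattening being asked for, written in plain Python rather than Spark so it runs without a cluster. The helper name flatten_bizs is hypothetical, and the two sample lines are the nested records quoted above. The Spark counterpart is assumed to be the explode function from pyspark.sql.functions, shown only as a comment at the end.

```python
import json

# Two sample lines in the nested shape of the actual file (values copied from the post).
lines = [
    '{"bizs":[{"bid":"MzI4MTI5MzcyNw==","code":"罗甸网警"},{"bid":"MzI3MzQ5Nzc2Nw==","code":"西早君"}]}',
    '{"bizs":[{"bid":"MzI4MTI5Mzcy00==","code":"罗甸网警"},{"bid":"MzI3MzQ5Nzc201==","code":"西早君"}]}',
]


def flatten_bizs(json_lines):
    """Yield one (bid, code) tuple per element of each line's "bizs" array.

    Hypothetical helper for illustration; a missing or null "bizs" yields nothing.
    """
    for line in json_lines:
        record = json.loads(line)
        for biz in record.get("bizs") or []:
            yield (biz.get("bid"), biz.get("code"))


rows = list(flatten_bizs(lines))
for bid, code in rows:
    print(bid, code)

# Assumed PySpark analogue (not run here): turn each array element into its own
# row, then pull the struct fields out as ordinary columns.
#   from pyspark.sql.functions import explode
#   df.select(explode("bizs").alias("biz")).select("biz.bid", "biz.code")
```

Each input line holds two array elements, so the two lines become four flat rows, which is the "normal row" layout described at the top of this message.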

2016-10-19


lk_spark 
