There is a function called explode for this. It produces one output row per element of an array column.
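
A minimal PySpark sketch of that approach, assuming df was loaded with spark.read.json and has the bizs schema shown in the quoted message below. The function names are made up for illustration. The second helper is only a plain-Python mirror of what explode does to each input line, so the expected result can be checked without a Spark installation.

```python
import json


def explode_with_spark(df):
    """Return one row per element of the `bizs` array column.

    Assumes `df` is a PySpark DataFrame with the schema from the quoted
    message (bizs: array<struct<bid: string, code: string>>).
    """
    # Imported lazily so the plain-Python helper below works without Spark.
    from pyspark.sql.functions import col, explode

    # explode() emits one output row per array element; rows whose array is
    # null or empty produce no output (explode_outer would keep them).
    exploded = df.select(explode(col("bizs")).alias("biz"))
    # Each `biz` is a struct, so its fields can be selected by dotted path.
    return exploded.select("biz.bid", "biz.code")


def explode_rows(json_lines):
    """Plain-Python mirror of the same flattening, one JSON object per line."""
    rows = []
    for line in json_lines:
        record = json.loads(line)
        # A missing, null, or empty array contributes no rows.
        for biz in record.get("bizs") or []:
            rows.append((biz.get("bid"), biz.get("code")))
    return rows
```

The DataFrame returned by explode_with_spark has plain string columns bid and code, so it could then be registered as a temporary view for SQL queries or written out with df.write, depending on what is needed.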

From: lk_spark [mailto:lk_sp...@163.com]
Sent: Wednesday, October 19, 2016 9:06 AM
To: user.spark
Subject: how to extract arraytype data to file

Hi all,
I want to read a JSON file and query it with SQL.
The data structure should be:
bid: string (nullable = true)
code: string (nullable = true)
and the JSON file data should look like:
     {"bid":"MzI4MTI5MzcyNw==","code":"罗甸网警"}
     {"bid":"MzI3MzQ5Nzc2Nw==","code":"西早君"}
But in fact my JSON file data is:
    {"bizs":[{"bid":"MzI4MTI5MzcyNw==","code":"罗甸网警"},{"bid":"MzI3MzQ5Nzc2Nw==","code":"西早君"}]}
    {"bizs":[{"bid":"MzI4MTI5Mzcy00==","code":"罗甸网警"},{"bid":"MzI3MzQ5Nzc201==","code":"西早君"}]}
When I load it with Spark, the data schema looks like this:
root
 |-- bizs: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- bid: string (nullable = true)
 |    |    |-- code: string (nullable = true)

I can select columns with: df.select("bizs.id","bizs.name")
but the column values are of array type:
+--------------------+--------------------+
|                  id|                code|
+--------------------+--------------------+
|[4938200, 4938201...|[罗甸网警, 室内设计师杨焰红, ...|
|[4938300, 4938301...|[SDCS十全九美, 旅梦长大, ...|
|[4938400, 4938401...|[日重重工液压行走回转, 氧老家,...|
|[4938500, 4938501...|[PABXSLZ, 陈少燕, 笑蜜...|
|[4938600, 4938601...|[税海微云, 西域美农云家店, 福...|
+--------------------+--------------------+

What I want is to read each column as normal rows, one value per row. How can I do it?
2016-10-19
________________________________
lk_spark
