[jira] [Commented] (SPARK-31686) Return of String instead of array in function get_json_object

2020-05-15 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108024#comment-17108024
 ] 

Hyukjin Kwon commented on SPARK-31686:
--

Please don't reopen the JIRA. {{get_json_object}} doesn't know whether the 
contents are a JSON array or an object until it actually executes. Spark is 
lazy, so it must know the return type before execution.
If you know the type in advance, you can use other expressions such as 
{{from_json}}.

> Return of String instead of array in function get_json_object
> -
>
> Key: SPARK-31686
> URL: https://issues.apache.org/jira/browse/SPARK-31686
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.5
> Environment: {code:json}
> {
>   "customer": {
>     "addresses": [
>       {
>         "location": "arizona"
>       }
>     ]
>   }
> }
> {code}
>  get_json_object(string(customer),'$.addresses[*].location')
> returns "arizona"
> expected result:
> ["arizona"]
>Reporter: Touopi Touopi
>Priority: Major
>
> When selecting a node of a JSON object that is an array, and the array 
> contains one element, get_json_object returns a string wrapped in 
> " characters instead of a one-element array.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31686) Return of String instead of array in function get_json_object

2020-05-14 Thread Touopi Touopi (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107344#comment-17107344
 ] 

Touopi Touopi commented on SPARK-31686:
---

I don't really understand the purpose of changing the return type.
{code:sql}
select
  v1.brandedcustomernumber as brandedcustomernumber
from
  uniquecustomer.UniqueCustomer
  lateral view explode(
    from_json(
      get_json_object(string(brandedCustomerInfoAggregate), '$.brandedCustomers[*].customerNumber'),
      'array'
    )
  ) v1 as brandedcustomernumber
{code}
Look at this example.
Since I am using the wildcard [*], I can get 0..n elements back.
Luckily, my brandedCustomerInfoAggregate object has more than one 
brandedCustomers element, so get_json_object returns, for instance, 
["customer1","customer2"].

The explode function expects an array, so what happens when only one 
brandedCustomers element is filled in?

The value is returned as a string (I actually discovered the " characters 
added around it), like "customer1", and from_json breaks.

I expect that when the path contains [*], parsing and node selection should 
return an array.
For now, when another query returns one element, I wrap the result in an 
array and cast it to a string:

(from_json(cast(array(get_json_object(string(customer),'$.addresses[*].location')) as string),'array'))

But the results are wrong when more than one element is returned.
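
One way to cope with both shapes is to normalize the returned string before exploding it. The following is a minimal sketch in plain Python, not Spark code: as_json_array is a hypothetical helper name, and it assumes the input is the raw string that get_json_object produced (an array for several matches, a bare JSON value for one match, nothing for no match).

```python
import json

def as_json_array(raw):
    """Normalize a get_json_object-style result to a Python list.

    Hypothetical helper for illustration; not part of Spark.
    """
    if raw is None:
        # No match at all: treat as an empty array.
        return []
    value = json.loads(raw)
    # Several matches arrive as an array; a single match arrives unwrapped.
    return value if isinstance(value, list) else [value]

print(as_json_array('["customer1","customer2"]'))
print(as_json_array('"customer1"'))
```

This cannot tell a single match that is itself an array from several scalar matches, so it is only safe when the selected field is known to hold scalars.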




[jira] [Commented] (SPARK-31686) Return of String instead of array in function get_json_object

2020-05-13 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106084#comment-17106084
 ] 

Hyukjin Kwon commented on SPARK-31686:
--

Yes, you don't know the output type before actually parsing, and the type 
must be known before execution. It's by design.




[jira] [Commented] (SPARK-31686) Return of String instead of array in function get_json_object

2020-05-12 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105939#comment-17105939
 ] 

daile commented on SPARK-31686:
---

[~bruneltouopi] It looks like the outer array was removed deliberately:
{code:scala}
val buf = buffer.getBuffer
if (dirty > 1) {
  g.writeRawValue(buf.toString)
} else if (dirty == 1) {
  // remove outer array tokens
  g.writeRawValue(buf.substring(1, buf.length()-1))
} // else do not write anything
{code}
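
The effect of that branch can be reproduced outside Spark. The following is a minimal sketch in plain Python, assuming that dirty counts the elements matched by the [*] step and that the buffer holds them as one JSON array; render_wildcard_matches is a hypothetical name, not a Spark function.

```python
import json

def render_wildcard_matches(matches):
    """Model of the quoted branch; `matches` are the values [*] selected."""
    dirty = len(matches)
    # The matches are first buffered as a single JSON array text.
    buf = json.dumps(matches, separators=(",", ":"))
    if dirty > 1:
        return buf        # several matches: keep the array as-is
    if dirty == 1:
        return buf[1:-1]  # one match: strip the outer [ and ]
    return None           # no match: nothing is written

print(render_wildcard_matches(["customer1", "customer2"]))
print(render_wildcard_matches(["arizona"]))
```

With one match, only the brackets are stripped, so the quotes of the string value remain, which matches the "arizona" result reported in the issue.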
