[jira] [Commented] (SPARK-31686) Return of String instead of array in function get_json_object
[ https://issues.apache.org/jira/browse/SPARK-31686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108024#comment-17108024 ]

Hyukjin Kwon commented on SPARK-31686:
--------------------------------------

Please don't reopen the JIRA. {{get_json_object}} doesn't know whether the contents are a JSON array or an object before the actual execution. Spark is lazy, so Spark has to know the type before the execution. If you know the type, you can use other expressions such as {{from_json}}.

> Return of String instead of array in function get_json_object
> -------------------------------------------------------------
>
> Key: SPARK-31686
> URL: https://issues.apache.org/jira/browse/SPARK-31686
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.5
> Environment: {code:json}
> // code placeholder
> {
> customer:{
> addesses:[ { {code}
> location : arizona
> }
> ]
> }
> }
> get_json_object(string(customer),'$addresses[*].location')
> return "arizona"
> result expected should be
> ["arizona"]
> Reporter: Touopi Touopi
> Priority: Major
>
> When selecting a node of a JSON object that is an array, and the array
> contains one element, get_json_object returns a String with " characters
> instead of an array of one element.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-31686) Return of String instead of array in function get_json_object
[ https://issues.apache.org/jira/browse/SPARK-31686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107344#comment-17107344 ]

Touopi Touopi commented on SPARK-31686:
---------------------------------------

I don't really understand the purpose of changing the return type.
{code:sql}
select v1.brandedcustomernumber as brandedcustomernumber
from uniquecustomer.UniqueCustomer
lateral view explode(from_json(get_json_object(string(brandedCustomerInfoAggregate), '$.brandedCustomers[*].customerNumber'), 'array')) v1 as brandedcustomernumber
{code}
Look at this example. Since I am using the wildcard [*], I can have 0..n elements returned. Luckily, my brandedCustomerInfoAggregate object has more than one brandedCustomers element, so the result of the get_json_object function will be ["customer1","customer2"], for instance.

Now, the explode function expects an array, so what happens if I have just one brandedCustomers element filled? The value is returned as a String (I actually discovered the " characters added to the string), like "customer1", and the from_json function will break.

I expect that if the path contains [*], the parsing and node selection should return an array.
Currently, when one element is returned for another query, I wrap the result in an array and cast it to a string:

(from_json(cast(array(get_json_object(string(customer),'$.addresses[*].location')) as string),'array'))

But the results are wrong when more than one element is returned.
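The single-element problem described in this comment comes down to normalizing the returned string before it is exploded. Below is a minimal sketch in plain Python, not Spark code. The helper name `as_json_array` is hypothetical, and it assumes the input is the string that get_json_object returned, or None when nothing matched.

```python
import json


def as_json_array(raw):
    """Return the get_json_object result as a Python list.

    Hypothetical helper: `raw` is assumed to be the string returned for a
    `[*]` path, or None when the path matched nothing.
    """
    if raw is None:
        return []
    value = json.loads(raw)
    # Several matches arrive as a JSON array; a single match arrives unwrapped.
    return value if isinstance(value, list) else [value]


print(as_json_array('["customer1","customer2"]'))
print(as_json_array('"customer1"'))
```

One caveat with this approach: if the single matched element is itself a JSON array, the helper cannot tell it apart from a list of several matches. Wrapping only when the value is not already a list is also what the unconditional array(...) workaround lacks when several elements come back.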
[jira] [Commented] (SPARK-31686) Return of String instead of array in function get_json_object
[ https://issues.apache.org/jira/browse/SPARK-31686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106084#comment-17106084 ]

Hyukjin Kwon commented on SPARK-31686:
--------------------------------------

Yes, you don't know the output type before actually parsing. The type should be known before the execution. It's by design.
[jira] [Commented] (SPARK-31686) Return of String instead of array in function get_json_object
[ https://issues.apache.org/jira/browse/SPARK-31686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105939#comment-17105939 ]

daile commented on SPARK-31686:
-------------------------------

[~bruneltouopi] looks like it was specifically removed
{code:java}
val buf = buffer.getBuffer
if (dirty > 1) {
  g.writeRawValue(buf.toString)
} else if (dirty == 1) {
  // remove outer array tokens
  g.writeRawValue(buf.substring(1, buf.length()-1))
} // else do not write anything
{code}
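To make the effect of that branch concrete, here is a rough plain-Python model of it. This is not the Spark implementation: `wildcard_field` and its parameters are hypothetical, it only covers a path shaped like `$.key[*].field`, and it assumes `dirty` is the number of matches written to the buffer.

```python
import json


def wildcard_field(json_text, array_key, field):
    """Model `$.array_key[*].field`, including the single-match unwrapping."""
    doc = json.loads(json_text)
    items = doc.get(array_key) if isinstance(doc, dict) else None
    if not isinstance(items, list):
        return None
    matches = [i[field] for i in items if isinstance(i, dict) and field in i]
    # The buffer always holds the matches serialized as one JSON array.
    buf = json.dumps(matches, separators=(",", ":"))
    dirty = len(matches)  # assumption: one increment per match
    if dirty > 1:
        return buf
    if dirty == 1:
        return buf[1:-1]  # remove outer array tokens
    return None  # nothing matched, so nothing is written


one = '{"addresses":[{"location":"arizona"}]}'
two = '{"addresses":[{"location":"arizona"},{"location":"utah"}]}'
print(wildcard_field(one, "addresses", "location"))
print(wildcard_field(two, "addresses", "location"))
```

With one address the model returns the quoted string "arizona" rather than ["arizona"], which matches the behavior reported in this issue. With two addresses the array brackets are kept.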