In general, the various language interfaces try to return the natural type
for the language: in Python we return lists; in Scala we return Seqs.
Arrays on the JVM have all sorts of messy semantics (e.g., they are
invariant and are not erased).
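
For illustration, a minimal Scala sketch of those two differences (the
values here are made up):

    val strings: Seq[String] = Seq("a", "b")
    val anys: Seq[Any] = strings          // compiles: Seq[+A] is covariant

    val strArr: Array[String] = Array("a", "b")
    // val anyArr: Array[Any] = strArr    // won't compile: Array[T] is invariant

    strArr.isInstanceOf[Array[String]]    // true: arrays keep their element
                                          // type at runtime (not erased)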


On Wed, Aug 27, 2014 at 5:34 PM, Du Li <l...@yahoo-inc.com> wrote:

>   I found this discrepancy when writing unit tests for my project.
> Basically, the expectation was that the returned type should match that of
> the input data. Although it’s easy to work around, it just felt a bit odd.
> Is there a better reason to return ArrayBuffer?
>
>
>   From: Michael Armbrust <mich...@databricks.com>
> Date: Wednesday, August 27, 2014 at 5:21 PM
> To: Du Li <l...@yahoo-inc.com>
> Cc: "user@spark.apache.org" <user@spark.apache.org>
> Subject: Re: SparkSQL returns ArrayBuffer for fields of type Array
>
>   Arrays in the JVM are also mutable. However, you should not rely on the
> exact type here. The only promise is that you will get back something of
> type Seq[_].
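>
> A minimal sketch of that defensive pattern (the query, table, and column
> names are illustrative):
>
>   val row = hiveContext.hql("SELECT arr_col FROM some_table").first()
>   row(0) match {
>     case xs: Seq[_] => xs.foreach(println)  // fine for ArrayBuffer, List, ...
>     case other      => sys.error("unexpected type: " + other.getClass)
>   }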
>
>
> On Wed, Aug 27, 2014 at 4:27 PM, Du Li <l...@yahoo-inc.com> wrote:
>
>>  Hi, Michael.
>>
>>  I used HiveContext to create a table with a field of type Array.
>> However, in the HQL results, this field was returned as type ArrayBuffer,
>> which is mutable. Would it make more sense for it to be an Array?
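>>
>> A rough sketch of the setup (from the spark-shell, where sc is the
>> SparkContext; table and column names are made up):
>>
>>   import org.apache.spark.sql.hive.HiveContext
>>   val hive = new HiveContext(sc)
>>   hive.hql("CREATE TABLE test_arr (xs ARRAY<INT>)")
>>   // ... load some rows into test_arr ...
>>   val row = hive.hql("SELECT xs FROM test_arr").first()
>>   row(0).getClass  // reports scala.collection.mutable.ArrayBuffer, not Array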
>>
>>  The Spark version in my test is 1.0.2. I haven’t tested it on
>> SQLContext or on a newer version of Spark yet.
>>
>>  Thanks,
>> Du
>>
>>
>>
>
