Hi Tariq,

Could you briefly describe the kind of operation you need to perform? I can
try to help you with that.
In general, if you need group-style operations, you can use window
operations on the DataFrame instead of collecting it.
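
For example, here is a rough sketch of a window operation that keeps the
latest record per key without calling collect() first. It is only an
illustration under assumptions: the column names (userId, ts, payload), the
sample rows, and the WindowSketch / latestPerUser names are made up, and it
uses the SparkSession API from Spark 2.x onward (as far as I know, window
functions on 1.x need a HiveContext).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number}

object WindowSketch {

  // Keep only the most recent row per userId, computed on the executors.
  // The column names "userId" and "ts" are hypothetical.
  def latestPerUser(df: DataFrame): DataFrame = {
    val w = Window.partitionBy(col("userId")).orderBy(col("ts").desc)
    df.withColumn("rn", row_number().over(w))
      .filter(col("rn") === 1)
      .drop("rn")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("window-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical rows standing in for the decoded Avro records.
    val df = Seq(("u1", 10L, "a"), ("u1", 20L, "b"), ("u2", 5L, "c"))
      .toDF("userId", "ts", "payload")

    // If rows must reach the driver, select the columns explicitly and
    // read fields by name rather than by position.
    latestPerUser(df)
      .select("userId", "ts", "payload")
      .collect()
      .foreach { row =>
        val userId = row.getAs[String]("userId")
        val ts = row.getAs[Long]("ts")
        println(s"$userId -> $ts")
      }

    spark.stop()
  }
}
```

If you do end up needing the rows on the driver, selecting the columns
explicitly and reading each field by name with getAs means the code no
longer depends on the position of a column in the Row.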

On Wed, Mar 2, 2016 at 6:40 PM, Mohammad Tariq <donta...@gmail.com> wrote:

> Hi Sainath,
>
> Thank you for the prompt response!
>
> Could you please elaborate on your answer a bit? I'm sorry, I didn't quite
> get it. What kind of operations can I perform using SQLContext? It just
> helps us with things like DF creation, schema application, etc., IMHO.
>
> Tariq, Mohammad
> about.me/mti
> <http://about.me/mti>
>
> On Thu, Mar 3, 2016 at 4:59 AM, Sainath Palla <pallasain...@gmail.com>
> wrote:
>
>> Instead of collecting the DataFrame, you can try running the operations
>> on it through the sqlContext, though it depends on what kind of operations
>> you are trying to perform.
>>
>> On Wed, Mar 2, 2016 at 6:21 PM, Mohammad Tariq <donta...@gmail.com>
>> wrote:
>>
>>> Hi list,
>>>
>>> *Scenario :*
>>> I am creating a DStream by reading an Avro object from a Kafka topic and
>>> then converting it into a DataFrame to perform some operations on the data.
>>> I call DataFrame.collect() and perform the intended operation on each Row
>>> of Array[Row] returned by DataFrame.collect().
>>>
>>> *Problem :*
>>> Calling DataFrame.collect() changes the schema of the underlying record,
>>> making it impossible to get the columns by index (as the column order
>>> changes).
>>>
>>> *Query :*
>>> Is this how DataFrame.collect() behaves, or am I doing something wrong
>>> here? In the former case, is there any way I can maintain the schema
>>> while getting each Row?
>>>
>>> Any pointers/suggestions would be really helpful. Many thanks!
>>>
>>> Tariq, Mohammad
>>> about.me/mti
>>> <http://about.me/mti>
>>>
>>
>>
>
