Hi,
I'm trying to integrate Pig with Cassandra.
My column family in Cassandra is:
name -> xxx
Age -> yyy
class -> zzz
This is how I load the data:
rows = LOAD 'cassandra://TestKeySpace/TestPig' USING CassandraStorage()
AS (key, columns:bag{column:tuple(name,value)});
Now I want to group by the value of the class column. I tried:
col_values = FOREACH rows GENERATE (columns.value) as list:bag{};
This gave me a result with the following schema: bag(:tuple(chararray))
For example, DUMP col_values gave me {(xxx),(yyy),(zzz)}
Now if I try to access the individual tuples in the bag:
list = FOREACH col_values GENERATE (list.$0, list.$1);
I get an undefined index access error, something like:
list.$1 doesn't exist: bag[:tuple(chararray)] has only one column
(but there are 3 tuples in the bag).
How can I access the data tuple by tuple in such cases?
I couldn't perform the group by on a single column because of this.
I tried TOTUPLE, but the problem is that it converts the entire bag into
a tuple and applies the group by on that.
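My current guess is that list.$1 means "the second field of every tuple
in the bag" rather than "the second tuple in the bag", which would
explain the error, since each tuple here has only one field. If that is
right, maybe flattening the bag and filtering on the column name is the
way to go. Below is a rough, untested sketch of what I have in mind. The
aliases cols, class_cols and grouped are names I made up, and I am
assuming the column name compares cleanly against the string 'class':

-- one output row per (key, name, value) instead of one bag per key
cols = FOREACH rows GENERATE key, FLATTEN(columns) AS (name, value);
-- keep only the column I want to group on (assumes name matches 'class')
class_cols = FILTER cols BY name == 'class';
-- group the remaining rows by the value of that column
grouped = GROUP class_cols BY value;

I have not confirmed that this works with CassandraStorage.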
Help me out
Regards,
Tamil