Hi all,

More info: https://issues.apache.org/jira/browse/CASSANDRA-5941

I tried this (and built Cassandra 1.2.9 from it), but it does not work for me:

    git clone http://git-wip-us.apache.org/repos/asf/cassandra.git
    cd cassandra
    git checkout cassandra-1.2
    patch -p1 < 5867-bug-fix-filter-push-down-1.2-branch.txt
    ant

Miguel Angel Martín Junquera
Analyst Engineer.
miguelangel.mar...@brainsins.com

2013/9/2 Miguel Angel Martin junquera <mianmarjun.mailingl...@gmail.com>

> Good, nice job!
>
> I had been testing with a UDF that only handles the string schema type;
> this is better and more elaborate work.
>
> Regards
>
> Miguel Angel Martín Junquera
> Analyst Engineer.
> miguelangel.mar...@brainsins.com
>
> 2013/8/31 Chad Johnston <cjohns...@megatome.com>
>
>> I threw together a quick UDF to work around this issue. It just extracts
>> the value portion of the tuple while taking advantage of the CqlStorage
>> generated schema to keep the type correct.
>>
>> You can get it here: https://github.com/iamthechad/cqlstorage-udf
>>
>> I'll see if I can find more useful information and open a defect, since
>> that's what this seems to be.
>>
>> Chad
>>
>> On Fri, Aug 30, 2013 at 2:02 AM, Miguel Angel Martin junquera <
>> mianmarjun.mailingl...@gmail.com> wrote:
>>
>>> I tried this:
>>>
>>> rows = LOAD 'cql://keyspace1/test?page_size=1&split_size=4&where_clause=age%3D30' USING CqlStorage();
>>> dump rows;
>>> ILLUSTRATE rows;
>>> describe rows;
>>>
>>> values2= FOREACH rows GENERATE TOTUPLE (id) as (mycolumn:tuple(name,value));
>>> dump values2;
>>> describe values2;
>>>
>>> But I get these results:
>>>
>>> -------------------------------------------------------------
>>> | rows   | id:chararray | age:int   | title:chararray |
>>> -------------------------------------------------------------
>>> |        | (id, 6)      | (age, 30) | (title, QA)     |
>>> -------------------------------------------------------------
>>>
>>> rows: {id: chararray,age: int,title: chararray}
>>> 2013-08-30 09:54:37,831 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>>> ERROR 1031: Incompatable field schema: left is
>>> "tuple_0:tuple(mycolumn:tuple(name:bytearray,value:bytearray))", right is
>>> "org.apache.pig.builtin.totuple_id_1:tuple(id:chararray)"
>>>
>>> or
>>>
>>> ....
>>> values2= FOREACH rows GENERATE TOTUPLE (id) ;
>>> dump values2;
>>> describe values2;
>>>
>>> and the results are:
>>>
>>> ...
>>> (((id,6)))
>>> (((id,5)))
>>> values2: {org.apache.pig.builtin.totuple_id_8: (id: chararray)}
>>>
>>> Aggg!!!!!
>>>
>>> Miguel Angel Martín Junquera
>>> Analyst Engineer.
>>> miguelangel.mar...@brainsins.com
>>>
>>> 2013/8/26 Miguel Angel Martin junquera <mianmarjun.mailingl...@gmail.com>
>>>
>>>> Hi Chad,
>>>>
>>>> I have this issue too.
>>>>
>>>> I sent a mail to the pig-user list, but I still cannot resolve this, and
>>>> I cannot access the column values.
>>>> In that mail I describe some things that I tried without results, and
>>>> give information about this issue.
>>>>
>>>> http://mail-archives.apache.org/mod_mbox/pig-user/201308.mbox/%3ccajeg_hq9s2po3_xytzx5xki4j1mao8q26jydg2wndy_kyiv...@mail.gmail.com%3E
>>>>
>>>> I hope someone replies with a comment, idea, or solution for this issue
>>>> or bug.
>>>>
>>>> I have reviewed the CqlStorage class in the Cassandra 1.2.8 code, but I
>>>> do not have an environment configured to debug and trace this issue.
>>>>
>>>> I only found some comments like the following, which I do not fully
>>>> understand:
>>>>
>>>> /**
>>>>  * A LoadStoreFunc for retrieving data from and storing data to
>>>>  * Cassandra
>>>>  *
>>>>  * A row from a standard CF will be returned as nested tuples:
>>>>  * (((key1, value1), (key2, value2)), ((name1, val1), (name2, val2))).
>>>>  */
>>>>
>>>> If you find an idea or solution, please post it.
>>>>
>>>> Thanks
>>>>
>>>> 2013/8/23 Chad Johnston <cjohns...@megatome.com>
>>>>
>>>>> (I'm using Cassandra 1.2.8 and Pig 0.11.1)
>>>>>
>>>>> I'm loading some simple data from Cassandra into Pig using CqlStorage.
>>>>> The CqlStorage loader defines a Pig schema based on the Cassandra schema,
>>>>> but it seems to be wrong.
>>>>>
>>>>> If I do:
>>>>>
>>>>> data = LOAD 'cql://bookdata/books' USING CqlStorage();
>>>>> DESCRIBE data;
>>>>>
>>>>> I get this:
>>>>>
>>>>> data: {isbn: chararray,bookauthor: chararray,booktitle:
>>>>> chararray,publisher: chararray,yearofpublication: int}
>>>>>
>>>>> However, if I DUMP data, I get results like these:
>>>>>
>>>>> ((isbn,0425093387),(bookauthor,Georgette Heyer),(booktitle,Death in
>>>>> the Stocks),(publisher,Berkley Pub Group),(yearofpublication,1986))
>>>>>
>>>>> Clearly the results from Cassandra are key/value pairs, as would be
>>>>> expected. I don't know why the schema generated by CqlStorage() would be
>>>>> so different.
>>>>>
>>>>> This is really causing me problems trying to access the column values.
>>>>> I tried a naive approach of FLATTENing each tuple, then trying to access
>>>>> the values that way:
>>>>>
>>>>> flattened = FOREACH data GENERATE
>>>>>   FLATTEN(isbn),
>>>>>   FLATTEN(booktitle),
>>>>>   ...
>>>>> values = FOREACH flattened GENERATE
>>>>>   $1 AS ISBN,
>>>>>   $3 AS BookTitle,
>>>>>   ...
>>>>>
>>>>> As soon as I try to access field $5, Pig complains about the index
>>>>> being out of bounds.
>>>>>
>>>>> Is there a way to solve the schema/reality mismatch? Am I doing
>>>>> something wrong, or have I stumbled across a defect?
>>>>>
>>>>> Thanks,
>>>>> Chad
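
For anyone reading this thread in the archive, the schema/data mismatch Chad describes can be modeled outside Pig. The sketch below is plain Python, not Pig Latin and not the CqlStorage API. The helper names `extract_values` and `to_record` are hypothetical, and the row is the sample from the DUMP output quoted in the thread. It only illustrates the idea described for the workaround UDF (keep the value portion of each (name, value) pair and line it up with the declared field names); it is not that UDF's actual implementation.

```python
# Field names as reported by DESCRIBE in the thread (a flat schema).
DECLARED_FIELDS = ["isbn", "bookauthor", "booktitle", "publisher", "yearofpublication"]

# Row shape as reported by DUMP in the thread: every field is a (name, value) pair.
SAMPLE_ROW = (
    ("isbn", "0425093387"),
    ("bookauthor", "Georgette Heyer"),
    ("booktitle", "Death in the Stocks"),
    ("publisher", "Berkley Pub Group"),
    ("yearofpublication", 1986),
)


def extract_values(row):
    """Return only the value portion of each (name, value) pair (hypothetical helper)."""
    return tuple(value for _name, value in row)


def to_record(row, fields):
    """Pair the declared field names with the extracted values (hypothetical helper)."""
    values = extract_values(row)
    if len(values) != len(fields):
        raise ValueError("row has %d values, schema declares %d fields"
                         % (len(values), len(fields)))
    return dict(zip(fields, values))


if __name__ == "__main__":
    print(extract_values(SAMPLE_ROW))
    print(to_record(SAMPLE_ROW, DECLARED_FIELDS))
```

In this model the declared schema describes what `extract_values` returns, while the loader actually hands back `SAMPLE_ROW`, which is why positional access against the declared schema does not behave as expected.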