Hmmm, that never calls the bytesToLong method, even with the type specified in 
the schema.  I wonder if it's that when a Cassandra validator is set on a 
column, Cassandra makes its best guess about the value's type, and that type 
(in this case Cassandra's LongType) may not be compatible with the Pig basic 
types.  So it never gets to the bytesToX methods of the load caster.  After 
talking with Brandon Williams a little, I think I may need to handle it at a 
higher level and do the appropriate casting there.
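
As a rough illustration of the kind of casting I mean, here is a minimal 
sketch, assuming a LongType value arrives as 8 raw big-endian bytes.  The 
class and method names are hypothetical, and this is plain Java rather than 
anything from CassandraStorage or the Pig LoadCaster API:

```java
import java.nio.ByteBuffer;

public class LongTypeDecode {
    // Hypothetical helper, not part of Pig or Cassandra.
    // Assumption: the column value is exactly 8 bytes, big-endian.
    static long bytesToLong(byte[] raw) {
        if (raw == null || raw.length != Long.BYTES) {
            throw new IllegalArgumentException("expected 8 bytes");
        }
        // ByteBuffer reads in big-endian order by default.
        return ByteBuffer.wrap(raw).getLong();
    }

    public static void main(String[] args) {
        // Build the 8-byte encoding of 42, then decode it again.
        byte[] raw = ByteBuffer.allocate(Long.BYTES).putLong(42L).array();
        System.out.println(bytesToLong(raw)); // prints 42
    }
}
```

If those same 8 bytes get treated as text instead of being decoded like 
this, that could explain the odd characters in the dump output.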

On Mar 24, 2011, at 1:42 PM, jacob wrote:

> Hmm. I bet I know what the issue is. It's not fun though. I'm thinking
> that loadcaster probably isn't even called unless you explicitly name
> the types in the schema declaration.
> 
> Try loading with:
> 
> rows = load 'cassandra://MyKeyspace/MyColumnFamily' using
> CassandraStorage() as (key:chararray, columns: bag{T:
> tuple(name:chararray, value:long)});
> 
> To see if it correctly treats the longtype values. This isn't good
> though since obviously not all of the values are longs. However, if it
> does work for the longtypes we know we're on the right track.
> 
> We may have to go in and explicitly check the types of each column and
> cast manually.
> 
> --jacob
> 
> On Thu, 2011-03-24 at 13:11 -0500, Jeremy Hanna wrote:
>> I see that there are a few LoadCaster implementations in pig 0.8.  There's 
>> the Utf8StorageConverter, the HBaseBinaryConverter, and a couple of others.
>> 
>> The HBaseStorage class uses the Utf8StorageConverter by default but can be 
>> configured to use the HBaseBinaryConverter.  Also it's just used as a 
>> LoadCaster and I don't see where it uses a StoreCaster at all - like the 
>> LoadFunc interface has a getLoadCaster method to override, but I can't find 
>> anything that has a getStoreCaster or getLoadStoreCaster method to override.
>> 
>> Anyway, so I'm using the Cassandra loadfunc and getting LongType data 
>> returned with some special characters and I thought it might be because I'm 
>> not using a LoadCaster to convert to Pig types.  So I tried both the 
>> Utf8StorageConverter and my own CassandraBinaryConverter (implementing 
>> LoadStoreCaster) to convert between Cassandra types and Pig basic types.  
>> Neither works, though, and I'm still getting the special characters when I 
>> dump to the console.
>> 
>> Any ideas on why LongTypes would be returning something like this: � as a 
>> value in a tuple?  It's showing up just as a normal Long value on the 
>> cassandra cli.  Oh, and I'm loading it with:
>> 
>> rows = load 'cassandra://MyKeyspace/MyColumnFamily' using CassandraStorage() 
>> as (key, columns: bag{T: tuple(name, value)});
>> A = limit rows 10;
>> dump A;
>> 
>> The value is the thing that is coming out seemingly encoded.
>> 
>> Thanks,
>> 
>> Jeremy
> 
> 
