I see that there are a few LoadCaster implementations in Pig 0.8. There's the Utf8StorageConverter, the HBaseBinaryConverter, and a couple of others.
The HBaseStorage class uses the Utf8StorageConverter by default but can be configured to use the HBaseBinaryConverter. It's only used as a LoadCaster, though, and I don't see where a StoreCaster is used at all: the LoadFunc interface has a getLoadCaster method to override, but I can't find anything that has a getStoreCaster or getLoadStoreCaster method to override.

Anyway, I'm using the Cassandra loadfunc and getting LongType data returned with some special characters, and I thought it might be because I'm not using a LoadCaster to convert to Pig types. So I tried both the Utf8StorageConverter and my own CassandraBinaryConverter (implementing LoadStoreCaster) to convert between Cassandra types and Pig basic types. Neither works, though, and I'm still getting the special characters when I dump to the console.

Any ideas on why LongTypes would be returning something like this: � as a value in a tuple? It's showing up as a normal Long value in the Cassandra CLI.

Oh, and I'm loading it with:

    rows = load 'cassandra://MyKeyspace/MyColumnFamily' using CassandraStorage() as (key, columns: bag{T: tuple(name, value)});
    A = limit rows 10;
    dump A;

The value field is the one that is coming out seemingly encoded.

Thanks,
Jeremy
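
To make the symptom concrete, here is a minimal Java sketch of the difference between decoding a long and printing its raw bytes as text. It assumes the value arrives as 8 raw big-endian bytes, which the post does not confirm; the class name LongBytesDemo and both helper methods are hypothetical and are not part of Pig or Cassandra.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LongBytesDemo {
    // Hypothetical helper: decode 8 big-endian bytes into a long.
    // Assumption: the stored value is serialized as exactly 8 bytes.
    static long bytesToLong(byte[] raw) {
        if (raw == null || raw.length != 8) {
            throw new IllegalArgumentException("expected exactly 8 bytes");
        }
        // ByteBuffer reads in big-endian order by default.
        return ByteBuffer.wrap(raw).getLong();
    }

    // Hypothetical helper: what a text-oriented path does with the same bytes.
    static String bytesAsText(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Build the 8-byte form of a sample long.
        byte[] raw = ByteBuffer.allocate(8).putLong(1234567890123L).array();

        // Decoded as a long, the original number comes back.
        System.out.println(bytesToLong(raw)); // prints 1234567890123

        // Decoded as UTF-8 text, the same bytes are mostly control
        // characters or invalid sequences, so a console shows junk.
        System.out.println(bytesAsText(raw));
    }
}
```

If that assumption holds, a text-based converter would not help, since the bytes are not the decimal digits of the number. A binary converter would need a bytesToLong along these lines, and the field would presumably also need a declared type in the load schema (for example value: long) before any cast is attempted.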