I am using Pig on Avro data files, and Avro in HBase.

Can you elaborate on what you mean by 'auto-load the schema'?  Do you mean
that the LOAD statement doesn't have to declare the schema?  I do this
with Avro data files to some extent (with limitations).

A working implementation of
https://issues.apache.org/jira/browse/AVRO-1124 seems to be the way to go
for tracking a mapping from something like a table or known file type to a
sequence of schemas (including the most recent one).  A Pig loader could
then load from HBase using the most recent schema from a named schema
group, or an Avro file loader could read files that belong to the same
schema group.
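
To make the idea concrete, here is a minimal sketch in Python of the kind of
mapping I have in mind: a named schema group that holds an ordered sequence
of schemas and can hand back the most recent one.  This is only an
illustration.  The SchemaRepo class, its method names, and the
"users_table" group are all hypothetical and are not the API proposed in
AVRO-1124; a real repository would also need persistence and compatibility
checks.

```python
import json
from collections import defaultdict


class SchemaRepo:
    """Hypothetical in-memory schema repository (not the AVRO-1124 API)."""

    def __init__(self):
        # group name -> list of canonical schema JSON strings, oldest first
        self._groups = defaultdict(list)

    def register(self, group, schema):
        """Add a schema to a group and return its 0-based id in that group.

        Registering a schema that is already present returns the existing id.
        """
        canonical = json.dumps(schema, sort_keys=True)
        schemas = self._groups[group]
        if canonical in schemas:
            return schemas.index(canonical)
        schemas.append(canonical)
        return len(schemas) - 1

    def latest(self, group):
        """Return the most recently registered schema for a group."""
        schemas = self._groups.get(group)
        if not schemas:
            raise KeyError("no schemas registered for group %r" % group)
        return json.loads(schemas[-1])

    def history(self, group):
        """Return every schema registered for a group, oldest first."""
        return [json.loads(s) for s in self._groups.get(group, [])]


# Two versions of an (assumed) Avro record schema for one table.
user_v1 = {"type": "record", "name": "User",
           "fields": [{"name": "id", "type": "long"}]}
user_v2 = {"type": "record", "name": "User",
           "fields": [{"name": "id", "type": "long"},
                      {"name": "email", "type": ["null", "string"],
                       "default": None}]}

repo = SchemaRepo()
repo.register("users_table", user_v1)
repo.register("users_table", user_v2)

# A loader would ask for this instead of having the schema declared by hand.
reader_schema = repo.latest("users_table")
```

A loader for HBase and a loader for Avro files could both call something
like latest() with the same group name, so both would read with the same
schema without it being written out in the script.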


On 8/22/12 8:37 PM, "Russell Jurney" <russell.jur...@gmail.com> wrote:


>
>Is anyone using Pig with Avro as the datatype in HBase? I want to
>auto-load the schema, and this seems the most direct way to do it.
>
>-- 
>Russell Jurney twitter.com/rjurney <http://twitter.com/rjurney>
>russell.jur...@gmail.com datasyndrome.com <http://datasyndrome.com/>

