If you are writing from Pig using HCatStorer, you don't need to create an HCatSchema. https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore#HCatalogLoadStore-Usage.1 has examples of how to do it.
So if you create a Hive table that uses ORC, you should be able to write your Pig relation to that table with a one-line command in your Pig script.

Eugene

On Sun, Apr 6, 2014 at 4:17 PM, Abhishek Girish <[email protected]> wrote:

> Hi,
>
> I am working on a custom Pig source code that writes RDF data into text
> files. I was looking to instead *write to an ORCFile* for some of the
> columnar benefits it offers.
>
> I understand that I need to use *HCatalog APIs*. I have an idea on how to
> create HCatSchema for my data. And that I would need to use the
> HCatOutputFormat for writing into ORCFile.
>
> I need some help on *how to specify the storage format as ORCFile.* I see
> that ORC has built-in support. But I cannot find any examples as to how to
> specify which output format the HCatalog APIs can write to (default Hive
> table or RCFile or ORCFile or Sequence File etc..).
>
> I would then need to work on reading from these ORCFiles and reconstruct
> the records.
>
> Any pointers would be appreciated. Thanks in advance.
>
> Regards,
> Abhishek
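For illustration, the approach described in the reply might look roughly like the sketch below. The table name `rdf_triples`, its three string columns, and the input path are hypothetical and not taken from the thread. The storage format is chosen in the Hive DDL (`STORED AS ORC`) rather than in the Pig script. The sketch also assumes a Hive version where HCatalog lives in the `org.apache.hive.hcatalog` package (older releases used `org.apache.hcatalog`) and that Pig is started with `pig -useHCatalog`.

```pig
-- Assumed to have been run once in Hive beforehand (hypothetical table):
--   CREATE TABLE rdf_triples (subject STRING, predicate STRING, object STRING)
--   STORED AS ORC;

-- Hypothetical input: tab-separated triples. The field names must match
-- the column names of the Hive table.
triples = LOAD 'rdf_input.tsv' USING PigStorage('\t')
          AS (subject:chararray, predicate:chararray, object:chararray);

-- The one-line write: HCatStorer picks up the schema and the ORC storage
-- format from the table definition in the metastore.
STORE triples INTO 'rdf_triples' USING org.apache.hive.hcatalog.pig.HCatStorer();

-- Reading the records back should work the same way through HCatLoader.
stored = LOAD 'rdf_triples' USING org.apache.hive.hcatalog.pig.HCatLoader();
```

Because the format is a property of the table, switching to RCFile or SequenceFile would only mean changing the `STORED AS` clause in the DDL; the Pig script would stay the same.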
