[ https://issues.apache.org/jira/browse/PIG-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793971#action_12793971 ]
Gerrit Jansen van Vuuren commented on PIG-1117:
-----------------------------------------------

OK, I will upload the 0.7.0 implementation today. It will still not have an implementation for fieldsToRead, just an empty method. I'll have a look at it after xmas.

> Pig reading hive columnar rc tables
> -----------------------------------
>
>                 Key: PIG-1117
>                 URL: https://issues.apache.org/jira/browse/PIG-1117
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.6.0
>            Reporter: Gerrit Jansen van Vuuren
>             Fix For: 0.7.0
>
>         Attachments: HiveColumnarLoader.patch, HiveColumnarLoaderTest.patch, PIG-1117.patch, PIG-117-v.0.6.0.patch
>
>
> I've coded a LoadFunc implementation that can read from Hive Columnar RC
> tables. This is needed for a project that I'm working on because all our data
> is stored using the Hive thrift serialized Columnar RC format. I have looked
> at the piggybank but did not find any implementation that could do this.
> We've been running it on our cluster for the last week and have worked out
> most bugs.
>
> There are still some improvements to be made, such as setting the number of
> mappers based on date partitioning. It has been optimized to read only
> specific columns, and can churn through a data set almost 8 times faster
> because not all column data is read.
> I would like to contribute the class to the piggybank. Can you guide me in
> what I need to do?
> I've used Hive-specific classes to implement this. Is it possible to add them
> to the piggybank Ivy build so that the dependencies are downloaded
> automatically?