Re: Approaches to storing arbitrary schema in a sequencefile

2012-09-15 Thread Mat Kelcey
I guess I was looking for a quick win for a simple flat schema; a serialisation format feels like a bit of overkill for what I'm doing. I might be able to just JSON my way out of this specific problem... Cheers! Mat On 15 September 2012 19:44, Dmitriy Ryaboy wrote: > We tend to write protobuf or thrif
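
A minimal sketch of the JSON route mentioned in this message, in Python purely for illustration: a flat record with named fields is packed into the two slots a SequenceFile record offers, a key and a value, with the field names carried inside the JSON value. The function names pack and unpack, the key_field parameter, and the sample fields are hypothetical and do not come from the thread; writing the pair to an actual SequenceFile is left out.

```python
import json


def pack(field_names, row, key_field):
    """Turn a flat row into a (key, value) pair of strings.

    The value is a JSON object, so the field names travel with the data.
    """
    if len(field_names) != len(row):
        raise ValueError("schema and row differ in length")
    record = dict(zip(field_names, row))
    key = str(record[key_field])
    # sort_keys keeps the encoded value stable regardless of field order
    value = json.dumps(record, sort_keys=True)
    return key, value


def unpack(value):
    """Recover the field-name-to-value mapping from the JSON value."""
    return json.loads(value)


# Hypothetical three-field record keyed on "id"
key, value = pack(["id", "name", "score"], [7, "mat", 1.5], "id")
record = unpack(value)
```

Repeating the field names in every record costs space, which may be acceptable for a small flat schema of the kind described here.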

How to force the script to finish the job and continue with the following script?

2012-09-15 Thread Haitao Yao
Hi all, I forgot the keyword that forces Pig to finish the job and then continue with the rest of the script. My job failed because of an OOME, so I want to split the job into smaller ones, but still written in a single Pig script (because the script is generated). Is there any keyw

Re: How can I split the data with more reducers?

2012-09-15 Thread Haitao Yao
No, I also thought it was a mapper, but it is definitely a reducer. All the mappers succeeded and the reducer failed. Haitao Yao yao.e...@gmail.com weibo: @haitao_yao Skype: haitao.yao.final On 2012-9-16, at 10:08 AM, Haitao Yao wrote: > Hi, > I've encountered a problem: the job failed beca

Re: Approaches to storing arbitrary schema in a sequencefile

2012-09-15 Thread Dmitriy Ryaboy
We tend to write protobuf or thrift definitions for complex objects, but that introduces severe latency into the development process. I suppose you could try something like kryo (and create a corresponding deserializer for EB). The core of the problem is that you need to carry around the schema, an

Approaches to storing arbitrary schema in a sequencefile

2012-09-15 Thread Mat Kelcey
Hey all, I've started using SequenceFiles more and more (in particular the elephant bird load and storage functions) and am wondering what the best approach is for marshaling between a schema from Pig (which can have some arbitrary number of fields) and a sequence file (which must have two fie

Re: Apache Pig slides from the

2012-09-15 Thread Dmitriy Ryaboy
Wow, that's a fantastic presentation Adam! Nice job on all the examples and slides. D On Sat, Sep 15, 2012 at 3:16 AM, Adam Kawa wrote: > Hi All, > > I would like to share my slides from the presentation about Apache Pig > that I gave at the 3rd meeting of WHUG (Warsaw Hadoop User Group) a > cou

Re: Apache Pig slides from the

2012-09-15 Thread Prashant Kommireddi
Thanks for sharing Adam! On Sep 15, 2012, at 3:16 AM, Adam Kawa wrote: > Hi All, > > I would like to share my slides from the presentation about Apache Pig > that I gave at the 3rd meeting of WHUG (Warsaw Hadoop User Group) a > couple of months ago. Here is a link > http://www.slideshare.net/Ada

Re: access schema defined in LOAD statement in custom LoadFunc?

2012-09-15 Thread Alan Gates
Unfortunately, no. I agree we should add that to the LoadFunc interface. Alan. On Sep 15, 2012, at 1:13 AM, Jim Donofrio wrote: > Is there any way within a LoadFunc to access the schema that a user defines > after AS in a LOAD statement? Is there some property I can access in the > UDFContext

Re: Requesting for guidance

2012-09-15 Thread pablomar
Easiest way? Install Cloudera's: https://ccp.cloudera.com/display/CDH4DOC/CDH4+Installation On Sat, Sep 15, 2012 at 2:31 AM, geethu ... wrote: > Sir, I'm going to use Pig for my project. I don't know how to download it. > Please can you tell me the steps to install Pig on Hadoop (Ubuntu OS). > > -- > T

Requesting for guidance

2012-09-15 Thread geethu ...
Sir, I'm going to use Pig for my project. I don't know how to download it. Please can you tell me the steps to install Pig on Hadoop (Ubuntu OS). -- Thanking You, Geetha R