I implemented the following methods in my MyCustomDataConverter, which extends AbstractWritableConverter<MyCustomData>:

    MyCustomData toWritable(DataByteArray value)
    Object bytesToObject(DataByteArray dataByteArray)
    ResourceFieldSchema getLoadSchema()
    Tuple toTuple(MyCustomData customData, ResourceFieldSchema schema)

Are there any more methods that I need to implement?

On Fri, Oct 28, 2011 at 4:09 PM, Gayatri Rao <[email protected]> wrote:

> I get the following error.
>
> My script:
>
> raw = LOAD 'MycustomData.seq' USING
>     com.twitter.elephantbird.pig.load.SequenceFileLoader(
>         '-c com.twitter.elephantbird.pig.load.MyCustomDataConverter',
>         '-c com.twitter.elephantbird.pig.load.NullWritableConverter');
> first = FOREACH raw GENERATE $0;
> userIds = FOREACH first GENERATE key.userId;
> dump userIds;
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias userIds
>     at org.apache.pig.PigServer.openIterator(PigServer.java:765)
>     at com.glassdoor.bigdata.pigUDF.storage.TestSequenceFileLoader.testLoad(TestSequenceFileLoader.java:54)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at junit.framework.TestCase.runTest(TestCase.java:168)
>     at junit.framework.TestCase.runBare(TestCase.java:134)
>     at junit.framework.TestResult$1.protect(TestResult.java:110)
>     at junit.framework.TestResult.runProtected(TestResult.java:128)
>     at junit.framework.TestResult.run(TestResult.java:113)
>     at junit.framework.TestCase.run(TestCase.java:124)
>     at junit.framework.TestSuite.runTest(TestSuite.java:232)
>     at junit.framework.TestSuite.run(TestSuite.java:227)
>     at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
>     at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:49)
>     at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>     at org.apache.pig.PigServer.openIterator(PigServer.java:755)
>     ... 20 more
>
> Thanks
> Gayatri
>
> On Fri, Oct 28, 2011 at 10:51 AM, Ashutosh Chauhan <[email protected]> wrote:
>
>> Please paste the error that you are getting.
>>
>> Ashutosh
>>
>> On Fri, Oct 28, 2011 at 05:49, Gayatri Rao <[email protected]> wrote:
>>
>>> Sorry, that was a bug in my writeFields method. It's fixed now and I am
>>> able to load and dump the data.
>>> In SequenceFileLoader I have defined the corresponding key converter and
>>> value converter classes.
>>>
>>> So, when I say
>>>
>>> raw = load 'in.txt' using SequenceFileLoader;
>>> dump raw
>>>
>>> it dumps the data, but when I want to project the fields, it gives an
>>> error. Do I have to explicitly specify the schema in load? Like:
>>>
>>> raw = load 'in.txt' using SequenceFileLoader as (t:(a:int, b:chararray,...))
>>>
>>> On Wed, Oct 26, 2011 at 1:27 PM, Dmitriy Ryaboy <[email protected]> wrote:
>>>
>>>> What do you expect to see, how did you create it, and what are the
>>>> weird values?
>>>> Any chance your compression settings are different for writing and
>>>> reading?
>>>>
>>>> On Tue, Oct 25, 2011 at 7:41 AM, Gayatri Rao <[email protected]> wrote:
>>>>
>>>>> Thanks Dmitriy.
>>>>> I was trying to implement the MyClassConverter for my custom class,
>>>>> and I overrode and implemented the method
>>>>>
>>>>>     @Override
>>>>>     public Object bytesToObject(DataByteArray dataByteArray) throws IOException {
>>>>>         MyClass o = (MyClass) ReflectionUtils.newInstance(MyClass.class, null);
>>>>>         o.readFields(new DataInputStream(
>>>>>                 new ByteArrayInputStream(dataByteArray.get())));
>>>>>         return o;
>>>>>     }
>>>>>
>>>>> and my MyClass.readFields is as follows:
>>>>>
>>>>>     @Override
>>>>>     public void readFields(DataInput in) throws IOException {
>>>>>         num = in.readInt();
>>>>>         list = new ArrayList<String>();
>>>>>         for (int i = 0; i < 3; i++) {
>>>>>             list.add(WritableUtils.readString(in));
>>>>>         }
>>>>>     }
>>>>>
>>>>> This puts some weird data in num and list. Any idea what I might be
>>>>> doing wrong?
>>>>>
>>>>> On Tue, Oct 25, 2011 at 9:03 AM, Dmitriy Ryaboy <[email protected]> wrote:
>>>>>
>>>>>> You can compile it with "ant -Dnothrift=true".
>>>>>>
>>>>>> There's also a "-Dnoprotobuf=true" option, but I just tried it and it
>>>>>> seems we do require protobufs in 1 place that's not excluded when we
>>>>>> skip protocol buffers, so you still need protoc version 2.3.
>>>>>>
>>>>>> D
>>>>>>
>>>>>> On Mon, Oct 24, 2011 at 6:52 PM, Gayatri Rao <[email protected]> wrote:
>>>>>>
>>>>>>> That's great, thanks, I'll check it out. Is thrift a dependency for
>>>>>>> building?
>>>>>>>
>>>>>>> On Mon, Oct 24, 2011 at 6:49 PM, Dmitriy Ryaboy <[email protected]> wrote:
>>>>>>>
>>>>>>>> Correct -- it's completely rewritten.
>>>>>>>>
>>>>>>>> We haven't published EB to a public maven repo, though I believe we
>>>>>>>> did add a "maven-install" ant target to publish to your local maven
>>>>>>>> repo.
>>>>>>>>
>>>>>>>> D
>>>>>>>>
>>>>>>>> On Mon, Oct 24, 2011 at 6:33 PM, Gayatri Rao <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I have checked the SequenceFileLoader from elephantbird and it
>>>>>>>>> seems to use a different SequenceFileLoader, as opposed to the one
>>>>>>>>> in piggybank. Is there any reason for that?
>>>>>>>>>
>>>>>>>>> On Mon, Oct 24, 2011 at 5:57 PM, Gayatri Rao <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks, Dmitriy. Are the jars available in a maven repository?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Gayatri
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 24, 2011 at 11:55 AM, Dmitriy Ryaboy <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> We have a massively improved (well, rewritten from scratch)
>>>>>>>>>>> SequenceLoader in elephantbird. Take a look here:
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/pig/load/SequenceFileLoader.java
>>>>>>>>>>>
>>>>>>>>>>> No separate readme on usage, but all the related classes are
>>>>>>>>>>> well-documented in Javadocs.
>>>>>>>>>>>
>>>>>>>>>>> D
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 24, 2011 at 12:55 AM, Daniel Dai <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I think it is SequenceFileLoader.translateWritableToPigDataType
>>>>>>>>>>>> which does not support custom writables. Try to enhance
>>>>>>>>>>>> translateWritableToPigDataType.
>>>>>>>>>>>>
>>>>>>>>>>>> Daniel
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Oct 24, 2011 at 12:21 AM, Gayatri Rao <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am trying to use the sequence file loader in piggybank for
>>>>>>>>>>>>> my custom writable object. I am working with Pig 0.8. It looks
>>>>>>>>>>>>> like it does not work for user-defined custom writables?
>>>>>>>>>>>>> Any pointers on how I can write a loader for my own custom
>>>>>>>>>>>>> writable?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Gayatri
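
A note on the readFields problem quoted earlier in this thread: "weird data" in the fields can happen when write and readFields are not exact mirror images of each other, which would be consistent with the writeFields bug mentioned above. Below is a minimal sketch of a symmetric pair. It is not the actual MyClass. It assumes plain java.io streams so that it runs without Hadoop on the classpath; SampleRecord and RoundTripDemo are hypothetical names, and writeUTF/readUTF stand in for WritableUtils.writeString/readString. Unlike the quoted readFields, which always reads 3 strings, this version stores the list length first so the reader does not depend on a fixed count.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a custom writable (not the MyClass from the thread).
class SampleRecord {
    int num;
    List<String> list = new ArrayList<String>();

    // Writes the int, then the list length, then each string.
    void write(DataOutput out) throws IOException {
        out.writeInt(num);
        out.writeInt(list.size());
        for (String s : list) {
            out.writeUTF(s);
        }
    }

    // Reads the fields back in exactly the order write() emitted them.
    void readFields(DataInput in) throws IOException {
        num = in.readInt();
        int size = in.readInt();
        list = new ArrayList<String>(size);
        for (int i = 0; i < size; i++) {
            list.add(in.readUTF());
        }
    }
}

class RoundTripDemo {
    // Serializes a record to bytes and deserializes it into a fresh instance.
    static SampleRecord roundTrip(SampleRecord original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            SampleRecord copy = new SampleRecord();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SampleRecord record = new SampleRecord();
        record.num = 42;
        record.list.add("a");
        record.list.add("b");
        SampleRecord copy = roundTrip(record);
        System.out.println(copy.num + " " + copy.list); // prints "42 [a, b]"
    }
}
```

A round trip like this in a unit test is a cheap way to check a writable before involving the loader and converter at all.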
