Hi,
I'm trying to read an Avro file I stored on HDFS, but I seem to be
hitting a snag. I'm hoping some of you will be able to shed some light
on this and let me continue my adventure!
REGISTER 'hdfs:///lib/avro-1.7.2.jar';
REGISTER 'hdfs:///lib/json-simple-1.1.1.jar';
REGISTER
It seems that everyone can build elephant-bird but me:
https://github.com/kevinweil/elephant-bird/issues/272
On Sun, Nov 18, 2012 at 7:31 PM, Arian Pasquali ar...@arianpasquali.com wrote:
I don't think you really need to build it;
you can find it in any Maven repository.
Arian Rodrigo
+1 as well, but I'd suggest we do the following:
- Keep mProtoTuple private and add protected getters/setters instead with
javadocs describing expected usage.
- Rename mProtoTuple and the getters/setters to something more descriptive
than mProtoTuple.
On Fri, Nov 16, 2012 at 2:15 PM, Dmitriy
sure. My initial (and dirty) idea changed only 2 lines. I completely agree
with you
On Mon, Nov 19, 2012 at 12:16 PM, Bill Graham billgra...@gmail.com wrote:
+1 as well, but I'd suggest we do the following:
- Keep mProtoTuple private and add protected getters/setters instead with
javadocs
Hi Bart,
Please try to print out the schema of 'avro' using 'DESCRIBE avro'. This
will show you the field names in the relation.
avro = load '/import/2012-01-04-deflate.avro' USING AvroStorage();
DESCRIBE avro;
Given your description, I suppose that changing 'trace.terminalid' to
Got it building. Are google collections and json-simple external deps?
On Mon, Nov 19, 2012 at 11:23 AM, Russell Jurney
russell.jur...@gmail.com wrote:
It seems that everyone can build elephant-bird but me:
https://github.com/kevinweil/elephant-bird/issues/272
On Sun, Nov 18, 2012 at 7:31
Talking to myself... never mind, guava and json-simple are included with
Pig.
On Mon, Nov 19, 2012 at 2:27 PM, Russell Jurney russell.jur...@gmail.com wrote:
Got it building. Are google collections and json-simple external deps?
On Mon, Nov 19, 2012 at 11:23 AM, Russell Jurney
Wait... com.twitter.elephantbird.pig.load.JsonLoader() does not infer the
schema from a record. This is what I was looking for. Looks like I have to
write that myself.
And yes, I understand the tradeoffs in doing so. Assuming a sample record
reflects the overall schema is a big assumption.
On Mon, Nov 19,
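For anyone writing that themselves: one rough way to derive a Pig-style schema string from a single sample record is sketched below. This is my own sketch, not elephant-bird's API; the JSON-to-Pig type mapping (int -> long, float -> double, object -> tuple, array -> bag) is an assumption, and it suffers exactly from the one-sample problem mentioned above.

```python
import json

def pig_type(value):
    """Map a Python value parsed from JSON to a Pig-style type name.
    The mapping rules here are an assumption, not elephant-bird's."""
    if isinstance(value, bool):  # check bool before int: True is an int in Python
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "chararray"
    if isinstance(value, dict):  # JSON object -> Pig tuple
        inner = ", ".join(f"{k}: {pig_type(v)}" for k, v in value.items())
        return "(" + inner + ")"
    if isinstance(value, list):  # JSON array -> Pig bag of single-field tuples
        elem = pig_type(value[0]) if value else "bytearray"
        return "{(" + elem + ")}"
    return "bytearray"

def infer_schema(sample_json):
    """Build a Pig-style schema string from one sample JSON record."""
    record = json.loads(sample_json)
    return ", ".join(f"{k}: {pig_type(v)}" for k, v in record.items())

schema = infer_schema('{"name": "bart", "visits": 3, "score": 1.5}')
# -> "name: chararray, visits: long, score: double"
```

Of course, any field that is null or absent in the sample record will be mistyped or missed entirely, which is the tradeoff being discussed here.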
Ok, it's even worse: my data is one big array.
Am I being negative in saying that JSON and Pig together are a nightmare?
On Mon, Nov 19, 2012 at 2:33 PM, Russell Jurney russell.jur...@gmail.com wrote:
Wait... com.twitter.elephantbird.pig.load.JsonLoader() does not infer the
schema from a record. This
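One workaround for a file that is one big top-level JSON array, assuming you can preprocess it before loading: flatten it to one JSON object per line, which line-oriented Pig loaders can handle. A minimal sketch (my own, not part of any loader):

```python
import json

def flatten_json_array(text):
    """Turn a top-level JSON array into newline-delimited JSON,
    one object per line, so a line-oriented loader can read it."""
    records = json.loads(text)
    return "\n".join(json.dumps(r) for r in records)

ndjson = flatten_json_array('[{"a": 1}, {"a": 2}]')
# -> '{"a": 1}\n{"a": 2}'
```

For data that doesn't fit in memory you would want a streaming JSON parser instead of json.loads, but the idea is the same.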
I also ran into the same dilemma. Here is something I found easier that is
working for me: I compiled some sources from http://www.json.org/java/
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.util.List;
import org.apache.pig.EvalFunc;
import
Hi Cheolsoo,
The patch works as expected. We've not seen one error in the
test system since we installed the new jar file.
We're only processing ~200 rows at the most when we run the script, not sure
if that helps you narrow down the cause.
I assume we just use the patch you gave
Make a JIRA and attach the patch, please.
2012/11/19 pablomar pablo.daniel.marti...@gmail.com
Hi all,
I made it as simple as I could. What about these changes?
PigStorage.java
original:
private void readField(byte[] buf, int start, int end) {
if (start == end) {
done. PIG-3057 https://issues.apache.org/jira/browse/PIG-3057
On Mon, Nov 19, 2012 at 6:32 PM, Jonathan Coveney jcove...@gmail.com wrote:
Make a JIRA and attach the patch, please.
2012/11/19 pablomar pablo.daniel.marti...@gmail.com
hi all,
I did it as simple as I could. What about
Hi Cheolsoo,
I think we need to upgrade JUnit to at least 4.8, since HBase 0.94 is using
junit-4.10, and we got the following warnings during 'ant ... javadoc'.
[javadoc]
org/apache/hadoop/hbase/mapreduce/TestWALPlayer.class(org/apache/hadoop/hbase/mapreduce:TestWALPlayer.class):
warning: Cannot
In pure Pig, you wouldn't do something like this. However, Pig supports
control flow in Python (I really should get around to making the JRuby
wrapper, but I digress). You can find docs for this on the Pig website.
Basically the control flow lives in Python, and you launch Pig jobs from there.
2012/11/19
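The control-flow pattern looks roughly like this in plain Python. This is a sketch of the loop structure only; run_pig_job is a hypothetical stand-in for compiling, binding, and running a script through Pig's Python embedding (with real embedding you would use Pig.compile(...).bind(...).runSingle() from org.apache.pig.scripting under Jython).

```python
def converge(run_pig_job, max_iterations=10, threshold=1.0):
    """Re-run a job until the delta it reports drops below threshold.
    run_pig_job is a hypothetical callable standing in for a real Pig run;
    in practice you would read the delta back from the job's output."""
    for i in range(max_iterations):
        delta = run_pig_job(i)
        if delta < threshold:
            return i  # converged on iteration i
    return max_iterations

# Fake job whose delta halves each run: 8, 4, 2, 1, 0.5, ...
iterations = converge(lambda i: 8.0 / (2 ** i))
# -> 4
```

The point is just that the looping and the stopping condition live in the host script, while each iteration launches an ordinary Pig job.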
In a different context, I was once stuck with the same problem but was able
to work around it using the bincond operator.
http://ofps.oreilly.com/titles/9781449302641/intro_pig_latin.html
Not sure how you would hack it in here, but I have a feeling it can be
pulled off.
On Mon, Nov 19, 2012 at 8:49
Hi Malcolm,
Thank you for sharing it. I am glad to hear that it worked. :-)
We're only processing ~200 rows at the most when we run the script, not
sure if that helps you narrow down the cause.
Very interesting. That's surprisingly small. In my test, I used 10m rows of
random integers as
Maybe we can run some of the unit tests in parallel.
At 2012-11-15 03:12:27, Johnny Zhang xiao...@cloudera.com wrote:
Hi, lulynn_2008:
I am not aware of how to shorten the time.
Johnny
On Tue, Nov 13, 2012 at 7:27 PM, lulynn_2008 lulynn_2...@163.com wrote:
Thanks.
Then my environment is normal.
Is
https://issues.apache.org/jira/browse/PIG-3059
I wanted to make sure people saw this JIRA, as I think it will
dramatically improve Pig. Discussion of this issue is available here:
http://www.quora.com/Big-Data/In-Big-Data-ETL-how-many-records-are-an-acceptable-loss
Russell Jurney