Re: How do you load data from S3 on Amazon EMR with Pig 0.10.0?

2012-06-21 Thread Russell Jurney
Oh, it is https://issues.apache.org/jira/browse/PIG-2539 On Thu, Jun 21, 2012 at 6:59 PM, Russell Jurney wrote: > cd s3://elasticmapreduce/samples/pig-apache/input/ > > 2012-06-22 01:58:56,685 [main] ERROR org.apache.pig.tools.grunt.Grunt - > ERROR 2999: Unexpected internal error. This file syste

Re: How do you load data from S3 on Amazon EMR with Pig 0.10.0?

2012-06-21 Thread Russell Jurney
cd s3://elasticmapreduce/samples/pig-apache/input/ 2012-06-22 01:58:56,685 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. This file system object (hdfs://10.4.115.51:9000) does not support access to the request path 's3://elasticmapreduce/samples/pig-apache

Re: Is there a loader that loads a file as a line?

2012-06-21 Thread Mohammad Tariq
Hello Jonathan, Have a look at Hadoop's WholeFileInputFormat. It might fit your requirements. Regards, Mohammad Tariq On Fri, Jun 22, 2012 at 3:39 AM, Prashant Kommireddi wrote: > I think you will need to implement a RecordReader/InputFormat of your own > for this and use it with a

Re: Is there a loader that loads a file as a line?

2012-06-21 Thread Prashant Kommireddi
I think you will need to implement a RecordReader/InputFormat of your own for this and use it with a LoadFunc. Not sure if Hadoop has a Reader that you could re-use for this. How do you handle the case when a file exceeds block size? On Thu, Jun 21, 2012 at 2:34 PM, Jonathan Coveney wrote: > It

Is there a loader that loads a file as a line?

2012-06-21 Thread Jonathan Coveney
It can even be a bytearray. Basically I have a bunch of files, and I want one file -> one row. Is there an easy way to do this? Or will I need to provide a special fileinputformat etc?
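
A minimal sketch of the "one file -> one row" idea, in plain Python rather than as a Pig LoadFunc or Hadoop InputFormat. The function name `load_whole_files` and the (filename, bytes) record shape are assumptions made for illustration; they are not part of Pig or Hadoop. Each file is read fully into memory, so this only suits files that are small enough, which is the concern raised in the replies about files that exceed the block size.

```python
import tempfile
from pathlib import Path


def load_whole_files(directory):
    """Yield one (filename, contents) record per regular file.

    Hypothetical helper: contents are returned as raw bytes, which
    mirrors loading each file as a single bytearray field.
    """
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            yield path.name, path.read_bytes()


# Usage: two small files become exactly two rows.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").write_bytes(b"first line\nsecond line\n")
    Path(tmp, "b.txt").write_bytes(b"only line\n")
    rows = list(load_whole_files(tmp))
    print(len(rows))
```

In a real Pig job the same shape would have to come from a custom InputFormat/RecordReader wrapped in a LoadFunc, as suggested in the replies.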

Re: Some proposals for Pig performance optimization

2012-06-21 Thread Thejas Nair
bcc'ing the user list. 1. Order-by The comparison against hive order-by is misleading. Hive does not do total ordering, unless you use a single reducer. But yes, in case of pig, the sampling phase is unnecessary, if you use a single reducer. A single reducer can make sense if the data you are

Re: Is it possible to implement transpose with PigLatin/any other MR language?

2012-06-21 Thread Robert Evans
That may be true if you have multiple reduces; I have not read through the code very closely. You can run it with a single reduce, or you can write a custom partitioner to do it. You only need to know the length of the column, and then you can divide them up appropriately, kind of like how

Re: Is it possible to implement transpose with PigLatin/any other MR language?

2012-06-21 Thread Norbert Burger
While it may be fine for many cases, if I'm reading the Nectar code correctly, that transpose doesn't guarantee anything about the order of rows within each column. In other words, transposing: a - b - c d - e - f g - h - i may give you different permutations of "a - d - g" as the first row, depe

Some proposals for Pig performance optimization

2012-06-21 Thread Jie Li
Hello everyone, I compiled a list of possible optimizations for Pig's performance. https://cwiki.apache.org/confluence/display/PIG/Pig+Performance+Optimization As I am not yet very familiar with the codebase, I'm likely to underestimate the complexity involved, so any input will be appreciated.

Re: Is it possible to implement transpose with PigLatin/any other MR language?

2012-06-21 Thread madhu phatak
Hi, It's possible in Map/Reduce. Look into the code here https://github.com/zinnia-phatak-dev/Nectar/tree/master/Nectar-regression/src/main/java/com/zinnia/nectar/regression/hadoop/primitive/mapreduce 2012/6/21 Subir S > Hi, > > Is it possible to implement transpose operation of rows into colu

Re: .eclipse.templates not found

2012-06-21 Thread Harsh J
Hi, You can't run that command unless your Pig is a checkout from the svn/git branch of 0.10 (or tag). So instead of unpacking a release there, do a proper checkout and then run ant eclipse-files. On Thu, Jun 21, 2012 at 1:39 PM, Keren Ouaknine wrote: > Hello, > > I am building from $PIG_HOME un

Re: Pig 0.10.0 and Hadoop 0.23

2012-06-21 Thread Johannes Schwenk
On 20.06.2012 20:21, Daniel Dai wrote: > Have you recompiled Pig using -Dhadoopversion=23? No, thanks, that solved the issue at hand! I have other errors though. My test cases return errors like http://pastebin.com/XpszYeu2 So I am still doing something wrong, it seems?! Thanks, Johannes > O

Is it possible to implement transpose with PigLatin/any other MR language?

2012-06-21 Thread Subir S
Hi, Is it possible to implement transpose operation of rows into columns and vice versa... i.e. col1 col2 col3 col4 col5 col6 col7 col8 col9 col10 col11 col12 can this be converted to col1 col4 col7 col10 col2 col5 col8 col11 col3 col6 col9 col12 Is this even possible with map reduce? If yes
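
A rough sketch of how a transpose can be phrased as map and reduce steps, simulated in plain Python rather than on Hadoop or in Pig Latin. The names `map_phase`, `reduce_phase` and `transpose` are hypothetical, and the 4 x 3 input layout is an assumption read off the flattened example in the question. Each cell is tagged with its row index so that the reduce side can restore row order within every output row, whatever order the pairs arrive in.

```python
from collections import defaultdict


def map_phase(rows):
    """Emit (column_index, (row_index, value)) for every cell."""
    for row_index, row in enumerate(rows):
        for col_index, value in enumerate(row):
            yield col_index, (row_index, value)


def reduce_phase(pairs):
    """Group cells by column index, then order each group by row index."""
    groups = defaultdict(list)
    for col_index, tagged in pairs:
        groups[col_index].append(tagged)
    return [
        [value for _, value in sorted(groups[col_index], key=lambda t: t[0])]
        for col_index in sorted(groups)
    ]


def transpose(rows):
    """Run the two phases back to back, as a single-process stand-in."""
    return reduce_phase(map_phase(rows))


# Assumed layout of the example: 4 rows of 3 columns.
table = [
    ["col1", "col2", "col3"],
    ["col4", "col5", "col6"],
    ["col7", "col8", "col9"],
    ["col10", "col11", "col12"],
]

for out_row in transpose(table):
    print(" ".join(out_row))
```

On a real cluster the row index has to come from the data itself or from a numbering pass over the input, since a mapper does not otherwise know the global position of a line.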

.eclipse.templates not found

2012-06-21 Thread Keren Ouaknine
Hello, I am building from $PIG_HOME under Hadoop 23: ant eclipse-files and I get the following error: BUILD FAILED /home/kereno/hadoop23/pig10/build.xml:301: /home/kereno/hadoop23/pig10/.eclipse.templates not found. I found a Jira issue 7653 with that problem, but the patch won't work for me: [ker