statement?
Best,
Ryan Xu
variable is not necessary... a simple instance variable is just
fine.
On Fri Dec 05 2014 at 2:27:53 PM Ryan freelanceflashga...@gmail.com
wrote:
After running it with updated code, it seems like the problem has to do
with something related to Tika since my output says that my input
,
Ryan
using a static variable inside the
ExtractTextFromPDFs function to store the PdfParser once it has been
initialized once? I'm still learning how to best do things within the
Pig/MapReduce/Hadoop framework
Ryan
On Fri, Dec 5, 2014 at 1:35 PM, Ryan freelanceflashga...@gmail.com wrote:
Thanks Pradeep
Vineet,
Pig 0.12 supports the IN clause for filtering
X = FILTER A BY (f1==8) OR (NOT (f2+f3 > f1)) OR (f1 IN (9, 10, 11));
Ryan
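For readers less familiar with Pig, here is a rough Python sketch of what that FILTER predicate keeps. It assumes the comparison reads f2+f3 > f1, and the sample rows are made up for illustration. Pig's null handling is not modelled here.

```python
# Illustrative (f1, f2, f3) rows; these values are invented for the sketch.
rows = [(1, 2, 3), (4, 2, 1), (8, 3, 4), (10, 6, 6), (12, 7, 7)]

def keep(row):
    """Mirror (f1==8) OR (NOT (f2+f3 > f1)) OR (f1 IN (9, 10, 11))."""
    f1, f2, f3 = row
    return f1 == 8 or not (f2 + f3 > f1) or f1 in (9, 10, 11)

# Keep only the rows for which the predicate is true, as FILTER would.
filtered = [row for row in rows if keep(row)]
print(filtered)
```

The row (10, 6, 6) survives only because of the IN clause, which is the part that needs Pig 0.12.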
On Thu, Oct 30, 2014 at 11:09 PM, Vineet Mishra clearmido...@gmail.com
wrote:
Hi Dan,
Thanks for your response, although
FILTER cat_ids BY (category_id == 1
I've found Twitter's elephantbird library very useful here
(https://github.com/kevinweil/elephant-bird)
a = LOAD 'file3.json' USING
com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad')
Will parse the JSON into a map
http://pig.apache.org/docs/r0.11.1/basic.html#map-schema the JSONArray
') AS
(json:map[]);
B = FOREACH A GENERATE
json#'col1' AS col1;
Ryan
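As a rough analogy in Python (not the elephant-bird API), loading a JSON record into a map and looking up a key behaves much like json#'col1' in Pig: a missing key yields a null rather than an error. The sample record and the key names other than col1 are assumptions made for this sketch.

```python
import json

# A made-up input line; only the key 'col1' comes from the thread.
line = '{"col1": "abc", "nested": {"inner": 42}}'

# Parse the line into a dict, comparable to loading it as (json:map[]).
record = json.loads(line)

# Key lookup, comparable to json#'col1'; .get returns None for absent keys.
col1 = record.get("col1")
inner = record.get("nested", {}).get("inner")
missing = record.get("col2")

print(col1, inner, missing)
```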
On Fri, Jul 25, 2014 at 4:55 PM, Satish Kolli feedwo...@gmail.com wrote:
Did you try the standard JsonLoader? I didn't personally use it but it
looks like you can specify the schema to extract/parse from your json
Upgraded to 12.1 and now I'm getting this whenever I try to REGISTER a
jar. I don't use slf4j, so I have no idea what's causing it. Has
anyone else run into it? My Hadoop version is cdh3u3.
Pig Stack Trace
---
ERROR 2998: Unhandled internal error.
Update: Turns out I'm getting it on 11.1 as well. Must be a problem
with something in my jar.
On Tue, May 27, 2014 at 1:26 PM, Ryan Compton compton.r...@gmail.com wrote:
Upgraded to 12.1 and now I'm getting this whenever I try to REGISTER a
jar. I don't use slf4j, so I have no idea what's
: Ryan Compton [mailto:compton.r...@gmail.com]
Sent: Tuesday, November 05, 2013 6:40 PM
To: user@pig.apache.org
Subject: Need example of python code with dependency files
I have some python code I'd like to deploy with a pig script. The .py
code takes input from sys.stdin and outputs to sys.stdout
I have some python code I'd like to deploy with a pig script. The .py
code takes input from sys.stdin and outputs to sys.stdout. It also
needs some parameter files to run properly.
The book Programming Pig tells me:
The workaround for this is to create a TAR file and ship that, and
then have a
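A minimal sketch of the kind of script described above: it reads tab-separated lines from an input stream, applies settings from a parameter file, and writes to an output stream. The key=value parameter format, the `prefix` setting, and both function names are hypothetical, chosen only for illustration.

```python
import io

def load_params(text):
    """Parse 'key = value' lines, skipping blanks and '#' comments (assumed format)."""
    params = {}
    for raw in text.splitlines():
        raw = raw.strip()
        if raw and not raw.startswith("#"):
            key, _, value = raw.partition("=")
            params[key.strip()] = value.strip()
    return params

def transform(in_stream, out_stream, params):
    """Prepend the hypothetical 'prefix' parameter to the first field of each line."""
    prefix = params.get("prefix", "")
    for line in in_stream:
        fields = line.rstrip("\n").split("\t")
        out_stream.write("\t".join([prefix + fields[0]] + fields[1:]) + "\n")

# Demo with in-memory streams; a deployed script would pass sys.stdin and
# sys.stdout, and read the parameter text from a file shipped with the script.
params = load_params("# demo settings\nprefix = id_\n")
out = io.StringIO()
transform(io.StringIO("a\t1\nb\t2\n"), out, params)
result = out.getvalue()
print(result, end="")
```

Keeping the stream handling in a function makes the script easy to check locally before it is run through Pig.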
It sounds like you have two problems: parsing JSON and joining the datasets.
For reading JSON you can use:
http://stackoverflow.com/questions/11035105/processing-json-through-pig-scripts/16501542#16501542
For matching the types you could filter for type1 then join against
the data_dict_1 and
I often do this, and then just register one giant .jar
<!-- Plugin to create a single jar that includes all dependencies -->
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.4</version>
I've been using twitter's elephantbird and have been very happy with
it so far. Here's an example of parsing a nested json with it:
json_eb = LOAD '$IN_DIRS' USING
com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') as
(json:map[]);
--parse json with twitter's library
parsed0 = FOREACH
I want to run a python script in a pig script.
Here's the .py script: http://pastebin.com/JB26B7BE
Here's the pig script: http://pastebin.com/JvD9t3Si
Here's what happens: http://pastebin.com/4YjENb5q
What could this be?
I can start a grunt shell just fine:
-bash-3.2$ pwd
/home/rfcompton/Downloads/pig-0.11.0-src
-bash-3.2$ ./bin/pig
2013-03-21 12:55:00,048 [main] INFO org.apache.pig.Main - Apache Pig
version 0.11.1-SNAPSHOT (rexported) compiled Mar 21 2013, 12:49:21
2013-03-21 12:55:00,049 [main] INFO
-examples-0.20.2-cdh3u3.jar
hadoop-test-0.20.2-cdh3u3.jar
hadoop-tools-0.20.2-cdh3u3.jar
but I still have the same problem. More info: http://pastebin.com/MfUHwu0X
On Thu, Mar 21, 2013 at 1:16 PM, Prashant Kommireddi
prash1...@gmail.com wrote:
Hi Ryan,
Seems like you are trying to connect to a hadoop
of pig -secretDebugCmd
On Thu, Mar 21, 2013 at 1:29 PM, Ryan Compton compton.r...@gmail.com wrote:
Hi,
Hmm, I've got that much:
-bash-3.2$ ls $HADOOP_HOME | grep cdh3u3
hadoop-0.20.2-cdh3u3-ant.jar
hadoop-0.20.2-cdh3u3-core.jar
hadoop-0.20.2-cdh3u3-examples.jar
hadoop-0.20.2-cdh3u3
-withouthadoop.jar instead of
/home/rfcompton/Downloads/pig-0.11.0-src/pig.jar and make sure 0.20.2
hadoop is on the classpath.
On Thu, Mar 21, 2013 at 1:36 PM, Ryan Compton compton.r...@gmail.com wrote:
-bash-3.2$ pig -secretDebugCmd
Find hadoop at /usr/bin/hadoop
dry run:
HADOOP_CLASSPATH
The classpath for Pig, correct?
Ryan
On Fri, Mar 23, 2012 at 4:00 AM, Sam William sa...@stumbleupon.com wrote:
Ryan,
This message is specific to HBase 0.92.1. Make sure HBase 0.90.1 jar
is not in the classpath before the 0.92.1 jar files
Sam
On Mar 22, 2012, at 8:20 PM, Ryan Cole
That was it! I don't think that I even had my HBase path on the
PIG_CLASSPATH, at all. I simply put HBase on the path and now it works.
Thank you,
Ryan
On Fri, Mar 23, 2012 at 10:02 AM, Ryan Cole r...@rycole.com wrote:
The classpath for Pig, correct?
Ryan
On Fri, Mar 23, 2012 at 4:00 AM
even the simplest query examples, using Pig, I get
the following error:
`ERROR 2017: Internal error creating job configuration.`
and, the log file has the following more specific error:
`Caused by: java.lang.IllegalArgumentException: Not a host:port pair:
�^@^@^@^P8948@ryan-serverlocalhost
or
not, though.
Ryan
On Mar 22, 2012, at 10:02 PM, Norbert Burger wrote:
Actually on second glance, this seems like an issue not with the HBase
config, but with the server:port info inside your .META. table. Have you
tried LOADing from a different table besides events? From the HBase
shell
PigStorage(',') AS
(item:chararray,number:int);
MAPPED = JOIN EXAMPLE_SOURCE BY number LEFT OUTER, MAPPINGS BY number;
PRETTY = FOREACH MAPPED GENERATE item, name;
DUMP PRETTY;
(a,one)
(c,one)
(a,two)
(b,two)
(d,three)
(d,four)
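The same left outer join can be sketched in Python. The input rows below are an assumption, reconstructed so that the result matches the output shown above; the extra row ("e", 5) is added only to show how an unmatched key comes back with a null. Pig does not guarantee output order, whereas this sketch keeps source order.

```python
from collections import defaultdict

# Assumed inputs, reconstructed from the DUMP output; ("e", 5) is an extra row.
example_source = [("a", 1), ("c", 1), ("a", 2), ("b", 2), ("d", 3), ("d", 4), ("e", 5)]
mappings = [(1, "one"), (2, "two"), (3, "three"), (4, "four")]

def left_outer_join(left, right):
    """Join (item, number) rows to (number, name) rows, keeping unmatched left rows."""
    names_by_number = defaultdict(list)
    for number, name in right:
        names_by_number[number].append(name)
    joined = []
    for item, number in left:
        # An unmatched number yields a single row with None, like a Pig null.
        for name in names_by_number.get(number, [None]):
            joined.append((item, name))
    return joined

pretty = left_outer_join(example_source, mappings)
for row in pretty:
    print(row)
```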
--
Ryan Hoegg
On Wed, Sep 14, 2011 at 3:27 PM, Eli Finkelshteyn iefin
,better)
(d,4,better)
--
Ryan Hoegg
On Wed, Sep 14, 2011 at 4:24 PM, Eli Finkelshteyn iefin...@gmail.com wrote:
Sorry, bad example, I guess. I want something I can do case statements
with. In this case I could map instead, but if I wanted to use less
straightforward cases (i.e. one case where
Is anyone familiar with getting the ivy dependencies to work on trunk?
Thanks again,
Ryan Hoegg
On Thu, Sep 8, 2011 at 4:31 PM, Ryan Hoegg ryan.ho...@gmail.com wrote:
Apache Ant(TM) version 1.8.2 compiled on June 3 2011
On Thu, Sep 8, 2011 at 4:12 PM, Daniel Dai da...@hortonworks.com wrote
: 'master'. It was required from
org.apache.pig#Pig;0.10.0-SNAPSHOT compile
[ivy:resolve] ::
Am I doing something wrong?
Thanks,
Ryan Hoegg