Re: Weird behaviour with string min/max

2015-03-31 Thread j.barrett Strausser
Could you share a portion of the data, or even just the offending values, e.g. a Min and Max pair that aren't correct, along with the values you expected to be the Min and Max? On Sun, Mar 29, 2015 at 7:46 AM, Ronald Green wrote: > I can share demo data to go with the script. Anyone has any clue? > > On

Re: Pig Schema contains a name that is not allowed in Avro

2013-10-21 Thread j.barrett Strausser
I'd imagine it is having an issue with the duplicate 'd' names? That is my uninformed guess though. On Mon, Oct 21, 2013 at 1:05 PM, Johannes Schwenk < johannes.schw...@adition.com> wrote: > Hi! > > I'm getting the following error when running the script below in grunt > (pig 0.12.0): > > DEFIN

Re: python modules using pig 12.0

2013-10-18 Thread j.barrett Strausser
Does the AWS EMR distro come with numpy either installed as a site package or in a virtual environment? Otherwise I'd think you'd need to do : apt-get install python-numpy or pip install numpy. On Fri, Oct 18, 2013 at 8:33 AM, Mark Olliver < mark.olli...@infectiousmedia.com> wrote: > Hi, > > Wh

Re: AvroStorage Issue - Possibly version related

2013-10-01 Thread j.barrett Strausser
> problem. When you go to hadoop 2.x, you will hit the same problem as > json-simple is not in 2.x hadoop installation. I actually hit the error you > mentioned with 1.x while trying 2.x. > > Regards, > Rohini > > > On Mon, Sep 30, 2013 at 6:09 PM, j.barrett Strausser <

Re: AvroStorage Issue - Possibly version related

2013-09-30 Thread j.barrett Strausser
> find solution but couldn't. If fixing classloading is not possible, easy > thing would be to change AvroStorage constructor to throw RunTimeException > instead of ParseException > > Regards, > Rohini > > > On Thu, Sep 19, 2013 at 1:10 PM, j.barrett Strausser < >

Re: AvroStorage Issue - Possibly version related

2013-09-19 Thread j.barrett Strausser
ed : json-simple-1.1.jar On Thu, Sep 19, 2013 at 3:14 PM, j.barrett Strausser < j.barrett.straus...@gmail.com> wrote: > Are the releases from the download page not compatible with 23.x? or 2.X > > Says they are - > http://pig.apache.org/releases.html#1+April%2C+2013%3A+release+0

Re: AvroStorage Issue - Possibly version related

2013-09-19 Thread j.barrett Strausser
Pig that wasn't compiled for > Hadoop 2.x/.23. Try recompiling with 'ant clean jar > -Dhadoopversion=23'. > > -Mark > > On Thu, Sep 19, 2013 at 9:23 AM, j.barrett Strausser > wrote: > > Running > > > > Hadoop-2.1.0-Beta > > Pig-0.11.1 > >

AvroStorage Issue - Possibly version related

2013-09-19 Thread j.barrett Strausser
Running Hadoop-2.1.0-Beta Pig-0.11.1 Hive-0.11.1 1. Created Avro backed table in Hive. 2. Loaded the table in Pig - records = Load '/path' USING org.apache.pig.piggybank.storage.avro.AvroStorage(); 3. Can successfully describe the relation. I registered the following on pig start : REGISTER pigg

Re: Merging files

2013-07-31 Thread j.barrett Strausser
That is what I was suggesting yes. On Wed, Jul 31, 2013 at 4:39 PM, Something Something < mailinglist...@gmail.com> wrote: > So you are saying, we will first do a 'hadoop count' to get the total # of > bytes for all files. Let's say that comes to: 1538684305 > > Default Block Size is: 128M
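The arithmetic being discussed in this thread can be sketched as follows. This is only an illustration using the figures quoted above (1538684305 total bytes, 128M block size); the desired number of output files (4 here) is a made-up value, and the assumption that the resulting number is what should be passed as `--max-file-blocks` is not confirmed by the thread.

```python
# Figures quoted in the thread.
TOTAL_BYTES = 1538684305
BLOCK_SIZE = 128 * 1024 * 1024  # 128M default block size


def ceil_div(numerator, denominator):
    """Integer ceiling division, avoiding floating point."""
    return -(-numerator // denominator)


def total_blocks(total_bytes, block_size=BLOCK_SIZE):
    """Number of blocks needed to hold total_bytes."""
    return ceil_div(total_bytes, block_size)


def blocks_per_output_file(total_bytes, num_output_files, block_size=BLOCK_SIZE):
    """Blocks each output file must hold so that the data fits in
    num_output_files files (hypothetical helper, not a tool API)."""
    return ceil_div(total_blocks(total_bytes, block_size), num_output_files)


if __name__ == "__main__":
    print(total_blocks(TOTAL_BYTES))               # 12
    print(blocks_per_output_file(TOTAL_BYTES, 4))  # 3 (4 output files is an assumed target)
```

Rounding up at both steps errs on the side of fewer, slightly larger files rather than spilling into an extra output file.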

Re: Merging files

2013-07-31 Thread j.barrett Strausser
Can't you solve for the --max-file-blocks option given that you know the sizes of the input files and the desired number of output files? On Wed, Jul 31, 2013 at 12:21 PM, Something Something < mailinglist...@gmail.com> wrote: > Thanks, John. But I don't see an option to specify the # of output fil

Re: Compiling PigUnit

2013-07-18 Thread j.barrett Strausser
I'm unable to recreate this by doing the following : tar xvzf pig-0.11.1.tar.gz cd pig-0.11.1 ant ant pigunit-jar I'm running on Mint 14 % java -version java version "1.7.0_25" Java(TM) SE Runtime Environment (build 1.7.0_25-b15) Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)

Re: fuzzy logic through pig programming

2013-06-27 Thread j.barrett Strausser
This is called the Levenshtein distance. Since it is a metric, you would be responsible for deciding how far apart two strings can be and still be considered the same. I'd implement this as a UDF taking two strings, s1 and s2, and a float f with 0 < f < max(len(s1), len(s2)) On
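A minimal sketch of the distance computation and threshold check described above, written as plain Python. The function names and the `max_distance` parameter are illustrative assumptions; wiring this into Pig as a UDF (for example via Jython) is not shown and would need its own registration and output schema.

```python
def levenshtein(s1, s2):
    """Edit distance between s1 and s2 (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(s1) < len(s2):
        s1, s2 = s2, s1
    previous = list(range(len(s2) + 1))  # distances from "" to each prefix of s2
    for i, c1 in enumerate(s1, start=1):
        current = [i]  # distance from s1[:i] to ""
        for j, c2 in enumerate(s2, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            substitute_cost = previous[j - 1] + (c1 != c2)
            current.append(min(insert_cost, delete_cost, substitute_cost))
        previous = current
    return previous[-1]


def is_fuzzy_match(s1, s2, max_distance):
    """True if the strings are within max_distance edits of each other.

    max_distance plays the role of the threshold f from the message above;
    choosing it is left to the caller.
    """
    return levenshtein(s1, s2) <= max_distance


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))        # 3
    print(is_fuzzy_match("kitten", "sitting", 2))  # False
```

The two-row formulation keeps memory proportional to the shorter string, which matters if this is called once per record.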

Re: Joining 3 tables in Pig

2013-04-24 Thread j.barrett Strausser
Have you tried it? If so, what was the result? Did you try DESCRIBE Joins; ILLUSTRATE Joins; EXPLAIN Joins; DUMP Joins; If you haven't tried, that would be the first thing to do. http://pig.apache.org/docs/r0.11.1/basic.html#join-inner -b On Wed, Apr 24, 2013 at 10:39 PM, Raj hadoop wrote:

Re: Generate sequence number in Pig

2013-04-24 Thread j.barrett Strausser
You may use the Rank function. http://pig.apache.org/docs/r0.11.1/basic.html#rank -b On Wed, Apr 24, 2013 at 4:25 PM, Raj hadoop wrote: > Hi, > > Can you please help me to generate sequence number using Pig? > > > Raj > -- https://github.com/bearrito @barrettsmash

Re: Pig script from sql query

2013-04-22 Thread j.barrett Strausser
> > > On Tue, Apr 23, 2013 at 2:33 AM, j.barrett Strausser < > j.barrett.straus...@gmail.com> wrote: > > > You'll have more luck if you post the errors. > > > > Off the bat, I assume you are going to have problems given your load > > statement. >

Re: Pig cannot resolve BamUDFLoader

2013-04-22 Thread j.barrett Strausser
responses! > > -Mehmet > > > > On Apr 22, 2013, at 5:20 PM, j.barrett Strausser wrote: > > > Looks like that loader is related to and available from : > > http://seqpig.sourceforge.net/ > > > > I don't believe the BamUdfLoader is native to p

Re: Pig cannot resolve BamUDFLoader

2013-04-22 Thread j.barrett Strausser
Looks like that loader is related to and available from : http://seqpig.sourceforge.net/ I don't believe the BamUdfLoader is native to pig. -b On Mon, Apr 22, 2013 at 5:06 PM, Mehmet Belgin wrote: > Hi Everyone, > > I have absolutely no experience with Pig and limited experience with > hadoop

Re: Pig script from sql query

2013-04-22 Thread j.barrett Strausser
You'll have more luck if you post the errors. Off the bat, I assume you are going to have problems given your load statement. -b On Mon, Apr 22, 2013 at 4:59 PM, Raj hadoop wrote: > Hi friends, > > I am new to PIG script. I need to convert below sql query to PIG script. > > > SELECT ('CSS'||D

Re: Pigunit Failure 10.1 running locally

2013-04-10 Thread j.barrett Strausser
t version your pig was built against and update your > dependencies to match. > > Regards, > Marcos > > On 10-04-2013 11:05, j.barrett Strausser wrote: > > Greetings all, > > > > I am trying to run Pigunit and receiving an error. I had this previously > >

Re: Pigunit Failure 10.1 running locally

2013-04-10 Thread j.barrett Strausser
from my iPhone > > On Apr 10, 2013, at 7:06 AM, "j.barrett Strausser" > wrote: > > > Greetings all, > > > > I am trying to run Pigunit and receiving an error. I had this previously > > working, but had to rebuild my local workstation and didn't

Pigunit Failure 10.1 running locally

2013-04-10 Thread j.barrett Strausser
Greetings all, I am trying to run Pigunit and receiving an error. I had this previously working, but had to rebuild my local workstation and didn't have everything I should have had checked in. This is too deep into pig/hadoop for me to effectively debug. I'm using scala so my versions/dependenci