Re: How to report documentation errors?

2012-08-31 Thread Miyakawa Taku
I'll do that. Thank you. 2012/9/1 Russell Jurney : > Report them together with a patch that fixes them. > > Russell Jurney http://datasyndrome.com > > On Aug 31, 2012, at 10:02 PM, Miyakawa Taku wrote: > >> Hello. >> >> I found 26 errors on "Pig Latin Basics" while translating the document >> to

Re: How to report documentation errors?

2012-08-31 Thread Russell Jurney
Report them together with a patch that fixes them. Russell Jurney http://datasyndrome.com On Aug 31, 2012, at 10:02 PM, Miyakawa Taku wrote: > Hello. > > I found 26 errors on "Pig Latin Basics" while translating the document > to Japanese. > (https://github.com/miyakawataku/pig/issues) > > Shou

How to report documentation errors?

2012-08-31 Thread Miyakawa Taku
Hello. I found 26 errors on "Pig Latin Basics" while translating the document to Japanese. (https://github.com/miyakawataku/pig/issues) Should I report each of them as an individual issue? Or should I report minor errors together as one issue? Thanks.

Re: wrong sort order (lexical vs numeric) in a nested foreach

2012-08-31 Thread Dmitriy Ryaboy
I tried to reproduce this and haven't been able to -- all my devious attempts to get something that is actually a string to show up as an int in "describe" wind up in class cast exceptions and blown up jobs (not devious enough, clearly). Can you put together an example that reproduces the iss

Re: Custom DB Loader UDF

2012-08-31 Thread Russell Jurney
I've thought about that, but getting stuff into Piggybank is hard - you have to peg it to a Pig release. My plan is to get MySQL working in github, then generalize into DbStorage in piggybank for 0.11. On Fri, Aug 31, 2012 at 4:50 PM, Ruslan Al-Fakikh < ruslan.al-fak...@jalent.ru> wrote: > Terry,

Re: Custom DB Loader UDF

2012-08-31 Thread Ruslan Al-Fakikh
Terry, Russell, Just a proposal: maybe it should be added to DBStorage? http://pig.apache.org/docs/r0.8.1/api/org/apache/pig/piggybank/storage/DBStorage.html As far as I know it only stores data for now, but I think it can be extended to load and store, like PigStorage. Ruslan On Sat, Sep 1, 201

RE: group by clickstream

2012-08-31 Thread Steve Bernstein
Nope, tried that, it breaks it back into one tuple per record...not what I want. -Original Message- From: Віталій Тимчишин [mailto:tiv...@gmail.com] Sent: Friday, August 31, 2012 1:49 PM To: user@pig.apache.org Subject: Re: group by clickstream Hello. Does not FLATTEN do exactly this?

Re: Custom DB Loader UDF

2012-08-31 Thread Russell Jurney
That would be awesome - I will generalize it and blog about what a great person you are :D On Fri, Aug 31, 2012 at 3:12 PM, Terry Siu wrote: > Thanks, Russell, I'll dig in to your recommendations. I'd be happy to open > source it, but at the moment, it's not exactly general enough. However, I >

Re: Custom DB Loader UDF

2012-08-31 Thread Ruslan Al-Fakikh
Terry, I may be misleading you in some way, since I haven't implemented our loader myself, but what we have is something like @Override public void setLocation(String string, Job job) throws IOException { String path = ...//load data to hdfs and return the path

RE: Custom DB Loader UDF

2012-08-31 Thread Terry Siu
Thanks, Russell, I'll dig in to your recommendations. I'd be happy to open source it, but at the moment, it's not exactly general enough. However, I can certainly put it on github for your perusal. -Terry -Original Message- From: Russell Jurney [mailto:russell.jur...@gmail.com] Sent: F

Re: Custom DB Loader UDF

2012-08-31 Thread Russell Jurney
I don't have an answer, and I'm only learning these APIs myself, but you're writing something I'm planning on writing very soon - a MySQL-specific LoadFunc for Pig. I would greatly appreciate it if you would open source it on github or contribute it to Piggybank :) The InputSplits should determine

RE: Custom DB Loader UDF

2012-08-31 Thread Terry Siu
Hi Ruslan, Yep, I heard of Sqoop and had originally thought of using that, but wanted to give the LoaderFunc a try first. With regards to overriding the setLocation, I'm not sure I understand how you're using it to cache your DB data to HDFS. Ultimately, the location is used (per the documentat

Re: Custom DB Loader UDF

2012-08-31 Thread Ruslan Al-Fakikh
Hi Terry, I am not sure whether your architecture is correct, but here is what we do in my team: we override setLocation in LoadFunc so that it caches DB data to HDFS. Basically, the simplest way is to copy data from MySQL to HDFS with Sqoop and then read it in Pig as a normal input. Ruslan On Sat, Sep 1, 2

Custom DB Loader UDF

2012-08-31 Thread Terry Siu
Hi all, I know this question has probably been posed multiple times, but I'm having difficulty figuring out a couple of aspects of a custom LoaderFunc to read from a DB. And yes, I did try to Google my way to an answer. Anyhoo, for what it's worth, I have a MySql table that I wish to load via P

Re: wrong sort order (lexical vs numeric) in a nested foreach

2012-08-31 Thread Віталій Тимчишин
I'd try to describe the original schema as varchar and then cast during order by, e.g. order relation by (char)orderkey1; If Pig does not accept a cast in order, try adding an additional foreach with the cast. The last resort could be a UDF that does the cast. 2012/8/31 Lauren Blau > Could this be a problem with
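For readers skimming this thread, here is a minimal sketch of the lexical-versus-numeric difference under discussion. It is plain Java, not Pig, and the class and method names (SortOrderDemo, sortLexically, sortNumerically) are made up for illustration; it only shows why keys held as strings come out in a different order than the same keys compared as integers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of Pig or of anyone's script in this thread.
public class SortOrderDemo {

    // Compare the keys as text: "10" sorts before "9" because '1' < '9'.
    static List<String> sortLexically(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort((a, b) -> a.compareTo(b));
        return sorted;
    }

    // Convert each key to an int before comparing, which is the effect a cast aims for.
    static List<String> sortNumerically(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort((a, b) -> Integer.compare(Integer.parseInt(a), Integer.parseInt(b)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("10", "9", "2", "100");
        System.out.println(sortLexically(keys));   // prints [10, 100, 2, 9]
        System.out.println(sortNumerically(keys)); // prints [2, 9, 10, 100]
    }
}
```

If the loader hands back the key as a string even though the schema reports an int, the first ordering is the symptom one would expect to see; whether that is what happens in the reported case is not confirmed in this thread.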

Fwd: Question Regarding HBaseStorage Pig 0.8.1

2012-08-31 Thread Dan Therrien
Originally sent this on this thread http://www.mail-archive.com/user%40pig.apache.org/msg06085.html but can't find out how to reply to the thread. (Buttons at the bottom weren't working) I'm getting an error instantiating HBaseStorage ONLY when run on a cluster. Running in local mode with -x loc

Re: group by clickstream

2012-08-31 Thread Віталій Тимчишин
Hello. Does not FLATTEN do exactly this? Best regards, Vitalii Tymchyshyn 2012/8/30 Steve Bernstein > Some clarification on the below. Ignore the outer bag, I'd removed some > data elements for clarity and simplicity. Basically, I'm trying to find a > way to go from: > > {(pg),(pg),...,(pg)}

Re: can't get pig to run under CDH4

2012-08-31 Thread Cheolsoo Park
Sure. I will send it to you. Thanks, Cheolsoo On Fri, Aug 31, 2012 at 10:13 AM, Leach, David < david_le...@cable.comcast.com> wrote: > I have a similar problem to this one and saw the response below. > > I was wondering if you could send me the pig-withouthadoop.jar that > includes PIG-21

can't get pig to run under CDH4

2012-08-31 Thread Leach, David
I have a similar problem to this one and saw the response below. I was wondering if you could send me the pig-withouthadoop.jar that includes PIG-2115. Also, I set the zookeeper.quorum in /etc/pig/conf/pig.properties. I don't have the conf under /usr/lib. Thanks. Dave Hi Hari

Re: Loading Map's in Pig

2012-08-31 Thread Srini
Thanks Pablo and Cheolsoo .. On Thu, Aug 30, 2012 at 1:45 PM, pablomar wrote: > good finding ! > > > On Thu, Aug 30, 2012 at 1:43 PM, Cheolsoo Park >wrote: > > > Looking at PigStorage source code, this looks like what's happening. > > > > When the comma ',' is the delimiter, PigStorage splits th

Re: wrong sort order (lexical vs numeric) in a nested foreach

2012-08-31 Thread Lauren Blau
Could this be a problem with the original read of the data? It is stored in JSON format and read with a custom JSON loader. If I save the results of the loader to a file using PigStorage and then run the same script reading from that file, the sort is done numerically. I've had other pig script pro