Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-18 Thread Nerius Landys
> Right - this was my point. Dropping the 'as' clause forces you to use
> positional specifiers, which don't seem to have the same issue. Seems like
> this would warrant a JIRA, if only to document the distinction a bit better.

Yeah, but in my example I _am_ using position specifiers in the STRSP

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-18 Thread Norbert Burger
Right - this was my point. Dropping the 'as' clause forces you to use positional specifiers, which don't seem to have the same issue. Seems like this would warrant a JIRA, if only to document the distinction a bit better.

Norbert

On Fri, May 18, 2012 at 1:13 PM, Nerius Landys wrote:
> > From

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-18 Thread Nerius Landys
> From what I can tell, this does seem like a bug. Switching to positional
> specifiers seems to work around the issue:
>
> TEST = FOREACH MOVEMENT GENERATE $3;
> POSA = FOREACH TEST GENERATE STRSPLIT($0, '/');
>
> Possibly some casting is being applied in one case (positional specifiers)
> but no

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-18 Thread Norbert Burger
From what I can tell, this does seem like a bug. Switching to positional specifiers seems to work around the issue:

TEST = FOREACH MOVEMENT GENERATE $3;
POSA = FOREACH TEST GENERATE STRSPLIT($0, '/');

Possibly some casting is being applied in one case (positional specifiers) but not the other?

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Nerius Landys
> We ended up using 0.10 on EMR and it's been working fine so far...

OK, a bit of bad news. 0.10 did not fix my problem. I'll recap the entire situation. HADOOP_HOME is set to hadoop-0.20.205.0, Pig version is now pig-0.10.0. File 'bin-proto-4' is: Meta1234567890 foo 34 Movement

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Dan Young
A quick test would be to scp the 0.10 pig.jar over to your master node, and then run: hadoop -jar pig.jar . Run your script in grunt...

Dano

On May 17, 2012 5:26 PM, "Nerius Landys" wrote:
> > Have you tried 0.10?
>
> No but I can and will try it. I've been using whatever is on Amazon
> becaus

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Dan Young
We ended up using 0.10 on EMR and it's been working fine so far...

Dano

On May 17, 2012 5:26 PM, "Nerius Landys" wrote:
> > Have you tried 0.10?
>
> No but I can and will try it. I've been using whatever is on Amazon
> because that is the system that we'll be using.
> I'll report back on my find

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Nerius Landys
> Have you tried 0.10?

No, but I can and will try it. I've been using whatever is on Amazon because that is the system that we'll be using. I'll report back on my findings.

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Dan Young
Have you tried 0.10?

On May 17, 2012 5:13 PM, "Nerius Landys" wrote:
> > What version of pig are you using on EMR?
>
> hadoop@ip-10-190-83-146:~$ pig --version
> Apache Pig version 0.9.2-amzn (rexported)
> compiled Apr 06 2012, 23:48:53
>

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Nerius Landys
> What version of pig are you using on EMR?

hadoop@ip-10-190-83-146:~$ pig --version
Apache Pig version 0.9.2-amzn (rexported)
compiled Apr 06 2012, 23:48:53

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Dan Young
What version of pig are you using on EMR?

On May 17, 2012 5:02 PM, "Nerius Landys" wrote:
> > Did you try to escape the backslash?
>
> I just tried this:
>
> POSA = FOREACH TEST GENERATE STRSPLIT(startpos,'\\u002F');
>
> ... and still the same result. By the way I'm using a forward slash
> for

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Nerius Landys
> Did you try to escape the backslash?

I just tried this:

POSA = FOREACH TEST GENERATE STRSPLIT(startpos,'\\u002F');

... and still the same result. By the way, I'm using a forward slash for the separator character. I also tried this:

POSA = FOREACH TEST GENERATE STRSPLIT(startpos,'/',-1);
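One way to rule out the separator itself is to try the same two patterns in plain Java. This is only a sketch: it assumes (nothing in this thread confirms it) that STRSPLIT passes its arguments to java.lang.String.split(regex, limit). The class name, the helper name, and the sample value "10/1" are made up for illustration.

```java
import java.util.Arrays;

public class SplitSketch {
    // Hypothetical stand-in for what STRSPLIT is assumed to do internally:
    // split the source string around matches of a regular expression.
    static String[] splitLikeStrsplit(String source, String regex, int limit) {
        return source.split(regex, limit);
    }

    public static void main(String[] args) {
        // Plain forward slash as the separator, limit -1 (keep trailing empties).
        System.out.println(Arrays.toString(splitLikeStrsplit("10/1", "/", -1)));

        // The regex \u002F (backslash, 'u', 0, 0, 2, F) also matches '/'.
        // In Java source the backslash is doubled so the regex engine sees it.
        System.out.println(Arrays.toString(splitLikeStrsplit("10/1", "\\u002F", -1)));
    }
}
```

If both calls produce the two fields "10" and "1", the separator and the limit are probably not the culprit, and the type of the field being passed in (for example a field that was never declared as chararray) is a more likely place to look.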

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Dan Young
Did you try to escape the backslash?

Dano

On Thu, May 17, 2012 at 11:57 AM, Nerius Landys wrote:
> I'm having problems using Pig's STRSPLIT (on Amazon's cloud computing
> environment).
> I also noticed that STRSPLIT isn't documented in the Pig Latin
> Reference Manual, so I found out about it

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Ranjith
This is pretty interesting. Shot in the dark, but can you try STRSPLIT with -1 and one of the input values, for example, STRSPLIT(abc,'/',-1)?

Thanks,
Ranjith

On May 17, 2012, at 4:36 PM, Nerius Landys wrote:
>> I did the same but with one change, that is, I changed the file column
>> deli

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Nerius Landys
> I did the same but with one change, that is, I changed the file column
> delimiter to ',' and it worked.

I've tried both '/' and ',' as delimiters for the STRSPLIT function and both fail in my example.

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread krishnan N
Hi,

I did the same but with one change, that is, I changed the file column delimiter to ',' and it worked.

((34))
((1,1))
((10,1))

Please try the same.

Thanks
Krishnan

On Thu, May 17, 2012 at 10:57 AM, Nerius Landys wrote:
> I'm having problems using Pig's STRSPLIT (on Amazon's cloud comput

STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Nerius Landys
I'm having problems using Pig's STRSPLIT (on Amazon's cloud computing environment). I also noticed that STRSPLIT isn't documented in the Pig Latin Reference Manual, so I found out about it through other sources of information. My problem is that in certain cases STRSPLIT returns null. I have no i