I also just found out that the bag produced by the nested ORDER BY is an
org.apache.pig.data.InternalCachedBag and not an org.apache.pig.data.SortedDataBag.
Is that how it should be?
On 28 Feb 2014, at 1:51 AM, Anastasis Andronidis
wrote:
> Hi again,
>
> I added this in my UDF:
>
> if(!((DataBag) input.
Hi again,
I added this in my UDF:
if(!((DataBag) input.get(0)).isSorted()) {
throw new IOException("It's not sorted");
}
And the exception is thrown. Why? I don't understand it. I specified ORDER BY in
the nested FOREACH.
Thank you for helping me btw!
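
A note on the check above: as far as I understand, `isSorted()` reports what kind of bag the container is, not whether its tuples happen to be in order, so a bag filled from a nested ORDER BY could hold ordered tuples and still answer false. A minimal sketch of checking the order by iterating instead, using plain Java collections as a stand-in for the Pig bag (the `OrderCheck` class and `isOrdered` method are hypothetical names, not part of Pig):

```java
import java.util.Iterator;

public class OrderCheck {
    // Returns true when no element is smaller than the one before it.
    // Inspects the contents only; the container type plays no role.
    public static <T extends Comparable<? super T>> boolean isOrdered(Iterable<T> items) {
        Iterator<T> it = items.iterator();
        if (!it.hasNext()) {
            return true; // an empty sequence is trivially ordered
        }
        T previous = it.next();
        while (it.hasNext()) {
            T current = it.next();
            if (previous.compareTo(current) > 0) {
                return false; // found a pair that is out of order
            }
            previous = current;
        }
        return true;
    }
}
```

Inside a UDF the same loop could run over the bag's iterator and compare the sort field of consecutive tuples; how to extract that field depends on the schema, so it is left out here.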
On 28 Feb 2014, at 1:12 π
No... that wouldn't be related since you're not doing a GROUP ALL.
The `FLATTEN(MY_UDF(t))` has me a little wary. Something is possibly going
wrong in your UDF. The output of your UDF is going to be a string that is
some generic status, right? My uneducated guess is that there's a bug in
your UDF.
BTW, is this somehow related [1]?
[1]:
http://mail-archives.apache.org/mod_mbox/pig-user/201102.mbox/%3c5528d537-d05c-47d9-8bc8-cc68e236a...@yahoo-inc.com%3E
On 27 Feb 2014, at 11:20 PM, Anastasis Andronidis
wrote:
> Yes, of course, my output is like that:
>
> (20131209,AEGIS04-KG,ch.cer
Yes, of course, my output is like that:
(20131209,AEGIS04-KG,ch.cern.sam.ROC_CRITICAL,0.0,CREAM-CE)
(20131209,AEGIS04-KG,ch.cern.sam.ROC_CRITICAL,0.0,CREAM-CE)
(20131209,AEGIS04-KG,ch.cern.sam.ROC_CRITICAL,0.0,SRMv2)
(20131209,AEGIS04-KG,ch.cern.sam.ROC_CRITICAL,0.0,SRMv2)
(20131209,AM-02-SEUA,ch.
Where exactly are you getting duplicates? I'm not sure I understand your
question. Can you give an example please?
On Thu, Feb 27, 2014 at 11:15 AM, Anastasis Andronidis <
andronat_...@hotmail.com> wrote:
> Hello everyone,
>
> I have a foreach statement and inside of it, I use an order by. After
Hello everyone,
I have a FOREACH statement and inside of it, I use an ORDER BY. After the ORDER
BY, I have a UDF. For example:
logs = LOAD 'raw_data' USING org.apache.hcatalog.pig.HCatLoader();
logs_g = GROUP logs BY (date, site, profile) PARALLEL 2;
service_flavors = FOREACH logs_g {
I forgot to mention that I am using Cassandra 2.04, Hadoop 1.2.1 and Pig 0.12.
Thanks
2014-02-27 17:29 GMT+01:00 Miguel Angel Martin junquera <
mianmarjun.mailingl...@gmail.com>:
> Hi all,
>
> I am trying to do a cogroup with five relations that I load from Cassandra
> previously.
>
> In single node and local cas
Hi all,
I am trying to do a cogroup with five relations that I load from Cassandra
previously.
In a single-node, local Cassandra testing environment the script works fine,
but when I try to execute it in a cluster over AWS instances with only one
slave in the Hadoop cluster and one seed Cassandra node I ha
I am just curious why you would want to do that. You can use the
nulls from Pig in the same way as empty values. In the end, if you output
them back, you won't get null values, you will get empty values only. You
can test this by doing a DUMP/STORE on the EMPTY relation.
Regards
Prav
On
Hi All,
I am trying to create an alias in Pig which should read records from a CSV
file that contains some empty fields. But Pig is treating those empty
values (separated by commas) as NULL values. I used the same comma-separated
empty values to load data into Hive tables, where it loads them as
Yeah sure.
The 'resultData' file contains:
lakers nba.com 1
lakers espn.com 2
kings nhl.com 1
kings nba.com 2
4 rows, 3 columns separated by '\t'.
The commands are:
p = load 'resultData' using TextLoader AS (line:chararray);
q = foreach p generate flatten (REGEX_EXTRACT (line, '(.com).*',1 ) );
r = fore
Can you give an example of your input and your code?
On Thu, Feb 27, 2014 at 5:47 PM, ROHIT LADDHA wrote:
> Hi,
>
> How does REGEX_EXTRACT_ALL work? When I use REGEX_EXTRACT, it gives the
> expected result, but REGEX_EXTRACT_ALL gives an empty result most of the time,
> which not expecte
Hi,
How does REGEX_EXTRACT_ALL work? When I use REGEX_EXTRACT, it gives the
expected result, but REGEX_EXTRACT_ALL gives an empty result most of the time,
which is not expected.
Regards
Rohit
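
One possible explanation, offered as an assumption rather than a confirmed diagnosis: REGEX_EXTRACT appears to search for the pattern anywhere in the input, while REGEX_EXTRACT_ALL appears to require the pattern to match the entire input. The plain java.util.regex sketch below shows that difference on one row of the sample data with the pattern from this thread; the `RegexDemo` class and its method names are hypothetical and not part of Pig.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDemo {
    // Group 1 of the first place the pattern occurs anywhere in the input,
    // or null when the pattern does not occur at all (Matcher.find semantics).
    public static String findGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Group 1 only when the pattern covers the whole input,
    // otherwise null (Matcher.matches semantics).
    public static String fullMatchGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }
}
```

With the row `"lakers\tnba.com\t1"`, `findGroup("(.com).*", row)` yields `.com`, whereas `fullMatchGroup("(.com).*", row)` yields null because the row does not begin with the pattern. Widening the pattern to `.*(.com).*` lets the full match succeed. Note that the unescaped `.` in `(.com)` matches any character, so `\\.com` would be stricter.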