date:20140227

Re: Nested foreach with order by

2014-02-27 Thread Anastasis Andronidis

I also just found out that the bag from the nested ORDER BY is an org.apache.pig.data.InternalCachedBag and not an org.apache.pig.data.SortedDataBag. Should it be like that? On 28 Feb 2014, at 1:51 AM, Anastasis Andronidis wrote: > Hi again, > > I added this in my UDF: > > if(!((DataBag) input.

Re: Nested foreach with order by

2014-02-27 Thread Anastasis Andronidis

Hi again, I added this in my UDF: if(!((DataBag) input.get(0)).isSorted()) { throw new IOException("It's not sorted"); } And the exception is thrown. Why? I don't understand it. I specified ORDER BY in the nested foreach. Thank you for helping me, btw! On 28 Feb 2014, at 1:12 π

Re: Nested foreach with order by

2014-02-27 Thread Pradeep Gollakota

No... that wouldn't be related since you're not doing a GROUP ALL. The `FLATTEN(MY_UDF(t))` has me a little wary. Something is possibly going wrong in your UDF. The output of your UDF is going to be a string that is some generic status, right? My uneducated guess is that there's a bug in your UDF.

Re: Nested foreach with order by

2014-02-27 Thread Anastasis Andronidis

BTW, is this somehow related [1]? [1]: http://mail-archives.apache.org/mod_mbox/pig-user/201102.mbox/%3c5528d537-d05c-47d9-8bc8-cc68e236a...@yahoo-inc.com%3E On 27 Feb 2014, at 11:20 PM, Anastasis Andronidis wrote: > Yes, of course, my output is like that: > > (20131209,AEGIS04-KG,ch.cer

Re: Nested foreach with order by

2014-02-27 Thread Anastasis Andronidis

Yes, of course, my output is like that: (20131209,AEGIS04-KG,ch.cern.sam.ROC_CRITICAL,0.0,CREAM-CE) (20131209,AEGIS04-KG,ch.cern.sam.ROC_CRITICAL,0.0,CREAM-CE) (20131209,AEGIS04-KG,ch.cern.sam.ROC_CRITICAL,0.0,SRMv2) (20131209,AEGIS04-KG,ch.cern.sam.ROC_CRITICAL,0.0,SRMv2) (20131209,AM-02-SEUA,ch.

Re: Nested foreach with order by

2014-02-27 Thread Pradeep Gollakota

Where exactly are you getting duplicates? I'm not sure I understand your question. Can you give an example please? On Thu, Feb 27, 2014 at 11:15 AM, Anastasis Andronidis < andronat_...@hotmail.com> wrote: > Hello everyone, > > I have a foreach statement and inside of it, I use an order by. After

Nested foreach with order by

2014-02-27 Thread Anastasis Andronidis

Hello everyone, I have a foreach statement and inside of it, I use an order by. After the order by, I have a UDF. Example like this: logs = LOAD 'raw_data' USING org.apache.hcatalog.pig.HCatLoader(); logs_g = GROUP logs BY (date, site, profile) PARALLEL 2; service_flavors = FOREACH logs_g {

Re: Issue with cassandra-pig-thrift.transport and java.net.SocketException: Connection reset

2014-02-27 Thread Miguel Angel Martin junquera

I forgot to mention that I am using Cassandra 2.04, Hadoop 1.2.1 and Pig 0.12. Thanks 2014-02-27 17:29 GMT+01:00 Miguel Angel Martin junquera < mianmarjun.mailingl...@gmail.com>: > Hi all, > > I am trying to do a cogroup with five relations that I load from Cassandra > previously. > > In single node and local cas

Issue with cassandra-pig-thrift.transport and java.net.SocketException: Connection reset

2014-02-27 Thread Miguel Angel Martin junquera

Hi all, I am trying to do a cogroup with five relations that I load from Cassandra previously. In a single-node, local Cassandra testing environment the script works fine, but when I try to execute it in a cluster over AWS instances with only one slave in the Hadoop cluster and one seed Cassandra node I ha

Re: How should I insert EMPTY values into pig

2014-02-27 Thread praveenesh kumar

I am just curious why you would want to do that, because you can use the nulls from Pig in the same way as empty values. In the end, if you output them back, you won't get null values, you will get empty values only. You can test this by doing a DUMP/STORE on the EMPTY relation. Regards Prav On

How should I insert EMPTY values into pig

2014-02-27 Thread Arpit Maheshwari

Hi All, I am trying to create an alias in Pig which should read records from a CSV file that contains some empty records. But Pig is treating those empty values (separated by commas) as NULL values. I used the same comma-separated empty values to load data into Hive tables, where it loads them as

Re: pig REGEX_EXTRACT_ALL

2014-02-27 Thread ROHIT LADDHA

Yeah, sure. The 'resultData' file contains: lakers nba.com 1 lakers espn.com 2 kings nhl.com 1 kings nba.com 2 (4 rows, 3 columns separated by '\t'). The commands are: p = load 'resultData' using TextLoader AS (line:chararray); q = foreach p generate flatten (REGEX_EXTRACT (line, '(.com).*',1 ) ); r = fore

Re: pig REGEX_EXTRACT_ALL

2014-02-27 Thread Nitin Pawar

Can you give an example of what the input is and what your code is? On Thu, Feb 27, 2014 at 5:47 PM, ROHIT LADDHA wrote: > Hi, > > How does REGEX_EXTRACT_ALL work? When I use REGEX_EXTRACT, it gives the > expected result, but REGEX_EXTRACT_ALL gives an empty result most of the time, > which is not expecte

pig REGEX_EXTRACT_ALL

2014-02-27 Thread ROHIT LADDHA

Hi, how does REGEX_EXTRACT_ALL work? When I use REGEX_EXTRACT, it gives the expected result, but REGEX_EXTRACT_ALL gives an empty result most of the time, which is not expected. Regards Rohit
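
A possible explanation for the behaviour asked about above, offered as an assumption rather than something confirmed in this thread: Pig's regex functions are built on java.util.regex, and REGEX_EXTRACT_ALL appears to require the pattern to match the entire input string, while REGEX_EXTRACT accepts a match anywhere in it. The plain-Java sketch below has no Pig dependency; the class and method names are hypothetical, and the sample row and pattern are taken from the follow-up message in this thread. It only illustrates the difference between Matcher.find() and Matcher.matches().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatchDemo {

    // Partial match: succeeds if the pattern occurs anywhere in the line
    // (assumed to mirror what REGEX_EXTRACT does).
    public static String extractWithFind(String line, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(line);
        return m.find() ? m.group(group) : null;
    }

    // Full match: succeeds only if the pattern covers the entire line
    // (assumed to mirror what REGEX_EXTRACT_ALL does).
    public static List<String> extractAllWithMatches(String line, String regex) {
        Matcher m = Pattern.compile(regex).matcher(line);
        if (!m.matches()) {
            return null;
        }
        List<String> groups = new ArrayList<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            groups.add(m.group(i));
        }
        return groups;
    }

    public static void main(String[] args) {
        String line = "lakers\tnba.com\t1";

        // find(): the pattern is located in the middle of the line.
        System.out.println(extractWithFind(line, "(.com).*", 1));      // .com

        // matches(): the same pattern does not cover the whole line.
        System.out.println(extractAllWithMatches(line, "(.com).*"));   // null

        // matches(): a pattern describing the whole line succeeds.
        System.out.println(
            extractAllWithMatches(line, "(\\S+)\\t(\\S+)\\t(\\d+)"));  // [lakers, nba.com, 1]
    }
}
```

If this assumption holds, a pattern passed to REGEX_EXTRACT_ALL would need to describe the whole line, for example by adding a leading `.*` or by capturing every column.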


14 matches
