Using variables generated by FOREACH command

2013-11-14 Thread Mix Nin
Hi, I have GROUP and FOREACH statements as below: grouped = GROUP filterdata BY (page_name,web_session_id); x = foreach grouped { distinct_web_cookie_id= DISTINCT filterdata.web_cookie_id; distinct_encrypted_customer_id= DISTINCT filterdata.encrypted_customer_id; distinct_web_session_id= DISTINC

header of a tuple/bag

2013-07-16 Thread Mix Nin
Hi, I am trying to query a data set on HDFS using PIG. Data = LOAD '/user/xx/20130523/*; x = FOREACH Data GENERATE cookie_id; I get the below error. Invalid field projection. Projected field [cookie_id] does not exist How do I find the column names in the bag "Data"? The developer who created the

Fwd: Number format exception : For input string

2013-07-02 Thread Mix Nin
I wrote a script as below. Data = LOAD 'part-r-0' AS (session_start_gmt:long) FilterData = FILTER Data BY session_start_gmt=1369546091667 I get below error 2013-07-01 22:48:06,510 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: For input string: "1369546091667" In detail log it

Re: Passing multiple parameters to a PIG script

2013-06-20 Thread Mix Nin
... If I send only $inputDate, it's working fine. Am I missing anything? Thanks On Thu, Jun 20, 2013 at 12:02 PM, Mix Nin wrote: > I want to pass multiple parameters to a pig script from a shell script. > > I tried below > > From Shell Script > > pig -f $ROOT_DIR/pig

Passing multiple parameters to a PIG script

2013-06-20 Thread Mix Nin
I want to pass multiple parameters to a pig script from a shell script. I tried below From Shell Script pig -f $ROOT_DIR/pig0.pig -param inputDatePig=$inputDate -param StartDate =$SDate -param EndDate=$EDate PIG script is as follows eventData = LOAD 'eap-prod://event' USING com.engine.dat
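One thing worth checking in the invocation quoted above: as shown, there is a space before `=` in `-param StartDate =$SDate`. If that space is in the real script and not an artifact of the archive, the shell splits it into two separate arguments, so Pig no longer receives `StartDate=<value>` as a single token. A minimal sketch with each `name=value` pair quoted as one word follows. The default values and the `build_pig_cmd` helper are hypothetical, and the command is printed rather than executed so the sketch runs without Pig installed.

```shell
#!/bin/sh
# Hypothetical default values; the thread does not show the real ones.
ROOT_DIR=${ROOT_DIR:-.}
inputDate=${inputDate:-20130620}
SDate=${SDate:-20130601}
EDate=${EDate:-20130619}

# Hypothetical helper: prints the pig command line.
# Each -param is followed by exactly one quoted name=value word,
# with no space on either side of "=".
build_pig_cmd() {
    echo pig -f "$ROOT_DIR/pig0.pig" \
        -param "inputDatePig=$inputDate" \
        -param "StartDate=$SDate" \
        -param "EndDate=$EDate"
}

build_pig_cmd
```

Inside the Pig script the values would then be referenced as `$inputDatePig`, `$StartDate`, and `$EndDate`.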

Re: Reading multiple files of a directory using a Single LOAD Command in PIG

2013-06-11 Thread Mix Nin
PigStorage) already handles that. > > Also, can you double check your path is not "/Output/part-m* as opposed to > backward slashes? > > > On Tue, Jun 11, 2013 at 2:26 PM, Mix Nin wrote: > > > I have a directory "Output2. It has file names as below > >

Reading multiple files of a directory using a Single LOAD Command in PIG

2013-06-11 Thread Mix Nin
I have a directory "Output2". It has file names as below - _SUCCESS part-m-0 part-m-1 part-m-2 part-m-3 . . . . part-m-00100 - The above files are produced by the PIG STORE command. I want to read the files starting with "part-m-" using PIG comm

Single Output file from STORE command

2013-05-24 Thread Mix Nin
PIG STORE command produces multiple output files. I want a single output file and I tried using command as below STORE (foreach (group NoNullData all) generate flatten($1)) into ''; This command produces one single file but at the same time forces to use single reducer which kills performanc

Re: Number of records in an HDFS file

2013-05-13 Thread Mix Nin
as been generated. >> >> Regards, >> Shahab >> >> >> On Mon, May 13, 2013 at 2:16 PM, Mix Nin wrote: >> >>> Ok, let re modify my requirement. I should have specified in the >>> beginning itself. >>> >>> I need to get coun

Re: Number of records in an HDFS file

2013-05-13 Thread Mix Nin
On Mon, May 13, 2013 at 11:37 PM, Mix Nin wrote: > >> It is a text file. >> >> If we want to use wc, we need to copy file from HDFS and then use wc, and >> this may take time. Is there a way without copying file from HDFS to local >> directory? >>

Re: Number of records in an HDFS file

2013-05-13 Thread Mix Nin
pointers. > > what kind of files are we talking about. for text you can use wc , for > avro data files you can use avro-tools. > > or get the job that pig is generating , get the counters for that job from > the jt of your hadoop cluster. > > Thanks, > Rahul >

Number of records in an HDFS file

2013-05-13 Thread Mix Nin
Hello, What is the best way to get the count of records in an HDFS file generated by a PIG script? Thanks
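For a plain text file, one option that avoids copying the data to a local directory is to stream it out of HDFS and count lines with `wc -l`. A minimal sketch, assuming one record per line: the `count_records` helper and the `HDFS_CAT` override are hypothetical names, and the data still passes through the client machine, so this may be slow for very large files.

```shell
#!/bin/sh
# Command used to stream a file to stdout. Defaults to the Hadoop CLI.
# The HDFS_CAT override is a hypothetical hook so the sketch can be tried
# on a local file with plain `cat`.
HDFS_CAT=${HDFS_CAT:-hadoop fs -cat}

# Hypothetical helper: prints the number of lines in the given file(s).
# Assumes text data with one record per line.
count_records() {
    # $HDFS_CAT is intentionally unquoted so "hadoop fs -cat" splits into words.
    $HDFS_CAT "$@" | wc -l | tr -d ' '
}
```

Usage would look like `count_records /path/to/output/part-file`, where the path is a placeholder for the file written by the Pig job.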

Re: Commands not working properly when stored in pig file

2013-03-27 Thread Mix Nin
runs fine On Wed, Mar 27, 2013 at 2:40 PM, Mix Nin wrote: > I wrote a pig script as follows and stored it in x.pig file > > Data = LOAD '/' as ( ) > NoNullData= FILTER Data by qe is not null; > STORE (foreach (group NoNullData all) generate flatten($1))

Commands not working properly when stored in pig file

2013-03-27 Thread Mix Nin
I wrote a pig script as follows and stored it in x.pig file Data = LOAD '/' as ( ) NoNullData= FILTER Data by qe is not null; STORE (foreach (group NoNullData all) generate flatten($1)) into 'exp/$inputDatePig'; evnt_dtl =LOAD 'exp/$inputDatePig/part-r-0' AS (cust,) I execut

[no subject]

2013-03-27 Thread Mix Nin
I wrote a pig script as follows and stored it in x.pig file Data = LOAD '/' as ( ) NoNullData= FILTER Data by qe is not null; STORE (foreach (group NoNullData all) generate flatten($1)) into 'exp/$inputDatePig'; evnt_dtl =LOAD 'exp/$inputDatePig/part-r-0' AS (cust,) I execut

Transpose

2013-03-05 Thread Mix Nin
Hi, I have data in a file as follows. There are 3 columns separated by semicolons (;). Each column has multiple values separated by commas (,). 11,22,33;144,244,344;yny; I need the output data in the below format. It is like transposing the values of each column. 11 144 y 22 244 n 33 344 y Can we wri