Re: Pig script to load C++ library

2015-12-13 Thread inelu nagamallikarjuna
Hi Shashikant.

Pig supports streaming through the stream operator. This lets you invoke any
external executable (e.g. Perl, Python, C++, or PHP programs) from inside
your Pig scripts.
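
For example, an external script can be shipped to the cluster and tuples
piped through it like this (the script name and paths are illustrative, not
from this thread):

```pig
-- Ship a local executable to the task nodes and stream tuples through it.
-- 'myscript.py' and its path are placeholder names.
DEFINE my_cmd `myscript.py` SHIP('/local/path/myscript.py');
raw = LOAD 'input' AS (line:chararray);
processed = STREAM raw THROUGH my_cmd AS (result:chararray);
```

The executable reads tuples on stdin and writes tuples to stdout, one per
line by default.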

Thanks
Naga

On Mon, Dec 14, 2015 at 12:18 PM, Shashikant K <
shashikant.kulkarn...@gmail.com> wrote:

> Hi All,
>
> Here is what I am trying to do.
>
>- I have a C++ library which I load from a PHP script using PHP's
>exec() function. It works perfectly. PHP sends the input parameter and
>gets the result in the output parameter.
>- I want to make use of the same C++ library in my Pig script and
>perform the same operation. Is there any way to do this?
>- Can you point me to some documentation? I tried to find it myself.
>
>
> Thanks in advance.
>
> Regards,
> Shashikant
>



-- 
Thanks and Regards
Nagamallikarjuna


Re: create a pipeline

2015-04-15 Thread inelu nagamallikarjuna
Hi,

Use the workflow manager Oozie to create the pipeline as a workflow (a DAG of
jobs, i.e. your Pig scripts).
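
As a sketch, a minimal workflow.xml chaining two Pig scripts could look like
this (names, script paths, and property placeholders are illustrative):

```xml
<workflow-app name="pig-pipeline" xmlns="uri:oozie:workflow:0.4">
    <start to="step1"/>
    <action name="step1">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>step1.pig</script>
        </pig>
        <ok to="step2"/>
        <error to="fail"/>
    </action>
    <action name="step2">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>step2.pig</script>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Pipeline failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Oozie runs step2 only after step1 succeeds, which gives the sequential
pipeline asked about below.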


Thanks
Nagamallikarjuna

On Wed, Apr 15, 2015 at 1:46 PM, pth001 patcharee.thong...@uni.no wrote:

 Hi,

 How can I create a pipeline (containing a sequence of pig scripts)?

 BR,
 Patcharee




-- 
Thanks and Regards
Nagamallikarjuna


Re: ClassNotFoundException while running pig in local mode

2014-12-26 Thread inelu nagamallikarjuna
Hi,

Add all the required jars to the PIG_CLASSPATH variable; that will resolve
the issue.
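
For instance (the jar paths below are illustrative; point them at the jars
that actually contain the missing classes):

```shell
# Add the jars that provide the missing classes to Pig's classpath.
# Jar names/paths below are placeholders, not a known-good list.
export PIG_CLASSPATH="${PIG_CLASSPATH}:/opt/lib/commons-logging-1.1.3.jar:/opt/lib/hadoop-common-2.2.0.jar"
echo "$PIG_CLASSPATH"
# afterwards, launch as usual: pig -x local myscript.pig
```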

Thanks
Naga

On Fri, Dec 26, 2014 at 3:06 PM, Venkat Ramakrishnan 
venkat.archit...@gmail.com wrote:

 Thanks Praveen. I am running pig-14 on Windows 7.

 Can anyone confirm if Hadoop is really required for Pig local?
 If not, should I file an enhancement request?

 Thx,
 Venkat.


 On Fri, Dec 26, 2014 at 2:51 PM, Praveen R prav...@sigmoidanalytics.com
 wrote:

  I usually have hadoop configured on the system even when using pig in
 local
  mode and don't remember running pig without hadoop.
 
  It could be working on versions pig-13 or prior since it used to ship all
  hadoop jars along with the release, but with pig-14 hadoop jars are no
  longer shipped (believe this is to have a lighter packaging).
 
  Regards,
  Praveen
 
  On Fri, Dec 26, 2014 at 2:36 PM, Venkat Ramakrishnan 
  venkat.archit...@gmail.com wrote:
 
   Thanks Praveen. Is Hadoop required for running pig local ?
   I read in a couple of places on the web saying that hadoop
   is not required for local mode...
  
   - Venkat.
  
   On Fri, Dec 26, 2014 at 1:44 PM, Praveen R 
 prav...@sigmoidanalytics.com
  
   wrote:
  
Looks like pig isn't able to find the hadoop jars. Could you try
  putting
hadoop on the system i.e. have hadoop command in the environment
 path.
   
Regards,
Praveen
   
On Thu, Dec 25, 2014 at 4:17 PM, Venkat Ramakrishnan 
venkat.archit...@gmail.com wrote:
   
 Hi all,

 I am getting the following error while running pig in local mode
 (pig
   -X
 local) :

 The system cannot find the path specified.
 java.lang.NoClassDefFoundError:
 org/apache/commons/logging/LogFactory
 at org.apache.pig.Main.clinit(Main.java:106)
 Caused by: java.lang.ClassNotFoundException:
 org.apache.commons.logging.LogFactory
 at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
 ... 1 more
 Exception in thread Thread-0 java.lang.NoClassDefFoundError:
 org/apache/hadoop/fs/LocalFileSystem
 at org.apache.pig.Main$1.run(Main.java:101)
 Caused by: java.lang.ClassNotFoundException:
 org.apache.hadoop.fs.LocalFileSystem
 at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
 ... 1 more
 Exception in thread main


 Can someone tell me how to resolve this?

 Thanks,
 Venkat.

   
  
 




-- 
Thanks and Regards
Nagamallikarjuna


Re: ToDate and GetMonth function help

2014-08-18 Thread inelu nagamallikarjuna
Hi,

Write a UDF in Java to extract the month (or any other field) from your input
string.
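
Alternatively, on Pig 0.11 or later the builtins alone can work if ToDate is
given an explicit format string; a sketch assuming pipe-separated input and
yyyy-MM-dd date strings (both assumptions, since the original format was not
shown in full):

```pig
raw = LOAD '/tmp.psv' USING PigStorage('|') AS (open_dte:chararray);
-- an explicit format string lets ToDate parse the chararray directly,
-- avoiding the implicit-cast error
months = FOREACH raw GENERATE GetMonth(ToDate(open_dte, 'yyyy-MM-dd')) AS open_month;
```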

Thanks
Naga
On Aug 18, 2014 8:49 PM, murali krishna p muralikrishna.par...@icloud.com
wrote:



 Trying to read a table column defined as datetime in my pig script as
 follows

 load ‘/tmp.psv’ using PIgStore() (open_dte : chararray);


 Later I wanted to use GetMonth in pig script as followes.

 Temp_dt = ToDate(open_dte, ‘-MM-DD’);
 Month = GetMonth(temp_dt);


 I am getting an error asking to use a explicit cast. Any insights in this
 issue?

 Greatly appreciate your help!!


 Thanks,
 Murali




Re: Query on Pig

2014-07-10 Thread inelu nagamallikarjuna
Hi,

We can call an external MapReduce program inside a Pig script to perform a
specific task. Let's take the example of a crawling process.

-- Load all the seed urls into the relation crawldata.

crawldata = load 'baseurls' using PigStorage() as (pageid: chararray,
pageurl: chararray);
normalizedata = foreach crawldata generate pageid, normalize(pageurl);

-- The url list above contains both good urls and bad/blocklisted urls, and
the blocklisted urls need to be filtered out. We have a Java MapReduce
program, blocklisturls.jar, for this, so instead of writing Pig Latin
statements to filter the blocklisted urls, we call that program as below.

goodurls = mapreduce 'blocklisturls.jar'
           store normalizedata into '/path/input'
           load '/path/output' as (pageid: chararray, pageurl: chararray);

The above Pig Latin statement executes a sequence of steps:
1. store writes normalizedata into HDFS under the path '/path/input'.
2. The blocklisturls MapReduce program is run on the input '/path/input',
filters out the blocklisted urls, and writes its output into HDFS under the
path '/path/output'.
3. load reads the data from HDFS ('/path/output') into the goodurls relation.

Thanks
Nagamallikarjuna




On Thu, Jul 10, 2014 at 4:42 PM, Nivetha K nivethak3...@gmail.com wrote:

 Hi,

Thanks for replying.Can you please explain how mapreduce operator works
 in pig
 On 5 July 2014 10:35, Darpan R darpa...@gmail.com wrote:

  Looks like Classpath problem :java.lang.RuntimeException:
  java.lang.ClassNotFoundException:
  Class
  WordCount$Map not found
 
  Can you make sure your jar is in the class path ?
 
 
  On 4 July 2014 11:19, Nivetha K nivethak3...@gmail.com wrote:
 
   Hi,
  
I am currently working with Pig. I get struck with following
 script.
   A = load 'sample.txt';
   B = MAPREDUCE '/home/training/simp.jar' Store A into 'inputDir' Load
   'outputDir' as (word:chararray, count: int) `WordCount inputDir
  outputDir`;
   dump B;
  
  
   Error :
  
   2014-07-04 11:17:57,811 [main] WARN
  org.apache.hadoop.mapred.JobClient -
   No job jar file set.  User classes may not be found. See JobConf(Class)
  or
   JobConf#setJar(String).
   2014-07-04 11:18:16,313 [main] INFO
  org.apache.hadoop.mapred.JobClient -
   Task Id : attempt_201407011531_0147_m_00_2, Status : FAILED
   java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
   WordCount$Map not found
   at
   org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1774)
   at
  
  
 
 org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:191)
   at
  org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:631)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at
  
  
 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
   at org.apache.hadoop.mapred.Child.main(Child.java:262)
   Caused by: java.lang.ClassNotFoundException: Class WordCount$Map not
  found
   at
  
  
 
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1680)
   at
   org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1772)
  
  
  
   please help me to solve the problem
  
  
   regards,
  
   Nivetha.
  
 




-- 
Thanks and Regards
Nagamallikarjuna


Re: Adding days to Pig

2013-12-14 Thread inelu nagamallikarjuna
Hi

Write a UDF, it takes date and no of days to add and returns the date
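
For reference, Pig 0.11+ ships an AddDuration builtin that covers this
without a custom UDF (the thread below is on 0.10, where a UDF is still
needed); a sketch assuming yyyy-MM-dd input:

```pig
dates = LOAD 'dates.txt' AS (get_date:chararray);
-- 'P46D' is an ISO 8601 duration meaning 46 days
shifted = FOREACH dates GENERATE AddDuration(ToDate(get_date, 'yyyy-MM-dd'), 'P46D') AS plus46;
```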

Thanks
Naga
On Dec 14, 2013 6:19 AM, Krishnan Narayanan krishnan.sm...@gmail.com
wrote:

 Hi All ,

 I am trying to do something like (get_date +46 days) , how to achieve this
 in pig.

 I am using pig 0.10
 help much appreciated.

 Thanks
 Krishnan



Re: Simple word count in pig..

2013-11-20 Thread inelu nagamallikarjuna
Hi,

 Please go through the following code,

Input Data:
---
DocName	Tokens
--
cricket	sachin,sehwag,dravid,dhoni
movie	amir,salman,hruthik,ranveer
cricket	sachin,ganguly,rohit,dhoni
cricket	sehwag,sachin,dravid,kohli
movie	salman,amir,sharukh

===
Pig UDF

package com.pig.udf;

import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;

public class WordBag extends EvalFunc<String> {

    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0) {
            return null;
        }
        DataBag myBag = (DataBag) input.get(0);
        String frequency = "";
        Iterator<Tuple> itr = myBag.iterator();
        Tuple tuple = null;
        Map<String, Integer> wordcount = new HashMap<String, Integer>();
        while (itr.hasNext()) {
            tuple = itr.next();
            DataBag tokens = (DataBag) tuple.get(0);
            Iterator<Tuple> it = tokens.iterator();
            while (it.hasNext()) {
                tuple = it.next();
                String token = (String) tuple.get(0);
                if (wordcount.containsKey(token)) {
                    int count = wordcount.get(token);
                    count++;
                    wordcount.put(token, count);
                } else {
                    wordcount.put(token, 1);
                }
            }
        }
        Set<String> keys = wordcount.keySet();
        for (String key : keys) {
            frequency = frequency + " " + key + ":" + wordcount.get(key);
        }
        return frequency;
    }
}

Build a jar for the above UDF and add it to pig script;


PigScript:
--
register /home/hadoopz/naga/bigdata/pig-0.10.0/pigscripts/wordbag.jar
news = load '/pig/news' using PigStorage() as (doc:chararray,
content:chararray);
words = foreach news generate doc, TOKENIZE(content, ',') as mywords;
describe words;
grpwords = group words by doc;
wordcount = foreach grpwords generate group,
com.pig.udf.WordBag(words.mywords);
dump wordcount;

==
Output
--
DocName	Tokens and their frequency
--
(movie, sharukh:1 salman:2 ranveer:1 hruthik:1 amir:2)
(cricket, sehwag:2 kohli:1 rohit:1 ganguly:1 sachin:3 dhoni:2 dravid:2)


On Wed, Nov 20, 2013 at 5:15 AM, jamal sasha jamalsha...@gmail.com wrote:

 Hi,

 I have data already processed in following form:


 ( id ,{ bag of words})
 So for example:

 (foobar, {(foo), (foo),(foobar),(bar)})
 (foo,{(bar),(bar)})

 and so on..
 describe processed gives me:
 processed: {id: chararray,tokens: {tuple_of_tokens: (token: chararray)}}


 Now what I want is.. also count the number of times a word appears in this
 data and output it as
 foobar, foo, 2
 foobar,foobar,1
 foobar,bar,1
 foo,bar,2

 and so on...

 How do I do this in pig?




-- 
Thanks and Regards
Nagamallikarjuna


Re: UDF to calculate Average of whole dataset

2013-03-05 Thread inelu nagamallikarjuna
Hi,

Use the fully qualified class name, e.g. org.apache.udf.myudf.UdfName, in the
pig script when calling the UDF.
Otherwise, use only the UDF name in the script and pass the import list when
running, e.g. pig -Dudf.import.list=org.apache.udf.myudf.evaluation.string scriptname.pig
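
Applied to the CalculateAvg case from this thread, that looks roughly like
the following (the package org.apache.udf.myudf and the double schema are
assumptions; substitute whatever package the class was actually compiled
under):

```pig
REGISTER ./myudfs.jar;
dividends = LOAD 'myfile' AS (A:double);
grouped = GROUP dividends ALL;
-- fully qualified name resolves the ERROR 1070 import failure
avg = FOREACH grouped GENERATE org.apache.udf.myudf.CalculateAvg(dividends.A);
DUMP avg;
```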


Thanks
Nagamallikarjuna

On Wed, Mar 6, 2013 at 2:54 AM, Preeti Gupta preetigupt...@gmail.com wrote:

 Nope. It does not work

 2013-03-05 13:22:28,768 [main] ERROR org.apache.pig.tools.grunt.Grunt -
 ERROR 1070: Could not resolve myudf.CalculateAvg using imports: [,
 org.apache.pig.builtin., org.apache.pig.impl.builtin.]
 Details at logfile:
 /Users/PreetiGupta/Documents/CMPS290S/project/pig_1362518535200.log
 ~

 Pig script

 REGISTER ./myudfs.jar;
 dividends = load 'myfile' as (A);
 dump dividends
 --grouped   = filter dividends by A-1000.0;
 --avg   = foreach (filter dividends by A-1000.0) generate AVG(A);
 avg = foreach (group dividends all) generate myudf.CalculateAvg(dividends);
 dump avg

 My jar file

 bash-3.2# vi a.txt

  0 Mon Mar 04 13:45:44 PST 2013 META-INF/
 60 Mon Mar 04 13:45:44 PST 2013 META-INF/MANIFEST.MF
   1190 Mon Mar 04 13:45:16 PST 2013 CalculateAvg$Final.class
   1306 Mon Mar 04 13:45:16 PST 2013 CalculateAvg$Initial.class
   1477 Mon Mar 04 13:45:16 PST 2013 CalculateAvg$Intermediate.class
   4205 Mon Mar 04 13:45:16 PST 2013 CalculateAvg.class
 ~

 On Mar 5, 2013, at 1:09 PM, pablomar pablo.daniel.marti...@gmail.com
 wrote:

  did you try with {jarFileName}.{FunctionName} ?
  example: myudfs.CalculateAvg ?
 
 
  On Tue, Mar 5, 2013 at 4:04 PM, Preeti Gupta preetigupt...@gmail.com
 wrote:
 
  I kept the code in myudfs.jar and my pig script is point to it using
  register command but the script is not able to find CalculateAvg
 function.
  I don't have any packages defined in the java file and the jar is my
  current directory.
 
 
  On Mar 5, 2013, at 3:17 AM, Jonathan Coveney jcove...@gmail.com
 wrote:
 
  dividends = load 'try.txt'
  a = foreach dividends generate FLATTEN(TOBAG(*));
  b = foreach (group a all) generate CalculateAvg($1);
 
  I think that should work
 
 
  2013/3/5 pablomar pablo.daniel.marti...@gmail.com
 
  what is the error ?
  function not found or something like that ?
 
  what about this ?
  avg   = generate myudfs.CalculateAvg(dividends);
 
 
  On Mon, Mar 4, 2013 at 4:56 PM, Preeti Gupta 
  preetigupt...@soe.ucsc.edu
  wrote:
 
  Hello All,
 
  I have dataset like
 
  0, 10.1, 20.1, 30, 40,
  50, 60, 70, 80.1, 1,
  2, 3, 4, 5, 6,
  7, 8, 9, 10, 11,
  12, 13, 14, 15, 16,
  1, 2, 3, 4, 5,
  56, 6, 7, 8, 9,
  9, 9, 9, 12, 1,
  3, 14, 1, 5, 6,
  7, 8, 8, 9, 12
 
  So basically comma separated values. But I want to consider this as
 one
  data column and I want to calculate the average of the whole dataset.
 
  I believe I have to write UDF to calculate average. Pig is able to
 load
  this data
 
  (  0, 10.1, 20.1, 30, 40,)
  (  50, 60, 70, 80.1, 1,)
  (  2, 3, 4, 5, 6,)
  (  7, 8, 9, 10, 11,)
  (  12, 13, 14, 15, 16,)
  (  1, 2, 3, 4, 5,)
  (  56, 6, 7, 8, 9,)
  (  9, 9, 9, 12, 1,)
  (  3, 14, 1, 5, 6,)
  (  7, 8, 8, 9, 12 )
 
  and How do I invoke that UDF in my pig script? Say I implement
  CalculateAvg function.
 
  REGISTER ./myudfs.jar
  dividends = load 'try.txt';
  dump dividends
  --grouped   = group dividends by symbol;
  avg   = generate CalculateAvg(dividends);
  dump avg
  --store avg into 'average_dividend';
 
  It fails.
 
 
 
 
 




-- 
Thanks and Regards
Nagamallikarjuna


Re: Error during parsing

2013-03-05 Thread inelu nagamallikarjuna
Hi,

There is a small mistake in your script: you used the relation name data in
the second line; use X instead.

Sample script:

X = LOAD '/streamming/read' AS (line : chararray);
Y = foreach X generate STRSPLIT(line,' ');
dump Y;

Thanks
Nagamallikarjuna

On Wed, Mar 6, 2013 at 4:19 AM, Mix Nin pig.mi...@gmail.com wrote:

 Hi,

 I executed below PIG commands.

  X= LOAD '/user/lnindrakrishna/input/ExpTag.txt'  AS (line:chararray);
  Y=foreach data  { generate STRSPLIT(line,',') ;};


 And I get below error. What is wrong in my script. I tried removing flower
 braces. giving extra spaces. But nothing worked

 2013-03-05 15:38:57,124 [main] ERROR org.apache.pig.tools.grunt.Grunt -
 ERROR 1000: Error during parsing. Encountered  PATH Y=foreach  at
 line 2, column 1.
 Was expecting one of:
 EOF
 cat ...
 fs ...
 cd ...
 cp ...
 copyFromLocal ...
 copyToLocal ...
 dump ...
 describe ...
 aliases ...
 explain ...
 help ...
 kill ...
 ls ...
 mv ...
 mkdir ...
 pwd ...
 quit ...
 register ...
 rm ...
 rmf ...
 set ...
 illustrate ...
 run ...
 exec ...
 scriptDone ...
  ...
 EOL ...
 ; ...




-- 
Thanks and Regards
Nagamallikarjuna


Re: Error during parsing

2013-03-05 Thread inelu nagamallikarjuna
Hi,

Please paste your pig script here..

Thanks
Nagamallikarjuna

On Wed, Mar 6, 2013 at 4:39 AM, Mix Nin pig.mi...@gmail.com wrote:

 Thanks for the reply. Now I get below error:

  ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve
 STRSPLIT using imports: [, org.apache.pig.builtin.,
 org.apache.pig.impl.builtin.


 On Tue, Mar 5, 2013 at 3:07 PM, inelu nagamallikarjuna
malli3...@gmail.com wrote:

  Hi,
 
  There is a small mistake in your script. You used relation name called
 data
  in second line use X instead of data.
 
  *Sample script:
 
  X= LOAD '/streamming/read' AS (line : chararray);
  Y = foreach X generate STRSPLIT(line,' ');
  dump Y;*
 
  Thanks
  Nagamallikarjuna
 
  On Wed, Mar 6, 2013 at 4:19 AM, Mix Nin pig.mi...@gmail.com wrote:
 
   Hi,
  
   I executed below PIG commands.
  
X= LOAD '/user/lnindrakrishna/input/ExpTag.txt'  AS (line:chararray);
Y=foreach data  { generate STRSPLIT(line,',') ;};
  
  
   And I get below error. What is wrong in my script. I tried removing
  flower
   braces. giving extra spaces. But nothing worked
  
   2013-03-05 15:38:57,124 [main] ERROR org.apache.pig.tools.grunt.Grunt -
   ERROR 1000: Error during parsing. Encountered  PATH Y=foreach  at
   line 2, column 1.
   Was expecting one of:
   EOF
   cat ...
   fs ...
   cd ...
   cp ...
   copyFromLocal ...
   copyToLocal ...
   dump ...
   describe ...
   aliases ...
   explain ...
   help ...
   kill ...
   ls ...
   mv ...
   mkdir ...
   pwd ...
   quit ...
   register ...
   rm ...
   rmf ...
   set ...
   illustrate ...
   run ...
   exec ...
   scriptDone ...
...
   EOL ...
   ; ...
  
 
 
 
  --
  Thanks and Regards
  Nagamallikarjuna
 




-- 
Thanks and Regards
Nagamallikarjuna


Re: Error during parsing

2013-03-05 Thread inelu nagamallikarjuna
Hi,

STRSPLIT is a builtin function, so the register command is not required.
Use the same script with the first line removed. I already tested the script
against pig-0.10.0 and it works fine.

Thanks
Nagamallikarjuna

On Wed, Mar 6, 2013 at 4:46 AM, Mix Nin pig.mi...@gmail.com wrote:

 Below is my script


 REGISTER '/home/hadoop/lib/piggybank-0.7.0.jar';

 X= LOAD '/user/lnindrakrishna/input/ExpTag.txt'  AS (line:chararray);
 Y =foreach X  generate STRSPLIT(line,',') ;

 Thanks


 On Tue, Mar 5, 2013 at 3:14 PM, Harsha har...@defun.org wrote:

  Hi Mix,
 there is a additional ;
  Y=foreach data { generate STRSPLIT(line,',') ;};
  Just before closing }
 
  --
  Harsha
 
 
  On Tuesday, March 5, 2013 at 2:49 PM, Mix Nin wrote:
 
   Hi,
  
   I executed below PIG commands.
  
   X= LOAD '/user/lnindrakrishna/input/ExpTag.txt' AS (line:chararray);
   Y=foreach data { generate STRSPLIT(line,',') ;};
  
  
   And I get below error. What is wrong in my script. I tried removing
  flower
   braces. giving extra spaces. But nothing worked
  
   2013-03-05 15:38:57,124 [main] ERROR org.apache.pig.tools.grunt.Grunt -
   ERROR 1000: Error during parsing. Encountered  PATH Y=foreach  at
   line 2, column 1.
   Was expecting one of:
   EOF
   cat ...
   fs ...
   cd ...
   cp ...
   copyFromLocal ...
   copyToLocal ...
   dump ...
   describe ...
   aliases ...
   explain ...
   help ...
   kill ...
   ls ...
   mv ...
   mkdir ...
   pwd ...
   quit ...
   register ...
   rm ...
   rmf ...
   set ...
   illustrate ...
   run ...
   exec ...
   scriptDone ...
...
   EOL ...
   ; ...
  
  
 
 
 




-- 
Thanks and Regards
Nagamallikarjuna


Re: Error during parsing

2013-03-05 Thread inelu nagamallikarjuna
Hi,

Run the command pig -version in a Linux shell.

Thanks
Nagamallikarjuna

On Wed, Mar 6, 2013 at 4:56 AM, Mix Nin pig.mi...@gmail.com wrote:

 I checked by removing REGISTER command, but still I get the error. How do I
 check the PIG version?


 On Tue, Mar 5, 2013 at 3:22 PM, inelu nagamallikarjuna
malli3...@gmail.com wrote:

  Hi,
 
  strspit is a builtin function, so the register command is not required.
  use same script by removing the first line. I already tested the script
  against pig-0.10.0 version it is working fine.
 
  Thanks
  Nagamallikarjuna
 
  On Wed, Mar 6, 2013 at 4:46 AM, Mix Nin pig.mi...@gmail.com wrote:
 
   Below is my script
  
  
   REGISTER '/home/hadoop/lib/piggybank-0.7.0.jar';
  
   X= LOAD '/user/lnindrakrishna/input/ExpTag.txt'  AS (line:chararray);
   Y =foreach X  generate STRSPLIT(line,',') ;
  
   Thanks
  
  
   On Tue, Mar 5, 2013 at 3:14 PM, Harsha har...@defun.org wrote:
  
Hi Mix,
   there is a additional ;
Y=foreach data { generate STRSPLIT(line,',') ;};
Just before closing }
   
--
Harsha
   
   
On Tuesday, March 5, 2013 at 2:49 PM, Mix Nin wrote:
   
 Hi,

 I executed below PIG commands.

 X= LOAD '/user/lnindrakrishna/input/ExpTag.txt' AS
 (line:chararray);
 Y=foreach data { generate STRSPLIT(line,',') ;};


 And I get below error. What is wrong in my script. I tried removing
flower
 braces. giving extra spaces. But nothing worked

 2013-03-05 15:38:57,124 [main] ERROR
  org.apache.pig.tools.grunt.Grunt -
 ERROR 1000: Error during parsing. Encountered  PATH Y=foreach
 
  at
 line 2, column 1.
 Was expecting one of:
 EOF
 cat ...
 fs ...
 cd ...
 cp ...
 copyFromLocal ...
 copyToLocal ...
 dump ...
 describe ...
 aliases ...
 explain ...
 help ...
 kill ...
 ls ...
 mv ...
 mkdir ...
 pwd ...
 quit ...
 register ...
 rm ...
 rmf ...
 set ...
 illustrate ...
 run ...
 exec ...
 scriptDone ...
  ...
 EOL ...
 ; ...


   
   
   
  
 
 
 
  --
  Thanks and Regards
  Nagamallikarjuna
 




-- 
Thanks and Regards
Nagamallikarjuna


Re: Error during parsing

2013-03-05 Thread inelu nagamallikarjuna
Hi,


The function STRSPLIT is not in the list of built-in functions in pig-0.7.0.
Please use any version from 0.8.0 onwards; there are lots of improvements
from 0.7.0 to 0.10.0.


Thanks
Nagamallikarjuna

On Wed, Mar 6, 2013 at 4:58 AM, inelu nagamallikarjuna
malli3...@gmail.com wrote:

 Hi,

 This is the command pig -version in Linux shell.

 Thanks
 Nagamallikarjuna


 On Wed, Mar 6, 2013 at 4:56 AM, Mix Nin pig.mi...@gmail.com wrote:

 I checked by removing REGISTER command, but still I get the error. How do
 I
 check the PIG version?


 On Tue, Mar 5, 2013 at 3:22 PM, inelu nagamallikarjuna
 malli3...@gmail.comwrote:

  Hi,
 
  strspit is a builtin function, so the register command is not required.
  use same script by removing the first line. I already tested the script
  against pig-0.10.0 version it is working fine.
 
  Thanks
  Nagamallikarjuna
 
  On Wed, Mar 6, 2013 at 4:46 AM, Mix Nin pig.mi...@gmail.com wrote:
 
   Below is my script
  
  
   REGISTER '/home/hadoop/lib/piggybank-0.7.0.jar';
  
   X= LOAD '/user/lnindrakrishna/input/ExpTag.txt'  AS (line:chararray);
   Y =foreach X  generate STRSPLIT(line,',') ;
  
   Thanks
  
  
   On Tue, Mar 5, 2013 at 3:14 PM, Harsha har...@defun.org wrote:
  
Hi Mix,
   there is a additional ;
Y=foreach data { generate STRSPLIT(line,',') ;};
Just before closing }
   
--
Harsha
   
   
On Tuesday, March 5, 2013 at 2:49 PM, Mix Nin wrote:
   
 Hi,

 I executed below PIG commands.

 X= LOAD '/user/lnindrakrishna/input/ExpTag.txt' AS
 (line:chararray);
 Y=foreach data { generate STRSPLIT(line,',') ;};


 And I get below error. What is wrong in my script. I tried
 removing
flower
 braces. giving extra spaces. But nothing worked

 2013-03-05 15:38:57,124 [main] ERROR
  org.apache.pig.tools.grunt.Grunt -
 ERROR 1000: Error during parsing. Encountered  PATH Y=foreach
 
  at
 line 2, column 1.
 Was expecting one of:
 EOF
 cat ...
 fs ...
 cd ...
 cp ...
 copyFromLocal ...
 copyToLocal ...
 dump ...
 describe ...
 aliases ...
 explain ...
 help ...
 kill ...
 ls ...
 mv ...
 mkdir ...
 pwd ...
 quit ...
 register ...
 rm ...
 rmf ...
 set ...
 illustrate ...
 run ...
 exec ...
 scriptDone ...
  ...
 EOL ...
 ; ...


   
   
   
  
 
 
 
  --
  Thanks and Regards
  Nagamallikarjuna
 




 --
 Thanks and Regards
 Nagamallikarjuna




-- 
Thanks and Regards
Nagamallikarjuna


Re: Error during parsing

2013-03-05 Thread inelu nagamallikarjuna
Hi,

I think it is better to download the latest stable version, or otherwise
write your own UDF for the split functionality.

Thanks
Nagamallikarjuna

On Wed, Mar 6, 2013 at 5:04 AM, Mix Nin pig.mi...@gmail.com wrote:

 Below is my PIG version

 Apache Pig version 0.7.1-wilma-3

 How do I use higher version of script.


 On Tue, Mar 5, 2013 at 3:32 PM, inelu nagamallikarjuna
malli3...@gmail.com wrote:

  Hi,
 
 
  The function STRSPLIT is not there in the list of in built fuction of
  hive-0.7.0. Please use any version from 0.8.0 on words. There are lots of
  improvements from 0.7.0 to 0.10.0.
 
 
  Thanks
  Nagamallikarjuna
 
  On Wed, Mar 6, 2013 at 4:58 AM, inelu nagamallikarjuna
  malli3...@gmail.comwrote:
 
   Hi,
  
   This is the command *pig -version* in Linux shell.
  
   Thanks
   Nagamallikarjuna
  
  
   On Wed, Mar 6, 2013 at 4:56 AM, Mix Nin pig.mi...@gmail.com wrote:
  
   I checked by removing REGISTER command, but still I get the error. How
  do
   I
   check the PIG version?
  
  
   On Tue, Mar 5, 2013 at 3:22 PM, inelu nagamallikarjuna
   malli3...@gmail.comwrote:
  
Hi,
   
strspit is a builtin function, so the register command is not
  required.
use same script by removing the first line. I already tested the
  script
against pig-0.10.0 version it is working fine.
   
Thanks
Nagamallikarjuna
   
On Wed, Mar 6, 2013 at 4:46 AM, Mix Nin pig.mi...@gmail.com
 wrote:
   
 Below is my script


 REGISTER '/home/hadoop/lib/piggybank-0.7.0.jar';

 X= LOAD '/user/lnindrakrishna/input/ExpTag.txt'  AS
  (line:chararray);
 Y =foreach X  generate STRSPLIT(line,',') ;

 Thanks


 On Tue, Mar 5, 2013 at 3:14 PM, Harsha har...@defun.org wrote:

  Hi Mix,
 there is a additional ;
  Y=foreach data { generate STRSPLIT(line,',') ;};
  Just before closing }
 
  --
  Harsha
 
 
  On Tuesday, March 5, 2013 at 2:49 PM, Mix Nin wrote:
 
   Hi,
  
   I executed below PIG commands.
  
   X= LOAD '/user/lnindrakrishna/input/ExpTag.txt' AS
   (line:chararray);
   Y=foreach data { generate STRSPLIT(line,',') ;};
  
  
   And I get below error. What is wrong in my script. I tried
   removing
  flower
   braces. giving extra spaces. But nothing worked
  
   2013-03-05 15:38:57,124 [main] ERROR
org.apache.pig.tools.grunt.Grunt -
   ERROR 1000: Error during parsing. Encountered  PATH
  Y=foreach
   
at
   line 2, column 1.
   Was expecting one of:
   EOF
   cat ...
   fs ...
   cd ...
   cp ...
   copyFromLocal ...
   copyToLocal ...
   dump ...
   describe ...
   aliases ...
   explain ...
   help ...
   kill ...
   ls ...
   mv ...
   mkdir ...
   pwd ...
   quit ...
   register ...
   rm ...
   rmf ...
   set ...
   illustrate ...
   run ...
   exec ...
   scriptDone ...
...
   EOL ...
   ; ...
  
  
 
 
 

   
   
   
--
Thanks and Regards
Nagamallikarjuna
   
  
  
  
  
   --
   Thanks and Regards
   Nagamallikarjuna
  
 
 
 
  --
  Thanks and Regards
  Nagamallikarjuna
 




-- 
Thanks and Regards
Nagamallikarjuna


Re: Is there a way to limit the number of maps produced by HBaseStorage ?

2013-01-21 Thread inelu nagamallikarjuna
Hi Vincent,

You can restrict the number of concurrent map tasks per node by setting the
parameter mapred.tasktracker.map.tasks.maximum to 1 or 2 on each tasktracker.
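
A job-level hint can also be set from the script itself, though for
HBaseStorage it remains only a hint ('mytable' and 'cf:col' below are
illustrative):

```pig
SET mapred.map.tasks 10;
-- the real number of maps still follows the table's region count
data = LOAD 'hbase://mytable'
       USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:col')
       AS (col:chararray);
```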



Thanks
Nagamallikarjuna

On Mon, Jan 21, 2013 at 7:13 PM, Mohammad Tariq donta...@gmail.com wrote:

 Hello Vincent,

  The number of map tasks for a job is primarily governed by the
 InputSplits and the InputFormat you are using. So setting it through a
 config parameter doesn't guarantee that your job would have the specified
 number of map tasks. However, you can give it a try by using set
 mapred.map.tasks=n in your PigLatin job.

 Warm Regards,
 Tariq
 https://mtariq.jux.com/
 cloudfront.blogspot.com


 On Mon, Jan 21, 2013 at 6:57 PM, Vincent Barat vincent.ba...@gmail.com
 wrote:

  Hi,
 
  We are using HBaseStorage intensively to load data from tables having
 more
  than 100 regions.
 
  HBaseStorage generates 1 map par region, and our cluster having 50 map
  slots, it happens that our PIG scripts start 50 maps reading concurrently
  data from HBase.
 
  The problem is that our HBase cluster has only 10 nodes, and thus the
 maps
  overload it (5 intensive readers per node is too much to bare).
 
  So question: is there a way to say to PIG : limit the nb of maps to this
  maximum (ex: 10) ?
  If not, how can I patch the code to do this ?
 
  Thanks a lot for your help
 






-- 
Thanks and Regards
Nagamallikarjuna


Re: [ANNOUNCE] Welcome new Apache Pig Committers Rohini Palaniswamy

2012-11-01 Thread inelu nagamallikarjuna
Congrats Rohini..

On Thu, Nov 1, 2012 at 10:13 AM, Aniket Mokashi aniket...@gmail.com wrote:

 Congrats Rohini...


 On Mon, Oct 29, 2012 at 11:31 AM, Julien Le Dem jul...@twitter.com
 wrote:

  Congrats Rohini !
 
 
  On Sun, Oct 28, 2012 at 9:42 AM, Bill Graham billgra...@gmail.com
 wrote:
   Congrats Rohini! Great news indeed.
  
   On Saturday, October 27, 2012, Jon Coveney wrote:
  
   Wonderful news!
  
   On Oct 26, 2012, at 9:51 PM, Gianmarco De Francisci Morales 
   g...@apache.org javascript:; wrote:
  
Congratulations Rohini!
Welcome onboard :)
--
Gianmarco
   
   
On Fri, Oct 26, 2012 at 7:32 PM, Prasanth J 
  buckeye.prasa...@gmail.comjavascript:;
   wrote:
Congrats Rohini!
   
Thanks
-- Prasanth
   
On Oct 26, 2012, at 10:21 PM, Santhosh Srinivasan 
   santhosh_mut...@yahoo.com javascript:; wrote:
   
Congrats Rohini! Full speed ahead now :)
   
On Oct 26, 2012, at 4:37 PM, Daniel Dai da...@hortonworks.com
  javascript:;
   wrote:
   
Here is another Pig committer announcement today. Please welcome
Rohini Palaniswamy to be a Pig committer!
   
Thanks,
Daniel
   
  
  
  
   --
   Sent from Gmail Mobile
 



 --
 ...:::Aniket:::... Quetzalco@tl




-- 
Thanks and Regards
Nagamallikarjuna