Hi,
I'm trying to run the following Pig script. Its main purpose is to read input
files that contain information about phone calls; the script is supposed to
count the different types of calls and the distinct subscribers that made them:

SET default_parallel 40;
allFiles = LOAD 'maprfs:///analytics/data/consumers/mapred/facts/done/FACT_VOICE_GE_Analytics9_1/20131114/'
    USING PigStorage(',');
allFilesFiltered = FILTER allFiles BY $11 MATCHES '.*On.*' AND $4 > 0;
datesList = FOREACH allFilesFiltered GENERATE SUBSTRING($0, 0, 10) AS day,
    $11 AS callType, $4 AS amount, $1 AS subscriberKey;
datesGroups = GROUP datesList BY (day, callType);
datesGroupsAmount = FOREACH datesGroups {
    unique_seubscriber = DISTINCT datesList.subscriberKey;
    GENERATE group.day, group.callType, COUNT(datesList),
        SUM(datesList.amount), COUNT(unique_seubscriber);
};
DUMP datesGroupsAmount;

The problem is with unique_seubscriber: the COUNT over the DISTINCT doesn't
work. The strange thing is that if I run the script separately on each
subfolder's input, every run succeeds, but if I give it the whole input
folder at once it fails and I get the following error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator
for alias datesGroupsAmount
Another error that I get from time to time (if I make small changes to the
script) is:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator
for alias datesGroupsAmount. Backend error : java.lang.Boolean cannot be cast
to org.apache.pig.data.Tuple
Maybe there is a connection between the two errors?
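Since the fields are loaded without a schema, Pig treats them as bytearray by default. One variant that may help narrow the problem down casts the positional fields explicitly before filtering and grouping. This is only a sketch and is not confirmed to resolve either error: the types (chararray, long) are assumptions about the data, and the aliases typed and typedFiltered are names introduced here for illustration.

```pig
SET default_parallel 40;

allFiles = LOAD 'maprfs:///analytics/data/consumers/mapred/facts/done/FACT_VOICE_GE_Analytics9_1/20131114/'
    USING PigStorage(',');

-- Cast the positional fields once so that later steps see typed values
-- instead of bytearray. The chosen types are assumptions about the data.
typed = FOREACH allFiles GENERATE
    SUBSTRING((chararray)$0, 0, 10) AS day,
    (chararray)$11 AS callType,
    (long)$4 AS amount,
    (chararray)$1 AS subscriberKey;

-- Same filter as the original script, expressed on the named, typed fields.
typedFiltered = FILTER typed BY (callType MATCHES '.*On.*') AND (amount > 0);

datesGroups = GROUP typedFiltered BY (day, callType);
datesGroupsAmount = FOREACH datesGroups {
    -- Distinct subscribers within each (day, callType) group.
    unique_seubscriber = DISTINCT typedFiltered.subscriberKey;
    GENERATE group.day, group.callType, COUNT(typedFiltered),
        SUM(typedFiltered.amount), COUNT(unique_seubscriber);
};
DUMP datesGroupsAmount;
```

Values that cannot be cast become null, so rows with a malformed amount would be dropped by the filter instead of reaching the aggregation.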
Here is the log file:
Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias datesGroupsAmount
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias datesGroupsAmount
    at org.apache.pig.PigServer.openIterator(PigServer.java:836)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:604)
    at org.apache.pig.Main.main(Main.java:157)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
    at org.apache.pig.PigServer.openIterator(PigServer.java:828)
    ... 12 more
Any help will be appreciated.
Thanks,
Noam