Amit,

That's a lot to look at.  :)  Could you reproduce the error with the
smallest possible input (one record?) and post the steps to this thread?
Please also trim the Python code to the bare minimum that still triggers
the same error.  Volunteers are easier to find when reproducing the
problem and reviewing the code is easy.
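
As an illustration of how small such a reproduction can get, the TypeError
in the traceback quoted below can be triggered in plain Python 3 with no Pig
involved.  This is only a sketch: the value of END_RECORD_DELIM and the
helper try_endswith are made up for the example, and the real controller
script may differ.  It does seem consistent with the controller reading
bytes from its input under Python 3.5 while the delimiter is a str.

```python
# Hypothetical stand-in for the constant named in the traceback; the real
# value used by Pig's generated controller script is not shown in this thread.
END_RECORD_DELIM = "|_\n"


def try_endswith(data, delim):
    """Return data.endswith(delim), or the error text if the types clash."""
    try:
        return data.endswith(delim)
    except TypeError as exc:
        return "TypeError: {}".format(exc)


# Under Python 3, data read from a binary stream is bytes, not str.
line = b"some|record|_\n"

# bytes.endswith(str) raises TypeError on Python 3.
mismatch = try_endswith(line, END_RECORD_DELIM)

# Encoding the delimiter so both sides are bytes avoids the error.
matched = try_endswith(line, END_RECORD_DELIM.encode("utf-8"))

print(mismatch)
print(matched)
```

If this matches what happens on your machine, the problem would be in how
the streaming controller runs under Python 3 rather than in the UDF itself,
so running the same job with a Python 2 interpreter would be a cheap check.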

Good luck,
    Steve

On Wed, Apr 13, 2016 at 9:51 PM, Amit Sharma (VAS) <amit.shar...@mtsindia.in
> wrote:

> Somebody please help
>
>
> Best regards
>
> Amit Sharma
>
> From: Amit Sharma
> Sent: Wednesday, April 13, 2016 4:17 PM
> To: 'user@pig.apache.org'
> Subject: PIG: python udf streaming error
>
> Hi,
>
> While calling a Python UDF from Pig (local mode), I get the following error
> in the attempt logs, and the console shows the error given further below. I
> cannot trace what I am doing wrong, as the same code was working earlier. I
> am using RHEL 6.4 on a 64-bit machine with Hadoop 2.7.2, Pig 0.15, and
> Python 3.5.
>
> Traceback (most recent call last):
>   File "/tmp/controller2772959444531928936.py", line 356, in <module>
>     sys.argv[5], sys.argv[6], sys.argv[7], sys.argv[8])
>   File "/tmp/controller2772959444531928936.py", line 88, in main
>     input_str = self.get_next_input()
>   File "/tmp/controller2772959444531928936.py", line 164, in get_next_input
>     while input_str.endswith(END_RECORD_DELIM) == False:
> TypeError: endswith first arg must be bytes or a tuple of bytes, not str
>
> Following is the error at the console:
>
> java.lang.Exception: org.apache.pig.impl.streaming.StreamingUDFException: LINE :
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
> Caused by: org.apache.pig.impl.streaming.StreamingUDFException: LINE :
>     at org.apache.pig.impl.builtin.StreamingUDF$ProcessErrorThread.run(StreamingUDF.java:503)
>
> Exception in thread "Thread-35" java.lang.NullPointerException
>     at org.apache.pig.impl.builtin.StreamingUDF$ProcessOutputThread.run(StreamingUDF.java:468)
>
>
> Following is the python code:
> import os
>
> from pig_util import outputSchema
>
> # INTERNATIONALGTPATH is assumed to be defined elsewhere in the script;
> # its value is not shown in this message.
>
> @outputSchema('output_field_name:chararray')
> def readfileinlist(filename):
>     # Read the file into a list of lines, without trailing newlines.
>     with open(filename) as inputfile:
>         lines = inputfile.read().splitlines()
>     return lines
>
> @outputSchema('output_field_name:boolean')
> def intlgtinlist(srcgt, destgt, intgtllist):
>     # True if either GT starts with any prefix from the list.
>     prefixes = tuple(intgtllist)
>     return srcgt.startswith(prefixes) or destgt.startswith(prefixes)
>
> @outputSchema('output_field_name:boolean')
> def checkintlgtincdrs(aparty, srcgt, destgt):
>     intgtllist = []
>     try:
>         if ((len(srcgt) > 0 or len(destgt) > 0) and (srcgt or destgt)
>                 and aparty.isdigit()):
>             if (os.path.isfile(INTERNATIONALGTPATH)
>                     and os.access(INTERNATIONALGTPATH, os.R_OK)
>                     and os.stat(INTERNATIONALGTPATH).st_size > 0):
>                 # Read the prefix file into a list.
>                 intgtllist = readfileinlist(INTERNATIONALGTPATH)
>                 # Check both GT values against the prefixes.
>                 return intlgtinlist(srcgt, destgt, intgtllist)
>             else:
>                 return False
>         else:
>             return False
>     except (OSError, IndexError):
>         # A tuple is needed to catch both exception types.
>         pass
>
>     return True
>
>
> Following is the pig script
>
>  record = LOAD '/inreport/cdrs/ZTE_20160301*' USING PigStorage('|','-tagFile');
>
>  REGISTER 'udf_smsiuc.py' using streaming_python as smsiucudfs;
>
>  internationalcdrsfilter = FILTER record by smsiucudfs.checkintlgtincdrs($1,$26,$27);
>
> Best regards
>
> Amit Sharma
>
>
