Chunxi Zhang created SPARK-6931:
-----------------------------------

Summary: python: struct.pack('!q', value) in write_long(value, stream) in serializers.py require int(but doesn't raise exceptions in common cases)
Key: SPARK-6931
URL: https://issues.apache.org/jira/browse/SPARK-6931
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 1.3.0
Reporter: Chunxi Zhang
Priority: Critical

When I map a function from my own feature calculation module, Spark raises:

    Traceback (most recent call last):
      File "/usr/local/Cellar/apache-spark/1.3.0/libexec/python/pyspark/daemon.py", line 162, in manager
        code = worker(sock)
      File "/usr/local/Cellar/apache-spark/1.3.0/libexec/python/pyspark/daemon.py", line 60, in worker
        worker_main(infile, outfile)
      File "/usr/local/Cellar/apache-spark/1.3.0/libexec/python/pyspark/worker.py", line 115, in main
        report_times(outfile, boot_time, init_time, finish_time)
      File "/usr/local/Cellar/apache-spark/1.3.0/libexec/python/pyspark/worker.py", line 40, in report_times
        write_long(1000 * boot, outfile)
      File "/usr/local/Cellar/apache-spark/1.3.0/libexec/python/pyspark/serializers.py", line 518, in write_long
        stream.write(struct.pack("!q", value))
    DeprecationWarning: integer argument expected, got float

So I opened serializers.py and printed the value. It is a float, coming from 1000 * time.time().

When I remove my lib, or add an rdd.count() before mapping my lib, the bug does not appear.

So I edited the function to:

    def write_long(value, stream):
        stream.write(struct.pack("!q", int(value)))  # added an int(value)

and everything seems fine.

According to Note (3) in Python's doc for struct (https://docs.python.org/2/library/struct.html), the value should be an int (for q). If it is not an int, struct first tries __index__() and then falls back to __int__(), but since the use of __int__() is deprecated, it raises a DeprecationWarning. float doesn't have __index__() but does have __int__(), so it should raise the exception every time.

But, as you can see, in normal cases it doesn't raise the exception and the code works perfectly. Executing struct.pack('!q', 111.1) in a console or a clean file doesn't raise any exception either. I can hardly tell how my lib might affect a time.time() value passed to struct.pack(); it might be a bug in Python itself or something else.

Anyway, this value should be an int, so the fix is to add an int() around it.
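
One possible explanation, which this report does not confirm, is that something imported by the mapped function switches the warnings filter to "error". Python 2.7 ignores DeprecationWarning by default, and a warning only becomes an exception under such a filter. The sketch below illustrates that mechanism and the proposed int() fix. emit_float_warning is a hypothetical stand-in for the warning that struct.pack issues on Python 2.7, and the timestamp is made up; neither is PySpark code. The stand-in is used because on Python 3, struct.pack rejects a float outright.

```python
import io
import struct
import warnings


def write_long(value, stream):
    # Proposed fix: convert explicitly so a float such as
    # 1000 * time.time() is truncated before packing.
    stream.write(struct.pack("!q", int(value)))


def emit_float_warning():
    # Hypothetical stand-in for the DeprecationWarning that Python 2.7's
    # struct.pack issues when a float is given to an integer format code.
    warnings.warn("integer argument expected, got float", DeprecationWarning)


def raises_under(action):
    # Return True if the warning surfaces as an exception under `action`.
    with warnings.catch_warnings():
        warnings.simplefilter(action, DeprecationWarning)
        try:
            emit_float_warning()
        except DeprecationWarning:
            return True
        return False


raised_when_ignored = raises_under("ignore")  # warning is silently dropped
raised_when_error = raises_under("error")     # warning becomes an exception

# The fixed write_long accepts a float and writes 8 big-endian bytes.
sample_millis = 1000 * 1429000000.123  # made-up timestamp in milliseconds
buf = io.BytesIO()
write_long(sample_millis, buf)
packed = buf.getvalue()
```

If this is the cause, the int() conversion is still the right fix, since it removes the dependence on whatever warnings filter happens to be active in the worker process.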