[ https://issues.apache.org/jira/browse/SPARK-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Davies Liu resolved SPARK-6931.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.2.3
                   1.3.2

Issue resolved by pull request 8594
[https://github.com/apache/spark/pull/8594]

> python: struct.pack('!q', value) in write_long(value, stream) in serializers.py require int(but doesn't raise exceptions in common cases)
> -----------------------------------------------------------------------
>
>                 Key: SPARK-6931
>                 URL: https://issues.apache.org/jira/browse/SPARK-6931
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.3.0
>            Reporter: Chunxi Zhang
>            Priority: Critical
>              Labels: easyfix
>             Fix For: 1.3.2, 1.2.3
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When I map a function from my own feature calculation module, Spark raises:
> Traceback (most recent call last):
>   File "/usr/local/Cellar/apache-spark/1.3.0/libexec/python/pyspark/daemon.py", line 162, in manager
>     code = worker(sock)
>   File "/usr/local/Cellar/apache-spark/1.3.0/libexec/python/pyspark/daemon.py", line 60, in worker
>     worker_main(infile, outfile)
>   File "/usr/local/Cellar/apache-spark/1.3.0/libexec/python/pyspark/worker.py", line 115, in main
>     report_times(outfile, boot_time, init_time, finish_time)
>   File "/usr/local/Cellar/apache-spark/1.3.0/libexec/python/pyspark/worker.py", line 40, in report_times
>     write_long(1000 * boot, outfile)
>   File "/usr/local/Cellar/apache-spark/1.3.0/libexec/python/pyspark/serializers.py", line 518, in write_long
>     stream.write(struct.pack("!q", value))
> DeprecationWarning: integer argument expected, got float
> So I opened serializers.py and printed the value, which is a float that
> comes from 1000 * time.time().
> When I remove my lib, or add an rdd.count() before mapping my lib, this bug
> does not appear.
> So I edited the function to:
> def write_long(value, stream):
>     stream.write(struct.pack("!q", int(value)))  # added int(value)
> and everything seems fine.
> According to Note (3) in Python's documentation for struct
> (https://docs.python.org/2/library/struct.html), the value for q should be
> an int. If it is a float, struct first tries __index__() and otherwise falls
> back to __int__(); since the use of __int__() is deprecated, that fallback
> raises a DeprecationWarning. float has no __index__() but does have
> __int__(), so it should raise the exception every time.
> But, as you can see, in normal cases it does not raise the exception and the
> code works perfectly, and running struct.pack('!q', 111.1) in a console or a
> clean file does not raise any exception either. I can hardly tell how my lib
> might affect a time.time() value passed to struct.pack(); it might be a bug
> in Python itself, or something else.
> Anyway, this value should be an int, so add an int() call to it.
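
The proposed change can be sketched in a self-contained form. The report does not explain why the warning only sometimes surfaces as an exception. One possible explanation, which is an assumption here and is not confirmed anywhere in the issue, is that Python 2.7 ignores DeprecationWarning by default, so the warning becomes an exception only when something in the process sets the warnings filter to "error". The helper `legacy_pack` below is hypothetical: it only imitates the Python 2.7 behaviour described in the struct documentation so that the sketch also runs on Python 3, where struct.pack rejects floats outright.

```python
import io
import struct
import time
import warnings


def write_long(value, stream):
    """Write value as a big-endian signed 64-bit integer.

    int() truncates a float such as 1000 * time.time(), so struct.pack
    never receives a non-integer argument.
    """
    stream.write(struct.pack("!q", int(value)))


def legacy_pack(value):
    """Hypothetical stand-in for Python 2.7's struct.pack("!q", <float>).

    It emits a DeprecationWarning for floats and then packs int(value),
    imitating the documented fallback to __int__().
    """
    if isinstance(value, float):
        warnings.warn("integer argument expected, got float",
                      DeprecationWarning, stacklevel=2)
    return struct.pack("!q", int(value))


# The patched write_long accepts a float timestamp like the one in worker.py.
buf = io.BytesIO()
write_long(1000 * time.time(), buf)
packed = buf.getvalue()

# With DeprecationWarning ignored, the stand-in packs without complaint.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    quiet = legacy_pack(111.1)

# With DeprecationWarning escalated to an error, the same call raises.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        legacy_pack(111.1)
        escalated = False
    except DeprecationWarning:
        escalated = True
```

Whatever the trigger turns out to be, converting with int() inside write_long removes the dependency on the process-wide warnings configuration, which is the change the reporter proposes.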