[jira] [Updated] (LIVY-995) JsonParseException is thrown when closing Livy session when using python profile
[ https://issues.apache.org/jira/browse/LIVY-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jianzhen Wu updated LIVY-995:
-----------------------------
    Description: 

Start pyspark with spark.python.profile enabled:

{code:java}
./bin/pyspark --master local --conf spark.python.profile=true
{code}

Execute some Spark RDD code. When pyspark is closed, it prints the profile information:

{code:java}
>>> rdd = sc.parallelize(range(100)).map(str)
>>> rdd.count()
[Stage 0:>                                                          (0 + 1) / 1]
100
>>>
Profile of RDD

         244 function calls (241 primitive calls) in 0.001 seconds

   Ordered by: internal time, cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      101    0.000    0.000    0.000    0.000 rdd.py:1237()
      101    0.000    0.000    0.000    0.000 util.py:72(wrapper)
        1    0.000    0.000    0.000    0.000 serializers.py:255(dump_stream)
        1    0.000    0.000    0.000    0.000 serializers.py:213(load_stream)
        2    0.000    0.000    0.000    0.000 {built-in method builtins.sum}
        1    0.000    0.000    0.001    0.001 worker.py:607(process)
        1    0.000    0.000    0.000    0.000 context.py:549(f)
        1    0.000    0.000    0.000    0.000 {built-in method _pickle.dumps}
        1    0.000    0.000    0.000    0.000 serializers.py:561(read_int)
        1    0.000    0.000    0.000    0.000 serializers.py:568(write_int)
      4/1    0.000    0.000    0.000    0.000 rdd.py:2917(pipeline_func)
        1    0.000    0.000    0.000    0.000 serializers.py:426(dumps)
        1    0.000    0.000    0.000    0.000 rdd.py:1237()
        1    0.000    0.000    0.000    0.000 serializers.py:135(load_stream)
        2    0.000    0.000    0.000    0.000 rdd.py:1072(func)
        1    0.000    0.000    0.000    0.000 rdd.py:384(func)
        1    0.000    0.000    0.000    0.000 util.py:67(fail_on_stopiteration)
        1    0.000    0.000    0.000    0.000 serializers.py:151(_read_with_length)
        2    0.000    0.000    0.000    0.000 context.py:546(getStart)
        3    0.000    0.000    0.000    0.000 rdd.py:416(func)
        1    0.000    0.000    0.000    0.000 serializers.py:216(_load_stream_without_unbatching)
        2    0.000    0.000    0.000    0.000 {method 'write' of '_io.BufferedWriter' objects}
        1    0.000    0.000    0.000    0.000 {method 'read' of '_io.BufferedReader' objects}
        1    0.000    0.000    0.000    0.000 {built-in method _operator.add}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.hasattr}
        3    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        1    0.000    0.000    0.000    0.000 {built-in method _struct.unpack}
        1    0.000    0.000    0.000    0.000 rdd.py:1226()
        1    0.000    0.000    0.000    0.000 {method 'close' of 'generator' objects}
        1    0.000    0.000    0.000    0.000 {built-in method from_iterable}
        1    0.000    0.000    0.000    0.000 {built-in method _struct.pack}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.iter}
{code}

This happens because Spark registers show_profiles as an atexit hook in profile.py:

{code:java}
def add_profiler(self, id, profiler):
    """Add a profiler for RDD/UDF `id`"""
    if not self.profilers:
        if self.profile_dump_path:
            atexit.register(self.dump_profiles, self.profile_dump_path)
        else:
            atexit.register(self.show_profiles)

    self.profilers.append([id, profiler, False])
{code}

In a Livy session, this output cannot be converted to JSON, and Livy throws the exception below:

{code:java}
24/01/17 11:17:30 INFO [shutdown-hook-0] ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: User class threw exception: com.fasterxml.jackson.core.JsonParseException: Unexpected character ('=' (code 61)): expected a valid value (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
 at [Source: (String)""; line: 1, column: 2]
	at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:2337)
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:710)
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:635)
	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:1952)
	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:781)
	at com.fasterxml.jackson.databind.ObjectReader._initForReading(ObjectReader.java:355)
{code}
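The parse failure is easy to reproduce outside Livy. The sketch below is hypothetical (it is not Livy's code): a consumer that, like the Livy REPL using Jackson, requires every line on the channel to be valid JSON will reject the plain-text banner that show_profiles writes at interpreter exit, and Python's json module reports essentially the same "unexpected '='" complaint.

```python
import json

def consume(line: str):
    """Stand-in for a JSON-framed protocol reader (Livy's REPL uses Jackson)."""
    return json.loads(line)

# A normal statement result parses fine:
consume('{"status": "ok", "data": "100"}')

# But show_profiles() writes a plain-text banner such as "====..." to the
# same stream at interpreter exit, which is not valid JSON:
banner = "=" * 20
try:
    consume(banner)
except json.JSONDecodeError as err:
    # Python's analogue of Jackson's "Unexpected character ('=' (code 61))"
    print(f"parse failed: {err.msg} at column {err.colno}")
```

Any non-JSON text leaking onto the protocol stream at shutdown triggers this, regardless of which hook wrote it.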
[jira] [Updated] (LIVY-995) JsonParseException is thrown when closing Livy session when using python profile
[ https://issues.apache.org/jira/browse/LIVY-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jianzhen Wu updated LIVY-995:
-----------------------------
    Component/s: REPL
  Fix Version/s: 0.9.0

> JsonParseException is thrown when closing Livy session when using python profile
>
>                 Key: LIVY-995
>                 URL: https://issues.apache.org/jira/browse/LIVY-995
>             Project: Livy
>          Issue Type: Improvement
>          Components: REPL
>            Reporter: Jianzhen Wu
>            Assignee: Jianzhen Wu
>            Priority: Critical
>             Fix For: 0.9.0
[jira] [Created] (LIVY-995) JsonParseException is thrown when closing Livy session when using python profile
Jianzhen Wu created LIVY-995:
--------------------------------

             Summary: JsonParseException is thrown when closing Livy session when using python profile
                 Key: LIVY-995
                 URL: https://issues.apache.org/jira/browse/LIVY-995
             Project: Livy
          Issue Type: Improvement
            Reporter: Jianzhen Wu
            Assignee: Jianzhen Wu
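The add_profiler code quoted in the description suggests one mitigation worth noting (an assumption on my part, not necessarily the actual LIVY-995 patch): when spark.python.profile.dump is set, profile.py registers dump_profiles instead of show_profiles, so the plain-text banner never reaches the stdout channel that Livy parses as JSON. A minimal sketch of that branch:

```python
def pick_exit_hook(profile_dump_path):
    """Mirrors the if/else in add_profiler() quoted in the description:
    a dump path selects the file-writing hook, otherwise the printing hook."""
    if profile_dump_path:
        return ("dump_profiles", profile_dump_path)
    return ("show_profiles", None)

# With a dump directory configured (hypothetical path), the stdout-printing
# hook is never registered, so nothing non-JSON is emitted at shutdown:
hook, arg = pick_exit_hook("/tmp/pyspark_profiles")
print(hook)  # prints "dump_profiles"
```

With no dump path, pick_exit_hook falls through to show_profiles, which is exactly the case that breaks the Livy session on close.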