[jira] [Created] (LIVY-505) sparkR.session failed with "invalid jobj 1" error in Spark 2.3
shanyu zhao created LIVY-505:
--------------------------------

             Summary: sparkR.session failed with "invalid jobj 1" error in Spark 2.3
                 Key: LIVY-505
                 URL: https://issues.apache.org/jira/browse/LIVY-505
             Project: Livy
          Issue Type: Bug
          Components: Interpreter
    Affects Versions: 0.5.0, 0.5.1
            Reporter: shanyu zhao

In a Spark 2.3 cluster, use Zeppelin with the livy2 interpreter and type:
{code:java}
%sparkr
sparkR.session(){code}
You will see the error:
[1] "Error in writeJobj(con, object): invalid jobj 1"

In a successful case with older Livy and Spark versions, we see something like this:
Java ref type org.apache.spark.sql.SparkSession id 1

This indicates that the isValidJobj() function in the Spark code returned false for the SparkSession jobj. For reference, this is the isValidJobj() function in the Spark 2.3 code:
{code:java}
isValidJobj <- function(jobj) {
  if (exists(".scStartTime", envir = .sparkREnv)) {
    jobj$appId == get(".scStartTime", envir = .sparkREnv)
  } else {
    FALSE
  }
}{code}
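The report above reproduces the error through Zeppelin. For anyone without a Zeppelin front end, the same statement can be pushed through Livy's REST API, mirroring the curl calls shown for LIVY-504 below. This is a minimal sketch, not part of the original report; it assumes a local Livy server on the default port 8998, and uses only the /sessions and /statements endpoints documented in the REST API.
{code:python}
import time
import requests

LIVY = "http://localhost:8998"  # assumed local Livy server, default port

# Start a SparkR session (same /sessions endpoint as the curl examples below).
session = requests.post(LIVY + "/sessions", json={"kind": "sparkr"}).json()
session_url = "{0}/sessions/{1}".format(LIVY, session["id"])

# Wait for the session to leave the "starting" state.
while requests.get(session_url).json()["state"] != "idle":
    time.sleep(5)

# Submit the statement from the report.
stmt = requests.post(session_url + "/statements",
                     json={"code": "sparkR.session()"}).json()
stmt_url = "{0}/statements/{1}".format(session_url, stmt["id"])

# Poll until the statement finishes; on an affected Spark 2.3 cluster the output
# is expected to contain: Error in writeJobj(con, object): invalid jobj 1
while True:
    result = requests.get(stmt_url).json()
    if result["state"] in ("available", "error", "cancelled"):
        print(result["output"])
        break
    time.sleep(2)
{code}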
[jira] [Updated] (LIVY-504) Livy pyspark sqlContext behavior does not match pyspark shell
[ https://issues.apache.org/jira/browse/LIVY-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Bronte updated LIVY-504:
-----------------------------
    Summary: Livy pyspark sqlContext behavior does not match pyspark shell  (was: Pyspark sqlContext behavior does not match pyspark shell)

> Livy pyspark sqlContext behavior does not match pyspark shell
> --------------------------------------------------------------
>
>                 Key: LIVY-504
>                 URL: https://issues.apache.org/jira/browse/LIVY-504
>             Project: Livy
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.5.0
>         Environment: AWS EMR 5.16.0
>            Reporter: Adam Bronte
>            Priority: Major
>
> On 0.5.0 I'm seeing inconsistent behavior through Livy regarding the spark context and sqlContext compared to the pyspark shell.
> For example running this through the pyspark shell works:
> {code:java}
> [root@ip-10-0-0-32 ~]# pyspark
> Python 2.7.14 (default, May 2 2018, 18:31:34)
> [GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> 18/08/28 18:50:37 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 2.3.1
>       /_/
>
> Using Python version 2.7.14 (default, May 2 2018 18:31:34)
> SparkSession available as 'spark'.
> >>> from pyspark.sql import SQLContext
> >>> my_sql_context = SQLContext.getOrCreate(sc)
> >>> df = my_sql_context.read.parquet('s3://my-bucket/mydata.parquet')
> >>> print(df.count())
> 67556724
> {code}
> But through Livy, the same code throws an exception
> {code:java}
> from pyspark.sql import SQLContext
> my_sql_context = SQLContext.getOrCreate(sc)
> df = my_sql_context.read.parquet('s3://my-bucket/mydata.parquet')
> 'JavaMember' object has no attribute 'read'
> Traceback (most recent call last):
>   File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/context.py", line 433, in read
>     return DataFrameReader(self)
>   File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 70, in __init__
>     self._jreader = spark._ssql_ctx.read()
> AttributeError: 'JavaMember' object has no attribute 'read'{code}
> Also trying to use the default initialized sqlContext throws the same error
> {code:java}
> df = sqlContext.read.parquet('s3://my-bucket/mydata.parquet')
> 'JavaMember' object has no attribute 'read'
> Traceback (most recent call last):
>   File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/context.py", line 433, in read
>     return DataFrameReader(self)
>   File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 70, in __init__
>     self._jreader = spark._ssql_ctx.read()
> AttributeError: 'JavaMember' object has no attribute 'read'{code}
> In both the spark shell and the livy versions, the objects look the same.
> pyspark shell:
> {code:java}
> >>> print(sc)
> >>> print(sqlContext)
> >>> print(my_sql_context)
> {code}
> livy:
> {code:java}
> print(sc)
> print(sqlContext)
> print(my_sql_context)
> {code}
> I'm running this through sparkmagic but also have confirmed this is the same behavior when calling the api directly.
> {code:java}
> curl --silent -X POST --data '{"kind": "pyspark"}' -H "Content-Type: application/json" localhost:8998/sessions | python -m json.tool
> {
>     "appId": null,
>     "appInfo": {
>         "driverLogUrl": null,
>         "sparkUiUrl": null
>     },
>     "id": 3,
>     "kind": "pyspark",
>     "log": [
>         "stdout: ",
>         "\nstderr: ",
>         "\nYARN Diagnostics: "
>     ],
>     "owner": null,
>     "proxyUser": null,
>     "state": "starting"
> }
> {code}
> {code:java}
> curl --silent localhost:8998/sessions/3/statements -X POST -H 'Content-Type: application/json' -d '{"code":"df = sqlContext.read.parquet(\"s3://my-bucket/mydata.parquet\")"}' | python -m json.tool
> {
>     "code": "df = sqlContext.read.parquet(\"s3://my-bucket/mydata.parquet\")",
>     "id": 1,
>     "output": null,
>     "progress": 0.0,
>     "state": "running"
> }
> {code}
> When running on 0.4.0 both pyspark shell and livy versions worked.
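Not part of the original report: one workaround that may be worth trying while this is open. The traceback shows that the injected sqlContext's _ssql_ctx resolves to a py4j JavaMember (an unbound method reference) rather than a JVM SQLContext object, so going around the Python-side cached context and wrapping the existing SparkContext directly might avoid the broken instance. This is an untested sketch against the public pyspark 2.3 API, not a confirmed fix; the bucket path is the placeholder from the report.
{code:python}
# Untested workaround sketch: bypass the broken cached sqlContext and build a
# Python-side SparkSession directly over the existing SparkContext. `sc` is the
# SparkContext that Livy injects into the session.
from pyspark.sql import SparkSession

spark_session = SparkSession(sc)  # reuses the JVM-side session/context
df = spark_session.read.parquet('s3://my-bucket/mydata.parquet')
print(df.count())
{code}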
[jira] [Updated] (LIVY-504) Pyspark sqlContext behavior does not match pyspark shell
[ https://issues.apache.org/jira/browse/LIVY-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Bronte updated LIVY-504:
-----------------------------
    Summary: Pyspark sqlContext behavior does not match pyspark shell  (was: Pyspark sqlContext behavior does not my spark shell)

(The quoted issue details and description are identical to those in the notification above.)
[jira] [Updated] (LIVY-504) Pyspark sqlContext behavior does not my spark shell
[ https://issues.apache.org/jira/browse/LIVY-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Bronte updated LIVY-504:
-----------------------------
    Description: revised; the resulting text is the description quoted in full in the notifications above.
[jira] [Updated] (LIVY-504) Pyspark sqlContext behavior does not my spark shell
[ https://issues.apache.org/jira/browse/LIVY-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Bronte updated LIVY-504:
-----------------------------
    Description: revised (an earlier intermediate version of the description quoted in full above).
[jira] [Created] (LIVY-504) Pyspark sqlContext behavior does not my spark shell
Adam Bronte created LIVY-504:
--------------------------------

             Summary: Pyspark sqlContext behavior does not my spark shell
                 Key: LIVY-504
                 URL: https://issues.apache.org/jira/browse/LIVY-504
             Project: Livy
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.5.0
         Environment: AWS EMR 5.16.0
            Reporter: Adam Bronte

(The original description matches, apart from minor later edits, the text quoted in full in the update notification at the top of this digest.)
[jira] [Created] (LIVY-503) Move RPC classes used in thriftserver to a separate module
Marco Gaido created LIVY-503:
--------------------------------

             Summary: Move RPC classes used in thriftserver to a separate module
                 Key: LIVY-503
                 URL: https://issues.apache.org/jira/browse/LIVY-503
             Project: Livy
          Issue Type: Sub-task
            Reporter: Marco Gaido

As suggested in the discussion on the original PR (https://github.com/apache/incubator-livy/pull/104#discussion_r212806490), we should move the RPC classes that need to be uploaded to the Spark session into a separate module, so that as few classes as possible are uploaded and any unintended interaction with the created Spark session is avoided.
[jira] [Created] (LIVY-502) Cleanup Hive dependencies
Marco Gaido created LIVY-502:
--------------------------------

             Summary: Cleanup Hive dependencies
                 Key: LIVY-502
                 URL: https://issues.apache.org/jira/browse/LIVY-502
             Project: Livy
          Issue Type: Sub-task
            Reporter: Marco Gaido

In the initial implementation we rely on, and delegate some of the work to, the Hive classes used in HiveServer2. This simplified the first implementation, since it saved us from writing a lot of code, but it also introduced a dependency on the {{hive-exec}} package and required us to modify some of the existing Hive classes. This JIRA tracks removing those workarounds by re-implementing the same logic in Livy, so that all Hive dependencies other than the rpc and service layers can be dropped.