[ 
https://issues.apache.org/jira/browse/SPARK-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Srinivas Rishindra Pothireddi updated SPARK-17538:
--------------------------------------------------
    Description: 
I have a production job in Spark 1.6.2 that registers several DataFrames as
tables. After testing the job in Spark 2.0.0, I found that one of the
DataFrames is not getting registered as a table.


Line 353 of my code --> self.sqlContext.registerDataFrameAsTable(anonymousDF, "anonymousTable")
Line 354 of my code --> df = self.sqlContext.sql("select AnonymousField1, AnonymousUDF(AnonymousField1) as AnonymousField3 from anonymousTable")
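
For completeness, a minimal standalone sketch of the same pattern (all names are placeholders invented for this report, not the actual production code; assumes a local Spark 2.0.0 install with pyspark importable):

# Minimal repro sketch of the failing pattern; all names are placeholders.
from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql.types import StringType

sc = SparkContext("local[*]", "SPARK-17538-repro")
sqlContext = SQLContext(sc)

# Register a one-column DataFrame as a temp table, plus a trivial SQL UDF.
anonymousDF = sqlContext.createDataFrame([("a",), ("b",)], ["AnonymousField1"])
sqlContext.registerFunction("AnonymousUDF", lambda s: s.upper(), StringType())
sqlContext.registerDataFrameAsTable(anonymousDF, "anonymousTable")

# On 1.6.2 this query runs; in the job on 2.0.0 the table lookup fails.
df = sqlContext.sql("select AnonymousField1, AnonymousUDF(AnonymousField1) as AnonymousField3 from anonymousTable")
df.show()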

My stack trace:

 File "anonymousFile.py", line 354, in anonymousMethod
    df = self.sqlContext.sql("select AnonymousField1, AnonymousUDF(AnonymousField1) as AnonymousField3 from anonymousTable")
  File "/home/anonymousUser/Downloads/spark-2.0.0-bin-hadoop2.7/python/pyspark/sql/context.py", line 350, in sql
    return self.sparkSession.sql(sqlQuery)
  File "/home/anonymousUser/Downloads/spark-2.0.0-bin-hadoop2.7/python/pyspark/sql/session.py", line 541, in sql
    return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
  File "/home/anonymousUser/Downloads/spark-2.0.0-bin-hadoop2.7/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py", line 933, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/anonymousUser/Downloads/spark-2.0.0-bin-hadoop2.7/python/pyspark/sql/utils.py", line 69, in deco
    raise AnalysisException(s.split(': ', 1)[1], stackTrace)
AnalysisException: u'Table or view not found: anonymousTable; line 1 pos 61'


The same code works perfectly fine in Spark 1.6.2.
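
Possibly relevant: Spark 2.0 deprecates DataFrame.registerTempTable in favor of DataFrame.createOrReplaceTempView. An untested workaround sketch (same placeholder names as the repro above) that registers the table through the new API instead:

# Untested workaround sketch -- createOrReplaceTempView is the Spark 2.0
# replacement for the old temp-table registration APIs.
anonymousDF.createOrReplaceTempView("anonymousTable")
df = sqlContext.sql("select AnonymousField1, AnonymousUDF(AnonymousField1) as AnonymousField3 from anonymousTable")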

 

  was:
I have a production job in Spark 1.6.2 that registers four DataFrames as
tables. After testing the job in Spark 2.0.0, one of the DataFrames is not
getting registered as a table.

The output of sqlContext.tableNames() just after registering the fourth
DataFrame in Spark 1.6.2 is

temp1,temp2,temp3,temp4

The output of sqlContext.tableNames() just after registering the fourth
DataFrame in Spark 2.0.0 is

temp1,temp2,temp3

So when the table 'temp4' is used by the job at a later stage, an
AnalysisException is raised in Spark 2.0.0.

There are no changes in the code whatsoever.
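
For reference, a sketch of the kind of check described above (placeholder DataFrames, reusing the sqlContext from the repro sketch; not the production code):

# Register four placeholder DataFrames and print the temp-table catalog
# after each registration; in the 2.0.0 job the fourth name was missing.
for i in range(1, 5):
    tempDF = sqlContext.createDataFrame([(i,)], ["value"])
    sqlContext.registerDataFrameAsTable(tempDF, "temp%d" % i)
    print(sqlContext.tableNames())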

> sqlContext.registerDataFrameAsTable is not working sometimes in pyspark 2.0.0
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-17538
>                 URL: https://issues.apache.org/jira/browse/SPARK-17538
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>         Environment: OS - Linux
> cluster -> YARN and local
>            Reporter: Srinivas Rishindra Pothireddi
>


