[jira] [Commented] (SPARK-13947) PySpark DataFrames: The error message from using an invalid table reference is not clear
[ https://issues.apache.org/jira/browse/SPARK-13947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15888034#comment-15888034 ]

Apache Spark commented on SPARK-13947:
--------------------------------------

User 'rberenguel' has created a pull request for this issue:
https://github.com/apache/spark/pull/17100

> PySpark DataFrames: The error message from using an invalid table reference is not clear
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-13947
>                 URL: https://issues.apache.org/jira/browse/SPARK-13947
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>            Reporter: Wes McKinney
>
> {code}
> import numpy as np
> import pandas as pd
> df = pd.DataFrame({'foo': np.random.randn(1000),
>                    'bar': np.random.randn(1000)})
> df2 = pd.DataFrame({'foo': np.random.randn(1000),
>                     'bar': np.random.randn(1000)})
> sdf = sqlContext.createDataFrame(df)
> sdf2 = sqlContext.createDataFrame(df2)
> sdf[sdf2.foo > 0]
> {code}
> Produces this error message:
> {code}
> AnalysisException: u'resolved attribute(s) foo#91 missing from bar#87,foo#88 in operator !Filter (foo#91 > cast(0 as double));'
> {code}
> It may be possible to make it more clear what the user did wrong.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
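The error quoted in the issue suggests that each column carries a unique id (foo#88 on one frame, foo#91 on the other), so a filter on sdf that references sdf2.foo cannot be resolved. The sketch below is a toy model in plain Python, not Spark's implementation and not the change proposed in the pull request; the names Attr, ToyFrame and filter_on are hypothetical. It only illustrates what a clearer message could look like: one that names the frame the column came from and suggests the likely fix.

```python
import itertools

# Toy model only: NOT Spark's analyzer. Every column gets a unique id,
# mirroring the "foo#88" / "foo#91" labels seen in the reported error.
_ids = itertools.count(1)


class Attr:
    """A column reference tagged with a unique id and the frame that owns it."""

    def __init__(self, name, owner):
        self.name = name
        self.expr_id = next(_ids)
        self.owner = owner  # label of the toy frame that created this attribute

    def __repr__(self):
        return f"{self.name}#{self.expr_id}"


class ToyFrame:
    """Hypothetical stand-in for a DataFrame: a label plus named attributes."""

    def __init__(self, label, columns):
        self.label = label
        self.attrs = {c: Attr(c, label) for c in columns}

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails (i.e. column access).
        try:
            return self.__dict__["attrs"][name]
        except KeyError:
            raise AttributeError(name)

    def filter_on(self, attr):
        own = self.attrs.get(attr.name)
        if own is not attr:
            available = ", ".join(map(repr, self.attrs.values()))
            hint = f"; did you mean {self.label}.{attr.name}?" if own is not None else ""
            raise ValueError(
                f"column {attr!r} belongs to frame '{attr.owner}', "
                f"not '{self.label}' (available: {available}){hint}"
            )
        return f"Filter({attr!r})"


sdf = ToyFrame("sdf", ["bar", "foo"])
sdf2 = ToyFrame("sdf2", ["bar", "foo"])

ok = sdf.filter_on(sdf.foo)  # same frame: resolves

message = None
try:
    sdf.filter_on(sdf2.foo)  # the mistake from the issue: column from another frame
except ValueError as exc:
    message = str(exc)
print(message)
```

In the real report, the presumably intended call would filter on the frame's own column, i.e. sdf[sdf.foo > 0], or join the two frames before filtering across them.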
[jira] [Commented] (SPARK-13947) PySpark DataFrames: The error message from using an invalid table reference is not clear
[ https://issues.apache.org/jira/browse/SPARK-13947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883688#comment-15883688 ]

Ruben Berenguel commented on SPARK-13947:
-----------------------------------------

I'll give a shot to this one as a first dive into the Spark codebase. Wish me luck :)