Bump!

On Thu, Oct 13, 2016 at 8:25 PM Efe Selcuk <efema...@gmail.com> wrote:

> I have a use case where I want to build a dataset based off of
> conditionally available data. I thought I'd do something like this:
>
> case class SomeData( ... ) // parameters are basic encodable types like
> strings and BigDecimals
>
> var data = spark.emptyDataset[SomeData]
>
> // loop, determining what data to ingest and process into datasets
>   data = data.union(someCode.thatReturnsADataset)
> // end loop
>
> However I get a runtime exception:
>
> Exception in thread "main" org.apache.spark.sql.AnalysisException:
> unresolved operator 'Union;
>         at
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:40)
>         at
> org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:58)
>         at
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:361)
>         at
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:67)
>         at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:126)
>         at
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:67)
>         at
> org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:58)
>         at
> org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
>         at org.apache.spark.sql.Dataset.<init>(Dataset.scala:161)
>         at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
>         at org.apache.spark.sql.Dataset$.apply(Dataset.scala:59)
>         at org.apache.spark.sql.Dataset.withTypedPlan(Dataset.scala:2594)
>         at org.apache.spark.sql.Dataset.union(Dataset.scala:1459)
>
> Granted, I'm new at Spark so this might be an anti-pattern, so I'm open to
> suggestions. However it doesn't seem like I'm doing anything incorrect
> here, the types are correct. Searching for this error online returns
> results seemingly about working in dataframes and having mismatching
> schemas or a different order of fields, and it seems like bugfixes have
> gone into place for those cases.
>
> Thanks in advance.
> Efe
>
>

Reply via email to