Davies,

That seemed to be my issue; my colleague helped me resolve it. The
problem was that we build the RDD<Row> and the corresponding StructType
ourselves (no JSON, Parquet, Cassandra, etc. - we take a list of business
objects, convert them to Rows, and then infer the struct type), and I
missed one thing. A minimal sketch of the approach is below.
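
Here is that sketch (Java, Spark 1.5 API; the class, field names, and
values are made up for illustration, not our actual framework code). The
key point is that every Row must carry exactly as many values, in the
same order, as the fields declared in the StructType:

import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class ManualDataFrame {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
            new SparkConf().setAppName("manual-df").setMaster("local[*]"));
        SQLContext sqlContext = new SQLContext(sc);

        // Business objects flattened into Rows by hand. Each Row must
        // match the schema below value for value, in the same order.
        List<Row> rows = Arrays.asList(
            RowFactory.create("order-1", 42),
            RowFactory.create("order-2", 7));
        JavaRDD<Row> rowRdd = sc.parallelize(rows);

        // The struct type "inferred" for those objects.
        List<StructField> fields = Arrays.asList(
            DataTypes.createStructField("id", DataTypes.StringType, false),
            DataTypes.createStructField("quantity", DataTypes.IntegerType, false));
        StructType schema = DataTypes.createStructType(fields);

        DataFrame df = sqlContext.createDataFrame(rowRdd, schema);
        df.show();
        sc.stop();
    }
}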
--
Be well!
Jean Morozov

On Tue, Oct 6, 2015 at 1:58 AM, Davies Liu <dav...@databricks.com> wrote:

> Could you tell us a way to reproduce this failure? Reading from JSON or
> Parquet?
>
> On Mon, Oct 5, 2015 at 4:28 AM, Eugene Morozov
> <evgeny.a.moro...@gmail.com> wrote:
> > Hi,
> >
> > We're building our own framework on top of Spark, and we give users
> > pretty complex schemas to work with. That requires us to build
> > DataFrames ourselves: we transform business objects into rows and
> > struct types and use these two to create a DataFrame.
> >
> > Everything was fine until I started to upgrade to Spark 1.5.0 (from
> > 1.3.1). It seems the Catalyst engine has changed, and now, using
> > almost the same code to produce rows and struct types, I get the
> > following: http://ibin.co/2HzUsoe9O96l; some of the rows in the end
> > result have a different number of values than their corresponding
> > struct types.
> >
> > I'm almost sure it's my own fault, but there is always a small chance
> > that something is wrong in the Spark codebase. If you've seen
> > something similar, or if there is a JIRA for something similar, I'd
> > be glad to know. Thanks.
> > --
> > Be well!
> > Jean Morozov
>
