Hi Michael, I've been working on this in my repo: https://github.com/mateiz/spark/tree/decimal. I'll make some pull requests with these features soon, but in the meantime you can try this branch. See https://github.com/mateiz/spark/compare/decimal for the individual commits that went into it. It has exactly the precision and scale support you need, plus some optimizations for working with decimals.
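
For reference, here's a rough sketch of what declaring and writing a fixed-precision decimal could look like once precision and scale are part of the type. This is only illustrative, written against the 1.x SchemaRDD API (applySchema / saveAsParquetFile); the DecimalType(10, 2) constructor is the proposed addition rather than the released API, and `sc`, the column names, and the output path are just placeholders:

// Illustrative sketch, not the final API: DecimalType(precision, scale)
// is the proposed constructor; `sc` is an existing SparkContext.
import org.apache.spark.sql._

val sqlContext = new SQLContext(sc)

// Schema whose decimal column carries an explicit precision and scale,
// so the Parquet writer can annotate it as DECIMAL(10, 2).
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("price", DecimalType(10, 2), nullable = true)
))

val rows = sc.parallelize(Seq(
  Row(1, BigDecimal("19.99")),
  Row(2, BigDecimal("5.00"))
))

// applySchema pairs an RDD[Row] with an explicit schema to form a SchemaRDD;
// saveAsParquetFile then writes it out using the Decimal converted type.
val schemaRDD = sqlContext.applySchema(rows, schema)
schemaRDD.saveAsParquetFile("/tmp/decimals.parquet")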
Matei

On Oct 12, 2014, at 1:51 PM, Michael Allman <mich...@videoamp.com> wrote:

> Hello,
>
> I'm interested in reading/writing parquet SchemaRDDs that support the Parquet
> Decimal converted type. The first thing I did was update the Spark parquet
> dependency to version 1.5.0, as this version introduced support for decimals
> in parquet. However, conversion between the catalyst decimal type and the
> parquet decimal type is complicated by the fact that the catalyst type does
> not specify a decimal precision and scale but the parquet type requires them.
>
> I'm wondering if perhaps we could add an optional precision and scale to the
> catalyst decimal type? The catalyst decimal type would have unspecified
> precision and scale by default for backwards compatibility, but users who
> want to serialize a SchemaRDD with decimal(s) to parquet would have to narrow
> their decimal type(s) by specifying a precision and scale.
>
> Thoughts?
>
> Michael