Re: Sort order in bucketing in a custom datasource

2019-04-16 Thread Jacek Laskowski
Hi, I don't think so. I can't think of an interface (trait) that would give that information to the Catalyst optimizer. Regards, Jacek Laskowski https://about.me/JacekLaskowski Mastering Spark SQL https://bit.ly/mastering-spark-sql Spark Structured Streaming https://bit.ly/spark-structure

Re: Sort order in bucketing in a custom datasource

2019-04-16 Thread Russell Spitzer
Please join the DataSource V2 meetings, since we are discussing these very topics right now; the next one is tomorrow. DataSource V1 cannot provide this information, but any source that just generates RDDs can specify a partitioner. This is only useful, though, if you are only using RDDs; for Datafram

Sort order in bucketing in a custom datasource

2019-04-16 Thread Long, Andrew
Hey Friends, Is it possible to specify the sort order or bucketing in a way that can be used by the optimizer in Spark? Cheers, Andrew