Spark SQL "partition stride"?

Keith Freeman Mon, 11 Jan 2016 08:32:11 -0800

The Spark docs section for "JDBC to Other Databases" (https://spark.apache.org/docs/latest/sql-programming-guide.html#jdbc-to-other-databases) describes the partitioning as "... Notice that lowerBound and upperBound are just used to decide the partition stride, not for filtering the rows in table."

What is meant by "partition stride" here? I'm not familiar with the phrase, and googling didn't help.

Also, is the behavior of this partitioning described in detail somewhere? Looking at my SQL query log, I've figured out what it's doing in my example:


(say X = (upperBound - lowerBound) / numPartitions):

  query * where partitionColumn < lowerBound

  query * where partitionColumn >= lowerBound and partitionColumn < lowerBound + X

  query * where partitionColumn >= lowerBound + X and partitionColumn < lowerBound + 2X

  ... until the query gets to upperBound

But it would be nice to know if there are docs on this.
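
To make the pattern concrete, here is a rough sketch of how range predicates like the ones in my query log could be generated from a stride. It is not Spark's actual code: the function name `partition_predicates` is made up for illustration, and it assumes integer bounds and that the first and last ranges are left open-ended so that no rows are filtered out (which is how I read the docs sentence quoted above). The exact clauses Spark emits may differ by version.

```python
def partition_predicates(column, lower_bound, upper_bound, num_partitions):
    """Build one WHERE predicate per partition (illustrative sketch only).

    Assumption: the first and last ranges are open-ended, so the bounds
    only set the stride and do not filter rows.
    """
    if num_partitions <= 1:
        return [None]  # a single query with no range predicate

    # The "stride" is the width of each range (X in the description above).
    stride = (upper_bound - lower_bound) // num_partitions
    if stride <= 0:
        return [None]  # bounds too close together to split

    predicates = []
    for i in range(num_partitions):
        lo = lower_bound + i * stride
        hi = lo + stride
        if i == 0:
            # First range: everything below the first boundary.
            predicates.append(f"{column} < {hi}")
        elif i == num_partitions - 1:
            # Last range: everything from the last boundary upward.
            predicates.append(f"{column} >= {lo}")
        else:
            predicates.append(f"{column} >= {lo} AND {column} < {hi}")
    return predicates


if __name__ == "__main__":
    for p in partition_predicates("id", 0, 100, 4):
        print("SELECT * FROM t WHERE", p)
```

With lowerBound = 0, upperBound = 100 and numPartitions = 4, the stride is 25, so rows with id below 0 or above 100 would still land in the first or last partition rather than being dropped.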


