[jira] [Assigned] (SPARK-9785) HashPartitioning compatibility should consider expression ordering

2015-08-10 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-9785:
---

Assignee: Apache Spark  (was: Josh Rosen)

 HashPartitioning compatibility should consider expression ordering
 --

 Key: SPARK-9785
 URL: https://issues.apache.org/jira/browse/SPARK-9785
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.5.0
Reporter: Josh Rosen
Assignee: Apache Spark
Priority: Blocker

 HashPartitioning compatibility is defined with respect to the _set_ of 
 expressions, but in other contexts the ordering of those expressions matters. 
 This is illustrated by the following regression test:
 {code}
 test("HashPartitioning compatibility") {
   val expressions = Seq(Literal(2), Literal(3))
   // Consider two HashPartitionings that have the same _set_ of hash expressions but which are
   // created with different orderings of those expressions:
   val partitioningA = HashPartitioning(expressions, 100)
   val partitioningB = HashPartitioning(expressions.reverse, 100)
   // These partitionings are not considered equal:
   assert(partitioningA != partitioningB)
   // However, they both satisfy the same clustered distribution:
   val distribution = ClusteredDistribution(expressions)
   assert(partitioningA.satisfies(distribution))
   assert(partitioningB.satisfies(distribution))
   // Both partitionings are compatible with and guarantee each other:
   assert(partitioningA.compatibleWith(partitioningB))
   assert(partitioningB.compatibleWith(partitioningA))
   assert(partitioningA.guarantees(partitioningB))
   assert(partitioningB.guarantees(partitioningA))
   // Given all of this, we would expect these partitionings to compute the same hashcode for
   // any given row:
   def computeHashCode(partitioning: HashPartitioning): Int = {
     val hashExprProj = new InterpretedMutableProjection(partitioning.expressions, Seq.empty)
     hashExprProj.apply(InternalRow.empty).hashCode()
   }
   assert(computeHashCode(partitioningA) === computeHashCode(partitioningB))
 }
 {code}
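
 The mismatch described above can be reproduced without Spark. The following is 
 a minimal sketch using a toy model: ToyHashPartitioning, its methods, and 
 ToyDemo are hypothetical names invented for illustration and are not Spark's 
 classes or its actual implementation. It assumes compatibility is checked on 
 the set of columns while the partition id comes from an order-sensitive hash 
 of the projected values.

```scala
// Toy model only: these names are hypothetical and are not Spark's classes.
final case class ToyHashPartitioning(columns: Seq[String], numPartitions: Int) {

  // Compatibility defined on the _set_ of columns, ignoring their order.
  def compatibleWith(other: ToyHashPartitioning): Boolean =
    numPartitions == other.numPartitions && columns.toSet == other.columns.toSet

  // Project the row onto the partitioning columns, in declaration order.
  def project(row: Map[String, Int]): Seq[Int] = columns.map(row)

  // Partition id derived from an order-sensitive hash of the projected values.
  def partitionOf(row: Map[String, Int]): Int = {
    val h = project(row).hashCode()
    ((h % numPartitions) + numPartitions) % numPartitions // non-negative modulo
  }
}

object ToyDemo {
  val a = ToyHashPartitioning(Seq("x", "y"), 100)
  val b = ToyHashPartitioning(Seq("y", "x"), 100)
  val row = Map("x" -> 2, "y" -> 3)

  def main(args: Array[String]): Unit = {
    println(s"compatible:  ${a.compatibleWith(b)}")
    println(s"projections: ${a.project(row)} vs ${b.project(row)}")
    println(s"partitions:  ${a.partitionOf(row)} vs ${b.partitionOf(row)}")
  }
}
```

 In this toy model the two partitionings report themselves as compatible, yet 
 they project the same row to different value sequences, so their hashes (and 
 therefore partition ids) will generally differ. Making the compatibility check 
 compare the column sequences in order would remove the disagreement in the 
 sketch.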



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-9785) HashPartitioning compatibility should consider expression ordering

2015-08-10 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-9785:
---

Assignee: Josh Rosen  (was: Apache Spark)

 HashPartitioning compatibility should consider expression ordering
 --

 Key: SPARK-9785
 URL: https://issues.apache.org/jira/browse/SPARK-9785
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.5.0
Reporter: Josh Rosen
Assignee: Josh Rosen
Priority: Blocker
