[jira] [Comment Edited] (SPARK-15154) LongHashedRelation fails on Big Endian platform

2016-05-06 Thread Pete Robbins (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273885#comment-15273885
 ] 

Pete Robbins edited comment on SPARK-15154 at 5/6/16 11:20 AM:
---

[~davies] as you are the author of this code can you comment on my findings?

So the issue here is that the keyGenerator returns an UnsafeRow containing Int 
values but the code below from LongHashedRelation.apply retrieves the key from 
this as a Long. The bytes in the row are

on Little Endian: 01 00 00 00 00 00 00 00 
on Big Endian:   00 00 00 01 00 00 00 00

By chance getInt and getLong  will both return "1" on Little Endian because the 
following 4 bytes happen to be 0, whereas on Big Endian getInt returns "1" but 
get Long will return "268435456"

{code}
val keyGenerator = UnsafeProjection.create(key)

// Create a mapping of key -> rows
var numFields = 0
while (input.hasNext) {
  val unsafeRow = input.next().asInstanceOf[UnsafeRow]
  numFields = unsafeRow.numFields()
  val rowKey = keyGenerator(unsafeRow)
  if (!rowKey.isNullAt(0)) {
val key = rowKey.getLong(0) // <<< Values in rowKey are Ints not 
Longs
map.append(key, unsafeRow)
  }
}
{code}



was (Author: robbinspg):
[~davies] as you are the author of this code can you comment on my findings?

So the issue here is that the keyGenerator returns an UnsafeRow containing Int 
values but the code below from LongHashedRelation.apply retrieves the key from 
this as a Long. The bytes in the row are

on Little Endian: 01 00 00 00 00 00 00 00 
on Big Endian:   00 00 00 01 00 00 00 00

By chance getInt and getLong  will both return "1" on Little Endian because the 
following 4 bytes happen to be 0, whereas on Big Endian getInt returns "1" but 
get Long will return "268435456"

{code}
val keyGenerator = UnsafeProjection.create(key)

// Create a mapping of key -> rows
var numFields = 0
while (input.hasNext) {
  val unsafeRow = input.next().asInstanceOf[UnsafeRow]
  numFields = unsafeRow.numFields()
  val rowKey = keyGenerator(unsafeRow)
  if (!rowKey.isNullAt(0)) {
val key = rowKey.getLong(0) // <<< Values in rowKey are Intsnot 
Longs
map.append(key, unsafeRow)
  }
}
{code}


> LongHashedRelation fails on Big Endian platform
> ---
>
> Key: SPARK-15154
> URL: https://issues.apache.org/jira/browse/SPARK-15154
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0
>Reporter: Pete Robbins
>  Labels: big-endian
>
> NPE in 
> org.apache.spark.sql.execution.joins.HashedRelationSuite.LongToUnsafeRowMap
> Error Message
> java.lang.NullPointerException was thrown.
> Stacktrace
>   java.lang.NullPointerException
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3$$anonfun$apply$mcV$sp$1.apply$mcVI$sp(HashedRelationSuite.scala:121)
>   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply$mcV$sp(HashedRelationSuite.scala:119)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:57)
>   at 
> org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
>   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
>   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at 

[jira] [Comment Edited] (SPARK-15154) LongHashedRelation fails on Big Endian platform

2016-05-06 Thread Pete Robbins (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273885#comment-15273885
 ] 

Pete Robbins edited comment on SPARK-15154 at 5/6/16 11:14 AM:
---

[~davies] as you are the author of this code can you comment on my findings?

So the issue here is that the keyGenerator returns an UnsafeRow containing Int 
values but the code below from LongHashedRelation.apply retrieves the key from 
this as a Long. The bytes in the row are

on Little Endian: 01 00 00 00 00 00 00 00 
on Big Endian:   00 00 00 01 00 00 00 00

By chance getInt and getLong  will both return "1" on Little Endian because the 
following 4 bytes happen to be 0, whereas on Big Endian getInt returns "1" but 
get Long will return "268435456"

{code}
val keyGenerator = UnsafeProjection.create(key)

// Create a mapping of key -> rows
var numFields = 0
while (input.hasNext) {
  val unsafeRow = input.next().asInstanceOf[UnsafeRow]
  numFields = unsafeRow.numFields()
  val rowKey = keyGenerator(unsafeRow)
  if (!rowKey.isNullAt(0)) {
val key = rowKey.getLong(0) // <<< Values in rowKey are Intsnot 
Longs
map.append(key, unsafeRow)
  }
}
{code}



was (Author: robbinspg):
[~davies] as you are the author of this code can you comment on my findings?

So the issue here is that the keyGenerator returns an UnsafeRow containing Int 
values but the code below from LongHashedRelation.apply retrieves the key from 
this as a Long. The bytes in the row are

on Little Endian: 01 00 00 00 00 00 00 00 
on Big Endian:   00 00 00 01 00 00 00 00

By chance getInt and getLong  will both return "1" on Little Endian whereas on 
Big Endian getInt returns "1" but get Long will return "268435456"

{code}
val keyGenerator = UnsafeProjection.create(key)

// Create a mapping of key -> rows
var numFields = 0
while (input.hasNext) {
  val unsafeRow = input.next().asInstanceOf[UnsafeRow]
  numFields = unsafeRow.numFields()
  val rowKey = keyGenerator(unsafeRow)
  if (!rowKey.isNullAt(0)) {
val key = rowKey.getLong(0) // <<< Values in rowKey are Intsnot 
Longs
map.append(key, unsafeRow)
  }
}
{code}


> LongHashedRelation fails on Big Endian platform
> ---
>
> Key: SPARK-15154
> URL: https://issues.apache.org/jira/browse/SPARK-15154
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0
>Reporter: Pete Robbins
>  Labels: big-endian
>
> NPE in 
> org.apache.spark.sql.execution.joins.HashedRelationSuite.LongToUnsafeRowMap
> Error Message
> java.lang.NullPointerException was thrown.
> Stacktrace
>   java.lang.NullPointerException
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3$$anonfun$apply$mcV$sp$1.apply$mcVI$sp(HashedRelationSuite.scala:121)
>   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply$mcV$sp(HashedRelationSuite.scala:119)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:57)
>   at 
> org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
>   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
>   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at 

[jira] [Comment Edited] (SPARK-15154) LongHashedRelation fails on Big Endian platform

2016-05-06 Thread Pete Robbins (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273897#comment-15273897
 ] 

Pete Robbins edited comment on SPARK-15154 at 5/6/16 11:13 AM:
---

Is this just a testcase issue where in HashedRelationSuite


{code}
val key = Seq(BoundReference(0, IntegerType, false))
{code}

should be

{code}
val key = Seq(BoundReference(0, LongType, false))
{code}


Ans: No, still fails with that change.


was (Author: robbinspg):
Is this just a testcase issue where in HashedRelationSuite


{code}
val key = Seq(BoundReference(0, IntegerType, false))
{code}

should be

{code}
val key = Seq(BoundReference(0, LongType, false))
{code}

> LongHashedRelation fails on Big Endian platform
> ---
>
> Key: SPARK-15154
> URL: https://issues.apache.org/jira/browse/SPARK-15154
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0
>Reporter: Pete Robbins
>  Labels: big-endian
>
> NPE in 
> org.apache.spark.sql.execution.joins.HashedRelationSuite.LongToUnsafeRowMap
> Error Message
> java.lang.NullPointerException was thrown.
> Stacktrace
>   java.lang.NullPointerException
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3$$anonfun$apply$mcV$sp$1.apply$mcVI$sp(HashedRelationSuite.scala:121)
>   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply$mcV$sp(HashedRelationSuite.scala:119)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:57)
>   at 
> org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
>   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
>   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
>   at 
> org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
>   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
>   at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208)
>   at org.scalatest.FunSuite.runTests(FunSuite.scala:1555)
>   at org.scalatest.Suite$class.run(Suite.scala:1424)
>   at 
> org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
>   at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
>   at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212)
>   at 
> org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:29)
>   at 
> org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:257)
>   at 
> org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:256)
>   at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:29)
>   at org.scalatest.Suite$class.callExecuteOnSuite$1(Suite.scala:1492)
>   at 
> org.scalatest.Suite$$anonfun$runNestedSuites$1.apply(Suite.scala:1528)
>   at 
> org.scalatest.Suite$$anonfun$runNestedSuites$1.apply(Suite.scala:1526)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at 

[jira] [Comment Edited] (SPARK-15154) LongHashedRelation fails on Big Endian platform

2016-05-06 Thread Pete Robbins (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273885#comment-15273885
 ] 

Pete Robbins edited comment on SPARK-15154 at 5/6/16 10:28 AM:
---

[~davies] as you are the author of this code can you comment on my findings?

So the issue here is that the keyGenerator returns an UnsafeRow containing Int 
values but the code below from LongHashedRelation.apply retrieves the key from 
this as a Long. The bytes in the row are

on Little Endian: 01 00 00 00 00 00 00 00 
on Big Endian:   00 00 00 01 00 00 00 00

By chance getInt and getLong  will both return "1" on Little Endian whereas on 
Big Endian getInt returns "1" but get Long will return "268435456"

{code}
val keyGenerator = UnsafeProjection.create(key)

// Create a mapping of key -> rows
var numFields = 0
while (input.hasNext) {
  val unsafeRow = input.next().asInstanceOf[UnsafeRow]
  numFields = unsafeRow.numFields()
  val rowKey = keyGenerator(unsafeRow)
  if (!rowKey.isNullAt(0)) {
val key = rowKey.getLong(0) // <<< Values in rowKey are Intsnot 
Longs
map.append(key, unsafeRow)
  }
}
{code}



was (Author: robbinspg):
[~davies] as you are the author of this code can you comment on my findings?

So the issue here is that the keyGenerator returns an UnsafeRow containing Int 
values but the code below from LongHashedRelation.apply retrieves the key from 
this as a Long. The bytes in the row are

on Little Endian: 01 00 00 00 00 00 00 00 
on Big Endian:   00 00 00 01 00 00 00 00

By chance getInt and getLong  will both return "1" on Little Endian whereas on 
Big Endian getInt returns "1" but get Long will return "268435456"

{quote}
val keyGenerator = UnsafeProjection.create(key)

// Create a mapping of key -> rows
var numFields = 0
while (input.hasNext) {
  val unsafeRow = input.next().asInstanceOf[UnsafeRow]
  numFields = unsafeRow.numFields()
  val rowKey = keyGenerator(unsafeRow)
  if (!rowKey.isNullAt(0)) {
val key = rowKey.getLong(0) // <<< Values in rowKey are Int not Long

map.append(key, unsafeRow)
  }
}
{quote}


> LongHashedRelation fails on Big Endian platform
> ---
>
> Key: SPARK-15154
> URL: https://issues.apache.org/jira/browse/SPARK-15154
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0
>Reporter: Pete Robbins
>  Labels: big-endian
>
> NPE in 
> org.apache.spark.sql.execution.joins.HashedRelationSuite.LongToUnsafeRowMap
> Error Message
> java.lang.NullPointerException was thrown.
> Stacktrace
>   java.lang.NullPointerException
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3$$anonfun$apply$mcV$sp$1.apply$mcVI$sp(HashedRelationSuite.scala:121)
>   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply$mcV$sp(HashedRelationSuite.scala:119)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:57)
>   at 
> org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
>   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
>   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
>   at 
> 

[jira] [Comment Edited] (SPARK-15154) LongHashedRelation fails on Big Endian platform

2016-05-06 Thread Pete Robbins (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273885#comment-15273885
 ] 

Pete Robbins edited comment on SPARK-15154 at 5/6/16 10:27 AM:
---

[~davies] as you are the author of this code can you comment on my findings?

So the issue here is that the keyGenerator returns an UnsafeRow containing Int 
values but the code below from LongHashedRelation.apply retrieves the key from 
this as a Long. The bytes in the row are

on Little Endian: 01 00 00 00 00 00 00 00 
on Big Endian:   00 00 00 01 00 00 00 00

By chance getInt and getLong  will both return "1" on Little Endian whereas on 
Big Endian getInt returns "1" but get Long will return "268435456"

{quote}
val keyGenerator = UnsafeProjection.create(key)

// Create a mapping of key -> rows
var numFields = 0
while (input.hasNext) {
  val unsafeRow = input.next().asInstanceOf[UnsafeRow]
  numFields = unsafeRow.numFields()
  val rowKey = keyGenerator(unsafeRow)
  if (!rowKey.isNullAt(0)) {
val key = rowKey.getLong(0) // <<< Values in rowKey are Int not Long

map.append(key, unsafeRow)
  }
}
{quote}



was (Author: robbinspg):
[~davies] as you are the author of this code can you comment on my findings?

So the issue here is that the keyGenerator returns an UnsafeRow containing Int 
values but the code below from LongHashedRelation.apply retrieves the key from 
this as a Long. The bytes in the row are

on Little Endian: 01 00 00 00 00 00 00 00 
on Big Endian:   00 00 00 01 00 00 00 00

By chance getInt and getLong  will both return "1" on Little Endian whereas on 
Big Endian getInt returns "1" but get Long will return "268435456"


```
val keyGenerator = UnsafeProjection.create(key)

// Create a mapping of key -> rows
var numFields = 0
while (input.hasNext) {
  val unsafeRow = input.next().asInstanceOf[UnsafeRow]
  numFields = unsafeRow.numFields()
  val rowKey = keyGenerator(unsafeRow)
  if (!rowKey.isNullAt(0)) {
val key = rowKey.getLong(0) // <<< Values in rowKey are Int not Long

map.append(key, unsafeRow)
  }
}
```


> LongHashedRelation fails on Big Endian platform
> ---
>
> Key: SPARK-15154
> URL: https://issues.apache.org/jira/browse/SPARK-15154
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0
>Reporter: Pete Robbins
>  Labels: big-endian
>
> NPE in 
> org.apache.spark.sql.execution.joins.HashedRelationSuite.LongToUnsafeRowMap
> Error Message
> java.lang.NullPointerException was thrown.
> Stacktrace
>   java.lang.NullPointerException
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3$$anonfun$apply$mcV$sp$1.apply$mcVI$sp(HashedRelationSuite.scala:121)
>   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply$mcV$sp(HashedRelationSuite.scala:119)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:57)
>   at 
> org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
>   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
>   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
>   at 
> 

[jira] [Comment Edited] (SPARK-15154) LongHashedRelation fails on Big Endian platform

2016-05-06 Thread Pete Robbins (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273885#comment-15273885
 ] 

Pete Robbins edited comment on SPARK-15154 at 5/6/16 10:24 AM:
---

[~davies] as you are the author of this code can you comment on my findings?

So the issue here is that the keyGenerator returns an UnsafeRow containing Int 
values but the code below from LongHashedRelation.apply retrieves the key from 
this as a Long. The bytes in the row are

on Little Endian: 01 00 00 00 00 00 00 00 
on Big Endian:   00 00 00 01 00 00 00 00

By chance getInt and getLong  will both return "1" on Little Endian whereas on 
Big Endian getInt returns "1" but get Long will return "268435456"



val keyGenerator = UnsafeProjection.create(key)

// Create a mapping of key -> rows
var numFields = 0
while (input.hasNext) {
  val unsafeRow = input.next().asInstanceOf[UnsafeRow]
  numFields = unsafeRow.numFields()
  val rowKey = keyGenerator(unsafeRow)
  if (!rowKey.isNullAt(0)) {
val key = rowKey.getLong(0) // <<< Values in rowKey are Int not Long

map.append(key, unsafeRow)
  }
}




was (Author: robbinspg):
[~davies] as you are the author of this code can you comment on my findings?

So the issue here is that the keyGenerator returns an UnsafeRow containing Int 
values but the code below from LongHashedRelation.apply retrieves the key from 
this as a Long. The bytes in the row are

on Little Endian: 01 00 00 00 00 00 00 00 
on Big Endian:   00 00 00 01 00 00 00 00

By chance getInt and getLong  will both return "1" on Little Endian whereas on 
Big Endian getInt returns "1" but get Long will return "268435456"



val keyGenerator = UnsafeProjection.create(key)

// Create a mapping of key -> rows
var numFields = 0
while (input.hasNext) {
  val unsafeRow = input.next().asInstanceOf[UnsafeRow]
  numFields = unsafeRow.numFields()
  val rowKey = keyGenerator(unsafeRow)
  if (!rowKey.isNullAt(0)) {
val key = rowKey.getLong(0) // <<< Values in rowKey are Int not Long
map.append(key, unsafeRow)
  }
}



> LongHashedRelation fails on Big Endian platform
> ---
>
> Key: SPARK-15154
> URL: https://issues.apache.org/jira/browse/SPARK-15154
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0
>Reporter: Pete Robbins
>  Labels: big-endian
>
> NPE in 
> org.apache.spark.sql.execution.joins.HashedRelationSuite.LongToUnsafeRowMap
> Error Message
> java.lang.NullPointerException was thrown.
> Stacktrace
>   java.lang.NullPointerException
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3$$anonfun$apply$mcV$sp$1.apply$mcVI$sp(HashedRelationSuite.scala:121)
>   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply$mcV$sp(HashedRelationSuite.scala:119)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:57)
>   at 
> org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
>   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
>   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
>   at 
> 

[jira] [Comment Edited] (SPARK-15154) LongHashedRelation fails on Big Endian platform

2016-05-06 Thread Pete Robbins (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273885#comment-15273885
 ] 

Pete Robbins edited comment on SPARK-15154 at 5/6/16 10:25 AM:
---

[~davies] as you are the author of this code can you comment on my findings?

So the issue here is that the keyGenerator returns an UnsafeRow containing Int 
values but the code below from LongHashedRelation.apply retrieves the key from 
this as a Long. The bytes in the row are

on Little Endian: 01 00 00 00 00 00 00 00 
on Big Endian:   00 00 00 01 00 00 00 00

By chance getInt and getLong  will both return "1" on Little Endian whereas on 
Big Endian getInt returns "1" but get Long will return "268435456"


```
val keyGenerator = UnsafeProjection.create(key)

// Create a mapping of key -> rows
var numFields = 0
while (input.hasNext) {
  val unsafeRow = input.next().asInstanceOf[UnsafeRow]
  numFields = unsafeRow.numFields()
  val rowKey = keyGenerator(unsafeRow)
  if (!rowKey.isNullAt(0)) {
val key = rowKey.getLong(0) // <<< Values in rowKey are Int not Long

map.append(key, unsafeRow)
  }
}
```



was (Author: robbinspg):
[~davies] as you are the author of this code can you comment on my findings?

So the issue here is that the keyGenerator returns an UnsafeRow containing Int 
values but the code below from LongHashedRelation.apply retrieves the key from 
this as a Long. The bytes in the row are

on Little Endian: 01 00 00 00 00 00 00 00 
on Big Endian:   00 00 00 01 00 00 00 00

By chance getInt and getLong  will both return "1" on Little Endian whereas on 
Big Endian getInt returns "1" but get Long will return "268435456"



val keyGenerator = UnsafeProjection.create(key)

// Create a mapping of key -> rows
var numFields = 0
while (input.hasNext) {
  val unsafeRow = input.next().asInstanceOf[UnsafeRow]
  numFields = unsafeRow.numFields()
  val rowKey = keyGenerator(unsafeRow)
  if (!rowKey.isNullAt(0)) {
val key = rowKey.getLong(0) // <<< Values in rowKey are Int not Long

map.append(key, unsafeRow)
  }
}



> LongHashedRelation fails on Big Endian platform
> ---
>
> Key: SPARK-15154
> URL: https://issues.apache.org/jira/browse/SPARK-15154
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0
>Reporter: Pete Robbins
>  Labels: big-endian
>
> NPE in 
> org.apache.spark.sql.execution.joins.HashedRelationSuite.LongToUnsafeRowMap
> Error Message
> java.lang.NullPointerException was thrown.
> Stacktrace
>   java.lang.NullPointerException
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3$$anonfun$apply$mcV$sp$1.apply$mcVI$sp(HashedRelationSuite.scala:121)
>   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply$mcV$sp(HashedRelationSuite.scala:119)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>   at 
> org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:57)
>   at 
> org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
>   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
>   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
>   at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
>   at 
>