[jira] [Comment Edited] (SPARK-15154) LongHashedRelation fails on Big Endian platform
[ https://issues.apache.org/jira/browse/SPARK-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273885#comment-15273885 ]

Pete Robbins edited comment on SPARK-15154 at 5/6/16 11:20 AM:
---

[~davies] As you are the author of this code, can you comment on my findings?

The issue here is that the keyGenerator returns an UnsafeRow containing Int values, but the code below from LongHashedRelation.apply retrieves the key from it as a Long.

The bytes in the row are:

on Little Endian: 01 00 00 00 00 00 00 00
on Big Endian:    00 00 00 01 00 00 00 00

By chance, getInt and getLong both return "1" on Little Endian because the following 4 bytes happen to be 0, whereas on Big Endian getInt returns "1" but getLong returns "4294967296" (the same 8 bytes read as 0x0000000100000000).

{code}
    val keyGenerator = UnsafeProjection.create(key)

    // Create a mapping of key -> rows
    var numFields = 0
    while (input.hasNext) {
      val unsafeRow = input.next().asInstanceOf[UnsafeRow]
      numFields = unsafeRow.numFields()
      val rowKey = keyGenerator(unsafeRow)
      if (!rowKey.isNullAt(0)) {
        val key = rowKey.getLong(0) // <<< Values in rowKey are Ints, not Longs
        map.append(key, unsafeRow)
      }
    }
{code}

> LongHashedRelation fails on Big Endian platform
> ---
>
>                 Key: SPARK-15154
>                 URL: https://issues.apache.org/jira/browse/SPARK-15154
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>   Affects Versions: 2.0.0
>            Reporter: Pete Robbins
>              Labels: big-endian
>
> NPE in org.apache.spark.sql.execution.joins.HashedRelationSuite.LongToUnsafeRowMap
> Error Message
> java.lang.NullPointerException was thrown.
> Stacktrace
> java.lang.NullPointerException
>       at org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3$$anonfun$apply$mcV$sp$1.apply$mcVI$sp(HashedRelationSuite.scala:121)
>       at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>       at org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply$mcV$sp(HashedRelationSuite.scala:119)
>       at org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>       at org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>       at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>       at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>       at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>       at org.scalatest.Transformer.apply(Transformer.scala:22)
>       at org.scalatest.Transformer.apply(Transformer.scala:20)
>       at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
>       at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:57)
>       at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
>       at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>       at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>       at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
>       at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
>       at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
>       at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>       at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>       at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
>       at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
>       at scala.collection.immutable.List.foreach(List.scala:381)
>       at
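The byte-level effect described in the comment can be reproduced without Spark. The following is a minimal sketch, assuming an 8-byte slot in which a 4-byte Int is stored at offset 0 in the platform's byte order; it uses `java.nio.ByteBuffer` as a stand-in for the row memory, and the names `EndianKeyDemo`, `slotWithInt`, and `hex` are hypothetical helpers, not part of Spark or UnsafeRow.

```scala
import java.nio.{ByteBuffer, ByteOrder}

object EndianKeyDemo {
  // Store a 4-byte Int at the start of a zeroed 8-byte slot using the given
  // byte order. This only simulates the layout quoted in the comment.
  def slotWithInt(value: Int, order: ByteOrder): ByteBuffer = {
    val slot = ByteBuffer.allocate(8).order(order)
    slot.putInt(0, value)
    slot
  }

  // Render the 8 bytes of the slot as space-separated hex pairs.
  def hex(slot: ByteBuffer): String =
    (0 until 8).map(i => "%02x".format(slot.get(i) & 0xff)).mkString(" ")

  def main(args: Array[String]): Unit = {
    for (order <- Seq(ByteOrder.LITTLE_ENDIAN, ByteOrder.BIG_ENDIAN)) {
      val slot = slotWithInt(1, order)
      // Read the same slot back as an Int and as a Long.
      println(s"$order: bytes=${hex(slot)} getInt=${slot.getInt(0)} getLong=${slot.getLong(0)}")
    }
  }
}
```

Under these assumptions the little-endian slot reads back as 1 either way, while the big-endian slot reads as 1 through `getInt` and as 4294967296 (1 shifted left by 32 bits) through `getLong`, so a key stored as an Int and looked up as a Long would not be found.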
[jira] [Comment Edited] (SPARK-15154) LongHashedRelation fails on Big Endian platform
[ https://issues.apache.org/jira/browse/SPARK-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273897#comment-15273897 ]

Pete Robbins edited comment on SPARK-15154 at 5/6/16 11:13 AM:
---

Is this just a testcase issue, where in HashedRelationSuite

{code}
val key = Seq(BoundReference(0, IntegerType, false))
{code}

should be

{code}
val key = Seq(BoundReference(0, LongType, false))
{code}

Ans: No, it still fails with that change.

> LongHashedRelation fails on Big Endian platform
> ---
>
>                 Key: SPARK-15154
>                 URL: https://issues.apache.org/jira/browse/SPARK-15154
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>   Affects Versions: 2.0.0
>            Reporter: Pete Robbins
>              Labels: big-endian
>
> NPE in org.apache.spark.sql.execution.joins.HashedRelationSuite.LongToUnsafeRowMap
> Error Message
> java.lang.NullPointerException was thrown.
> Stacktrace
> java.lang.NullPointerException
>       at org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3$$anonfun$apply$mcV$sp$1.apply$mcVI$sp(HashedRelationSuite.scala:121)
>       at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>       at org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply$mcV$sp(HashedRelationSuite.scala:119)
>       at org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>       at org.apache.spark.sql.execution.joins.HashedRelationSuite$$anonfun$3.apply(HashedRelationSuite.scala:112)
>       at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>       at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>       at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>       at org.scalatest.Transformer.apply(Transformer.scala:22)
>       at org.scalatest.Transformer.apply(Transformer.scala:20)
>       at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
>       at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:57)
>       at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
>       at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>       at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
>       at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
>       at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
>       at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
>       at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>       at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
>       at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
>       at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
>       at scala.collection.immutable.List.foreach(List.scala:381)
>       at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
>       at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
>       at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
>       at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208)
>       at org.scalatest.FunSuite.runTests(FunSuite.scala:1555)
>       at org.scalatest.Suite$class.run(Suite.scala:1424)
>       at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555)
>       at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
>       at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
>       at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
>       at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212)
>       at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:29)
>       at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:257)
>       at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:256)
>       at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:29)
>       at org.scalatest.Suite$class.callExecuteOnSuite$1(Suite.scala:1492)
>       at org.scalatest.Suite$$anonfun$runNestedSuites$1.apply(Suite.scala:1528)
>       at org.scalatest.Suite$$anonfun$runNestedSuites$1.apply(Suite.scala:1526)
>       at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>       at
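The IntegerType versus LongType question raised in this comment comes down to reading a key with the same width it was written with. The sketch below illustrates that idea only; it is not the change made in Spark, it does not explain why the test still failed with LongType, and the names `TypedKeyRead`, `KeyType`, `IntKey`, `LongKey`, `writeKey`, and `readKey` are hypothetical. It again assumes an 8-byte slot modelled with `java.nio.ByteBuffer`.

```scala
import java.nio.{ByteBuffer, ByteOrder}

object TypedKeyRead {
  // Hypothetical stand-ins for the declared key type (IntegerType / LongType).
  sealed trait KeyType
  case object IntKey extends KeyType
  case object LongKey extends KeyType

  // Write the key into a zeroed 8-byte slot using the width of its declared type.
  def writeKey(value: Long, keyType: KeyType, order: ByteOrder): ByteBuffer = {
    val slot = ByteBuffer.allocate(8).order(order)
    keyType match {
      case IntKey  => slot.putInt(0, value.toInt)
      case LongKey => slot.putLong(0, value)
    }
    slot
  }

  // Read the key back with the same width, widening an Int to a Long in
  // registers instead of reinterpreting the 8 raw bytes.
  def readKey(slot: ByteBuffer, keyType: KeyType): Long = keyType match {
    case IntKey  => slot.getInt(0).toLong
    case LongKey => slot.getLong(0)
  }

  def main(args: Array[String]): Unit = {
    for {
      order   <- Seq(ByteOrder.LITTLE_ENDIAN, ByteOrder.BIG_ENDIAN)
      keyType <- Seq[KeyType](IntKey, LongKey)
    } {
      val key = readKey(writeKey(1L, keyType, order), keyType)
      println(s"$order $keyType -> $key")
    }
  }
}
```

In this simplified model every combination of byte order and key type reads back the key 1, because the read width always matches the write width; the mismatch described in the issue arises only when an Int-width write is paired with a Long-width read.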