[jira] [Commented] (SPARK-12319) Address endian specific problems surfaced in 1.6

2015-12-29 Thread Tim Preece (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073801#comment-15073801
 ] 

Tim Preece commented on SPARK-12319:


Michael,
Since this JIRA's description is not quite right and involves two distinct 
problems, I have created a new JIRA 
https://issues.apache.org/jira/browse/SPARK-12555 to address the 
DatasetAggregatorSuite failure.

This is important to us since it causes an explicit build failure on our Big 
Endian platforms.

> Address endian specific problems surfaced in 1.6
> 
>
> Key: SPARK-12319
> URL: https://issues.apache.org/jira/browse/SPARK-12319
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
> Environment: Problems apparent on BE, LE could be impacted too
>Reporter: Adam Roberts
>Priority: Critical
>
> JIRA to cover endian specific problems - since testing 1.6 I've noticed 
> problems with DataFrames on BE platforms, e.g. 
> https://issues.apache.org/jira/browse/SPARK-9858
> [~joshrosen] [~yhuai]
> Current progress: using com.google.common.io.LittleEndianDataInputStream and 
> com.google.common.io.LittleEndianDataOutputStream within UnsafeRowSerializer 
> fixes three test failures in ExchangeCoordinatorSuite but I'm concerned 
> around performance/wider functional implications
> "org.apache.spark.sql.DatasetAggregatorSuite.typed aggregation: class input 
> with reordering" fails as we expect "one, 1" but instead get "one, 9" - we 
> believe the issue lies within BitSetMethods.java, specifically around: return 
> (wi << 6) + subIndex + java.lang.Long.numberOfTrailingZeros(word); 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-12319) Address endian specific problems surfaced in 1.6

2015-12-29 Thread Tim Preece (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073821#comment-15073821
 ] 

Tim Preece commented on SPARK-12319:


The remaining problem is ExchangeCoordinatorSuite. I don't have the right 
access to update the description or title.




[jira] [Commented] (SPARK-12319) Address endian specific problems surfaced in 1.6

2015-12-29 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073813#comment-15073813
 ] 

Sean Owen commented on SPARK-12319:
---

[~preece] What's the remaining problem here, then? You can edit the description 
and title to reflect it.




[jira] [Commented] (SPARK-12319) Address endian specific problems surfaced in 1.6

2015-12-28 Thread Tim Preece (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073145#comment-15073145
 ] 

Tim Preece commented on SPARK-12319:


Hi,
The failing test is already checked in. It is:
"org.apache.spark.sql.DatasetAggregatorSuite.typed aggregation: class input 
with reordering"

The test only explicitly fails on Big Endian platforms. This is because an 
integer takes an 8-byte slot in the UnsafeRow. When the data corruption occurs, 
the BE integer ends up with the wrong value. I added print statements, which 
show the data corruption on Little Endian as well; it just happens not to 
affect the value of the LE integer, since the LE integer is in the other 
4 bytes of the 8-byte slot.
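
To make the 8-byte slot argument concrete, here is a minimal sketch in plain Java. It is not Spark code: the class name and the shape of the corruption (8 added to the upper 32 bits of the whole 8-byte word) are assumptions chosen to be consistent with the "one, 9" result, and the int is assumed to be stored at the start of its slot in the platform's byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; not part of Spark.
class SlotCorruptionDemo {

    // Assumption: the stray update adds 8 to the upper 32 bits of the 8-byte word.
    static final long ASSUMED_DELTA = 8L << 32;

    static int intAfterCorruption(ByteOrder order, int value) {
        ByteBuffer slot = ByteBuffer.allocate(8).order(order);
        slot.putInt(0, value);                  // int stored at the start of the 8-byte slot
        long word = slot.getLong(0);            // whole slot read as a single long
        slot.putLong(0, word + ASSUMED_DELTA);  // stray update applied to the whole word
        return slot.getInt(0);                  // int read back from the same position
    }

    public static void main(String[] args) {
        for (int a = 1; a <= 3; a++) {
            System.out.println("a=" + a
                + " BE=" + intAfterCorruption(ByteOrder.BIG_ENDIAN, a)
                + " LE=" + intAfterCorruption(ByteOrder.LITTLE_ENDIAN, a));
        }
    }
}
```

Under these assumptions the big-endian int comes back as a + 8 while the little-endian int is unchanged, even though the same 8 bytes were modified in both cases.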




[jira] [Commented] (SPARK-12319) Address endian specific problems surfaced in 1.6

2015-12-28 Thread Michael Armbrust (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073099#comment-15073099
 ] 

Michael Armbrust commented on SPARK-12319:
--

Do you want to open a PR with your failing test case?




[jira] [Commented] (SPARK-12319) Address endian specific problems surfaced in 1.6

2015-12-22 Thread Tim Preece (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15068176#comment-15068176
 ] 

Tim Preece commented on SPARK-12319:


[~marmbrus]
Hi Michael,
I think this may be a problem with the new Dataset API, in particular the new 
"as" function of DataFrame, which I see is tagged as Experimental.

When we run the DatasetAggregatorSuite test "typed aggregation: class input 
with reordering", the implementation seems to get confused between the ordering 
of the data in the UnsafeRow (string,int) and the schema (int,string). This 
results in a test case failure that shows up on BE platforms (although the data 
is also corrupted on LE platforms).

At the moment I'm not sure how to fix this, so any pointers would be helpful.




[jira] [Commented] (SPARK-12319) Address endian specific problems surfaced in 1.6

2015-12-21 Thread Tim Preece (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066628#comment-15066628
 ] 

Tim Preece commented on SPARK-12319:


I notice that for the failing test case the schema for row1 does not match the 
actual data in row1.
Row1 has schema:
SpecificUnsafeRowJoiner schema1 
StructType(StructField(a,IntegerType,false), StructField(b,StringType,true))
But row1 has the following data (i.e. a string followed by an int):
row1 [0,180003,1,656e6f]

So why doesn't the schema match the data?

The name of the failing test may give a clue!

test("typed aggregation: class input with reordering") {
  val ds = sql("SELECT 'one' AS b, 1 as a").as[AggData]

  checkAnswer(
    ds.select(ClassInputAgg.toColumn),
    1)

  checkAnswer(
    ds.select(expr("avg(a)").as[Double], ClassInputAgg.toColumn),
    (1.0, 1))

  checkAnswer(
    ds.groupBy(_.b).agg(ClassInputAgg.toColumn),
    ("one", 1))
}




[jira] [Commented] (SPARK-12319) Address endian specific problems surfaced in 1.6

2015-12-17 Thread Tim Preece (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062175#comment-15062175
 ] 

Tim Preece commented on SPARK-12319:


Hi Sean, Yin,
I've started (and continue) to investigate this DatasetAggregatorSuite 
failure as described above.

So far I believe:
a) the description is incorrect and it has nothing to do with endianness or 
BitSetMethods.java. (It just happens that we see a failure on big-endian 
platforms; see below.)
b) the problem is probably in the codegen for UnsafeRow joins 
(GenerateUnsafeRowJoiner).

I see two UnsafeRows being joined: (string,int) + (string), which results in 
an UnsafeRow with schema (string,int,string). 

When we come to update the offsets for the variable-length data (in this case 
for the first String), the offset is miscalculated 
(in updateOffset in GenerateUnsafeRowJoiner).
This means the int value in the second field slot is wrongly changed, and on a 
BE platform (for this particular test case) it is incremented by 8. 
On an LE platform the value in the second field is also changed, but in a way 
that does not alter the value of the int. However, for both BE and LE platforms 
the first String variable looks bogus, with an invalid variable offset.
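
The offset update described above can be modelled roughly as follows. This is a simplified sketch, not the code that GenerateUnsafeRowJoiner emits: it assumes a variable-length field is described by one 8-byte word holding the offset in the upper 32 bits and the size in the lower 32 bits, and the class name, helper names and the values 24, 3 and 8 are illustrative only.

```java
// Hypothetical model of an offset-and-size word; not Spark's implementation.
class OffsetAndSizeDemo {

    // Assumed layout: upper 32 bits = offset in bytes, lower 32 bits = size in bytes.
    static long pack(int offset, int size) {
        return ((long) offset << 32) | (size & 0xFFFFFFFFL);
    }

    static int offsetOf(long word) {
        return (int) (word >>> 32);
    }

    static int sizeOf(long word) {
        return (int) word;
    }

    // Moving the variable-length data by shiftBytes only touches the upper half.
    static long shiftOffset(long word, int shiftBytes) {
        return word + ((long) shiftBytes << 32);
    }

    public static void main(String[] args) {
        // Intended use: a 3-byte string at offset 24 moves 8 bytes further along.
        long moved = shiftOffset(pack(24, 3), 8);
        System.out.println("offset=" + offsetOf(moved) + " size=" + sizeOf(moved));

        // Misapplied use: the same update hits a word that really holds the int 1.
        long hit = shiftOffset(1L, 8);
        System.out.println("upper=" + offsetOf(hit) + " lower=" + sizeOf(hit));
    }
}
```

In this model the misapplied update leaves the lower 32 bits of the word alone and puts 8 into the upper 32 bits, so whether the int appears changed depends on which half of the slot the platform reads the int from.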

I'm continuing to investigate ( and so could well revise the above ), but 
thought I would share my observations so far.

Also, it would be useful if you happened to have a pointer to any design 
documentation for UnsafeRow. For example, I wasn't sure whether all the 
variable-length data should go at the end of the row; that is, whether the 
schema for the joined row should actually have been (int,string,string).

Tim Preece







[jira] [Commented] (SPARK-12319) Address endian specific problems surfaced in 1.6

2015-12-14 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055985#comment-15055985
 ] 

Sean Owen commented on SPARK-12319:
---

Do you have any more detail here -- what specifically is the test failure and 
fix?
You're referring to bit twiddling ops in BitSetMethods, but these operators 
don't have an endian-ness.
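
As a small illustration of that point, the sketch below runs a simplified next-set-bit search built around the expression quoted in the description. It is not Spark's BitSetMethods: the class and the roundTrip helper are hypothetical, and the search works on a plain long[] rather than off-heap memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; a simplified search, not Spark's BitSetMethods.
class TrailingZerosDemo {

    // Index of the first set bit at or after fromIndex, or -1 if there is none.
    static int nextSetBit(long[] words, int fromIndex) {
        int wi = fromIndex >> 6;               // index of the 64-bit word
        if (wi >= words.length) {
            return -1;
        }
        int subIndex = fromIndex & 0x3f;       // bit position inside that word
        long word = words[wi] >>> subIndex;
        if (word != 0) {
            return (wi << 6) + subIndex + Long.numberOfTrailingZeros(word);
        }
        for (wi++; wi < words.length; wi++) {
            word = words[wi];
            if (word != 0) {
                return (wi << 6) + Long.numberOfTrailingZeros(word);
            }
        }
        return -1;
    }

    // Stores a long with the given byte order and reads it back with the same order.
    static long roundTrip(long value, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(8).order(order);
        buf.putLong(0, value);
        return buf.getLong(0);
    }

    public static void main(String[] args) {
        long bits = 1L << 5;
        long[] be = {0L, roundTrip(bits, ByteOrder.BIG_ENDIAN)};
        long[] le = {0L, roundTrip(bits, ByteOrder.LITTLE_ENDIAN)};
        System.out.println(nextSetBit(be, 0) + " " + nextSetBit(le, 0));
    }
}
```

The shifts and numberOfTrailingZeros act on the value of the long, so the result is the same whichever byte order was used to store it, provided it is read back as a long with that same order.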







[jira] [Commented] (SPARK-12319) Address endian specific problems surfaced in 1.6

2015-12-14 Thread Adam Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056040#comment-15056040
 ] 

Adam Roberts commented on SPARK-12319:
--

Hi Sean, here are the failures

ExchangeCoordinatorSuite:
- test estimatePartitionStartIndices - 1 Exchange
- test estimatePartitionStartIndices - 2 Exchanges
- test estimatePartitionStartIndices and enforce minimal number of reducers
- determining the number of reducers: aggregate operator(minNumPostShufflePartitions: 3)
- determining the number of reducers: join operator(minNumPostShufflePartitions: 3)
- determining the number of reducers: complex query 1(minNumPostShufflePartitions: 3)
- determining the number of reducers: complex query 2(minNumPostShufflePartitions: 3)
- determining the number of reducers: aggregate operator *** FAILED ***
  3 did not equal 2 (ExchangeCoordinatorSuite.scala:315)
- determining the number of reducers: join operator *** FAILED ***
  1 did not equal 2 (ExchangeCoordinatorSuite.scala:366)
- determining the number of reducers: complex query 1
- determining the number of reducers: complex query 2 *** FAILED ***
  Set(2) did not equal Set(2, 3) (ExchangeCoordinatorSuite.scala:472)

The fix is to replace the use of DataInput/OutputStreams with 
LittleEndianDataInput/OutputStream objects so that these tests pass on 
big-endian platforms.
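
For background on why the choice of stream class matters at all, here is a minimal sketch using only the JDK. It does not reproduce the ExchangeCoordinatorSuite failures and makes no claim about their root cause; it only shows that java.io.DataOutputStream always writes multi-byte values big-endian, so a reader that interprets the same bytes as little-endian sees a byte-swapped value. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; unrelated to UnsafeRowSerializer's actual code.
class StreamByteOrderDemo {

    // Serializes one int with DataOutputStream, which writes the high byte first.
    static byte[] writeInt(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Interprets the first four bytes as an int in the given byte order.
    static int readAs(byte[] data, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).getInt(0);
    }

    public static void main(String[] args) {
        byte[] data = writeInt(3);
        System.out.println("as BE: " + readAs(data, ByteOrder.BIG_ENDIAN));
        System.out.println("as LE: " + readAs(data, ByteOrder.LITTLE_ENDIAN));
    }
}
```

Read as big-endian the bytes give back 3; read as little-endian they give Integer.reverseBytes(3). Guava's LittleEndianDataOutputStream writes the low byte first instead, which is the behaviour the proposed fix relies on.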

With regard to the Dataset failure (using DataFrames behind the scenes and also 
using the Tungsten-optimised aggregate function), here's a snippet of the 
failing test output:

  == Physical Plan ==
  TungstenAggregate(key=[value#1148], functions=[(ClassInputAgg$(b#1050,a#1051),mode=Final,isDistinct=false)], output=[value#1148,ClassInputAgg$(b,a)#1162])
   TungstenExchange (HashPartitioning 5), None
    TungstenAggregate(key=[value#1148], functions=[(ClassInputAgg$(b#1050,a#1051),mode=Partial,isDistinct=false)], output=[value#1148,value#1158])
     !AppendColumns , class[a[0]: int, b[0]: string], class[value[0]: string], [value#1148]
      Project [one AS b#1050,1 AS a#1051]
       Scan OneRowRelation[]
  == Results ==
  !== Correct Answer - 1 ==   == Spark Answer - 1 ==
  ![one,1]                    [one,9] (QueryTest.scala:127)

This is for the third checkAnswer call in the reordering test:

checkAnswer(
  ds.groupBy(_.b).agg(ClassInputAgg.toColumn),
  ("one", 1))

If we change the SQL statement 

val ds = sql("SELECT 'one' AS b, 1 as a").as[AggData]

so that a is, say, 2, we get 10. With 3, we get 11, and so on.



