[jira] [Comment Edited] (SPARK-17788) RangePartitioner results in few very large tasks and many small to empty tasks

2016-11-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696321#comment-15696321 ] Herman van Hovell edited comment on SPARK-17788 at 11/25/16 5:09 PM: --

[jira] [Commented] (SPARK-17788) RangePartitioner results in few very large tasks and many small to empty tasks

2016-11-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696321#comment-15696321 ] Herman van Hovell commented on SPARK-17788: --- That is fair. The solution is not

[jira] [Comment Edited] (SPARK-17788) RangePartitioner results in few very large tasks and many small to empty tasks

2016-11-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696134#comment-15696134 ] Herman van Hovell edited comment on SPARK-17788 at 11/25/16 4:56 PM: --

[jira] [Comment Edited] (SPARK-17788) RangePartitioner results in few very large tasks and many small to empty tasks

2016-11-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696134#comment-15696134 ] Herman van Hovell edited comment on SPARK-17788 at 11/25/16 4:10 PM: --

[jira] [Updated] (SPARK-18220) ClassCastException occurs when using select query on ORC file

2016-11-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18220: -- Description: Error message is below. {noformat} ===

[jira] [Commented] (SPARK-17788) RangePartitioner results in few very large tasks and many small to empty tasks

2016-11-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696154#comment-15696154 ] Herman van Hovell commented on SPARK-17788: --- I am closing this one as a duplica

[jira] [Closed] (SPARK-17788) RangePartitioner results in few very large tasks and many small to empty tasks

2016-11-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-17788. - Resolution: Duplicate > RangePartitioner results in few very large tasks and many small t

[jira] [Commented] (SPARK-17788) RangePartitioner results in few very large tasks and many small to empty tasks

2016-11-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696134#comment-15696134 ] Herman van Hovell commented on SPARK-17788: --- Spark makes a sketch of your data

[jira] [Created] (SPARK-18588) KafkaSourceStressForDontFailOnDataLossSuite is flaky

2016-11-25 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-18588: - Summary: KafkaSourceStressForDontFailOnDataLossSuite is flaky Key: SPARK-18588 URL: https://issues.apache.org/jira/browse/SPARK-18588 Project: Spark

[jira] [Commented] (SPARK-18588) KafkaSourceStressForDontFailOnDataLossSuite is flaky

2016-11-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15695975#comment-15695975 ] Herman van Hovell commented on SPARK-18588: --- cc [~zsxwing] > KafkaSourceStress

[jira] [Updated] (SPARK-18538) Concurrent Fetching DataFrameReader JDBC APIs Do Not Work

2016-11-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18538: -- Priority: Blocker (was: Critical) > Concurrent Fetching DataFrameReader JDBC APIs Do N

[jira] [Commented] (SPARK-17251) "ClassCastException: OuterReference cannot be cast to NamedExpression" for correlated subquery on the RHS of an IN operator

2016-11-24 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15694400#comment-15694400 ] Herman van Hovell commented on SPARK-17251: --- Yeah, go ahead. The only thing is

[jira] [Resolved] (SPARK-18578) Full outer join in correlated subquery returns incorrect results

2016-11-24 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18578. --- Resolution: Fixed Assignee: Nattavut Sutyanyong Fix Version/s: 2.1.0

[jira] [Commented] (SPARK-18134) SQL: MapType in Group BY and Joins not working

2016-11-24 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15693519#comment-15693519 ] Herman van Hovell commented on SPARK-18134: --- [~ChrisZ84] The current PR is a proof

[jira] [Commented] (SPARK-18075) UDF doesn't work on non-local spark

2016-11-24 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15693302#comment-15693302 ] Herman van Hovell commented on SPARK-18075: --- Any idea what causes it? > UDF do

[jira] [Comment Edited] (SPARK-18549) Failed to Uncache a View that References a Dropped Table.

2016-11-23 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691690#comment-15691690 ] Herman van Hovell edited comment on SPARK-18549 at 11/23/16 11:48 PM: -

[jira] [Commented] (SPARK-18549) Failed to Uncache a View that References a Dropped Table.

2016-11-23 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691690#comment-15691690 ] Herman van Hovell commented on SPARK-18549: --- That only works for the sqlContext

[jira] [Resolved] (SPARK-18557) Downgrade the memory leak warning message

2016-11-23 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18557. --- Resolution: Fixed Fix Version/s: 2.1.0 > Downgrade the memory leak warning mes

[jira] [Updated] (SPARK-18519) map type can not be used in EqualTo

2016-11-23 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18519: -- Fix Version/s: 2.0.3 > map type can not be used in EqualTo > --

[jira] [Resolved] (SPARK-18053) ARRAY equality is broken in Spark 2.0

2016-11-23 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18053. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.3 > ARRAY equali

[jira] [Updated] (SPARK-15380) Generate code that stores a float/double value in each column from ColumnarBatch when DataFrame.cache() is used

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-15380: -- Target Version/s: 2.2.0 (was: 2.1.0) > Generate code that stores a float/double value

[jira] [Updated] (SPARK-15117) Generate code that get a value in each compressed column from CachedBatch when DataFrame.cache() is called

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-15117: -- Target Version/s: 2.2.0 (was: 2.1.0) > Generate code that get a value in each compress

[jira] [Closed] (SPARK-18550) Make the queue capacity of LiveListenerBus configurable.

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-18550. - Resolution: Duplicate > Make the queue capacity of LiveListenerBus configurable. > --

[jira] [Commented] (SPARK-18394) Executing the same query twice in a row results in CodeGenerator cache misses

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15688345#comment-15688345 ] Herman van Hovell commented on SPARK-18394: --- Great! Ping me if you need any ass

[jira] [Commented] (SPARK-18394) Executing the same query twice in a row results in CodeGenerator cache misses

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15688330#comment-15688330 ] Herman van Hovell commented on SPARK-18394: --- Nice, this is a good find. I thin

[jira] [Updated] (SPARK-18394) Executing the same query twice in a row results in CodeGenerator cache misses

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18394: -- Shepherd: Herman van Hovell Target Version/s: 2.2.0 > Executing the same qu

[jira] [Updated] (SPARK-18169) Suppress warnings when dropping views on a dropped table

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18169: -- Target Version/s: 2.1.0 > Suppress warnings when dropping views on a dropped table > --

[jira] [Updated] (SPARK-18394) Executing the same query twice in a row results in CodeGenerator cache misses

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18394: -- Priority: Major (was: Minor) > Executing the same query twice in a row results in Code

[jira] [Commented] (SPARK-18394) Executing the same query twice in a row results in CodeGenerator cache misses

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15688225#comment-15688225 ] Herman van Hovell commented on SPARK-18394: --- Ok, that is fair. What strikes me

[jira] [Commented] (SPARK-18394) Executing the same query twice in a row results in CodeGenerator cache misses

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15688059#comment-15688059 ] Herman van Hovell commented on SPARK-18394: --- I am not able to reproduce this. C

[jira] [Updated] (SPARK-18394) Executing the same query twice in a row results in CodeGenerator cache misses

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18394: -- Priority: Minor (was: Major) > Executing the same query twice in a row results in Code

[jira] [Updated] (SPARK-18394) Executing the same query twice in a row results in CodeGenerator cache misses

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18394: -- Target Version/s: (was: 2.1.0) > Executing the same query twice in a row results in C

[jira] [Resolved] (SPARK-18465) Uncache Table shouldn't throw an exception when table doesn't exist

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18465. --- Resolution: Fixed Assignee: Burak Yavuz Fix Version/s: 2.1.0 > Uncach

[jira] [Commented] (SPARK-13649) Move CalendarInterval out of unsafe package

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687878#comment-15687878 ] Herman van Hovell commented on SPARK-13649: --- I have to agree with reynold. We r

[jira] [Resolved] (SPARK-18504) Scalar subquery with extra group by columns returning incorrect result

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18504. --- Resolution: Fixed Assignee: Nattavut Sutyanyong Fix Version/s: 2.1.0

[jira] [Commented] (SPARK-12469) Data Property Accumulators for Spark (formerly Consistent Accumulators)

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687587#comment-15687587 ] Herman van Hovell commented on SPARK-12469: --- [~holdenk] We are really close to

[jira] [Updated] (SPARK-17772) Add helper testing methods for instance weighting

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-17772: -- Target Version/s: 2.2.0 (was: 2.1.0) > Add helper testing methods for instance weighti

[jira] [Commented] (SPARK-17772) Add helper testing methods for instance weighting

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687385#comment-15687385 ] Herman van Hovell commented on SPARK-17772: --- [~sethah] Shall we push this to 2.

[jira] [Updated] (SPARK-17637) Packed scheduling for Spark tasks across executors

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-17637: -- Target Version/s: 2.2.0 (was: 2.1.0) > Packed scheduling for Spark tasks across execut

[jira] [Commented] (SPARK-17637) Packed scheduling for Spark tasks across executors

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687369#comment-15687369 ] Herman van Hovell commented on SPARK-17637: --- I am going to push this to 2.2. >

[jira] [Commented] (SPARK-16973) remove the buffer offsets in ImperativeAggregate

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687345#comment-15687345 ] Herman van Hovell commented on SPARK-16973: --- I am going to close this. We will

[jira] [Updated] (SPARK-17528) MutableProjection should not cache content from the input row

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-17528: -- Target Version/s: 2.2.0 (was: 2.1.0) > MutableProjection should not cache content from

[jira] [Commented] (SPARK-17528) MutableProjection should not cache content from the input row

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687351#comment-15687351 ] Herman van Hovell commented on SPARK-17528: --- [~cloud_fan] I am retargeting this

[jira] [Closed] (SPARK-16973) remove the buffer offsets in ImperativeAggregate

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-16973. - Resolution: Won't Fix > remove the buffer offsets in ImperativeAggregate > --

[jira] [Resolved] (SPARK-18519) map type can not be used in EqualTo

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18519. --- Resolution: Fixed Fix Version/s: 2.1.0 > map type can not be used in EqualTo >

[jira] [Closed] (SPARK-18358) Multiple Aggregation Using 'countDistinct' and 'first' result in error

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-18358. - Resolution: Duplicate Fix Version/s: 2.0.2 > Multiple Aggregation Using 'countDist

[jira] [Commented] (SPARK-18358) Multiple Aggregation Using 'countDistinct' and 'first' result in error

2016-11-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687141#comment-15687141 ] Herman van Hovell commented on SPARK-18358: --- This has been fixed in Spark 2.0.2

[jira] [Commented] (SPARK-18403) ObjectHashAggregateSuite is being flaky (occasional OOM errors)

2016-11-21 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15684928#comment-15684928 ] Herman van Hovell commented on SPARK-18403: --- The 5a5a5a5a5a5a means that the pa

[jira] [Commented] (SPARK-18532) Code generation memory issue

2016-11-21 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15684908#comment-15684908 ] Herman van Hovell commented on SPARK-18532: --- The code generated by whole stage

[jira] [Resolved] (SPARK-18398) Fix nullabilities of MapObjects and optimize not to check null if lambda is not nullable.

2016-11-21 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18398. --- Resolution: Fixed Assignee: Takuya Ueshin Fix Version/s: 2.1.0 > Fix

[jira] [Updated] (SPARK-17732) ALTER TABLE DROP PARTITION should support comparators

2016-11-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-17732: -- Fix Version/s: (was: 2.1.0) 2.2.0 > ALTER TABLE DROP PARTITION s

[jira] [Commented] (SPARK-18515) AlterTableDropPartitions fails for non-string columns

2016-11-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15681605#comment-15681605 ] Herman van Hovell commented on SPARK-18515: --- The Analyzer is injecting casts be

[jira] [Commented] (SPARK-18515) AlterTableDropPartitions fails for non-string columns

2016-11-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15681435#comment-15681435 ] Herman van Hovell commented on SPARK-18515: --- [~dongjoon] I am reverting this fr

[jira] [Created] (SPARK-18515) AlterTableDropPartitions fails for non-string columns

2016-11-20 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-18515: - Summary: AlterTableDropPartitions fails for non-string columns Key: SPARK-18515 URL: https://issues.apache.org/jira/browse/SPARK-18515 Project: Spark

[jira] [Resolved] (SPARK-16998) select($"column1", explode($"column2")) is extremely slow

2016-11-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-16998. --- Resolution: Fixed Assignee: Herman van Hovell Fix Version/s: 2.2.0 >

[jira] [Commented] (SPARK-18504) Scalar subquery with extra group by columns returning incorrect result

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677849#comment-15677849 ] Herman van Hovell commented on SPARK-18504: --- Could you open a PR? > Scalar sub

[jira] [Commented] (SPARK-18504) Scalar subquery with extra group by columns returning incorrect result

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677848#comment-15677848 ] Herman van Hovell commented on SPARK-18504: --- Is this a valid correlated scalar

[jira] [Closed] (SPARK-11785) When deployed against remote Hive metastore with lower versions, JDBC metadata calls throws exception

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-11785. - Resolution: Fixed Fix Version/s: 2.1.0 > When deployed against remote Hive metasto

[jira] [Commented] (SPARK-18134) SQL: MapType in Group BY and Joins not working

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677266#comment-15677266 ] Herman van Hovell commented on SPARK-18134: --- There is not a political reason fo

[jira] [Commented] (SPARK-18134) SQL: MapType in Group BY and Joins not working

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677252#comment-15677252 ] Herman van Hovell commented on SPARK-18134: --- In both cases you could use sorted

[jira] [Commented] (SPARK-18249) StackOverflowError when saving dataset to parquet

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677115#comment-15677115 ] Herman van Hovell commented on SPARK-18249: --- The good news is that this is not

[jira] [Closed] (SPARK-17450) spark sql rownumber OOM

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-17450. - Resolution: Not A Problem > spark sql rownumber OOM > --- > >

[jira] [Commented] (SPARK-18004) DataFrame filter Predicate push-down fails for Oracle Timestamp type columns

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673904#comment-15673904 ] Herman van Hovell commented on SPARK-18004: --- which format should be passed to o

[jira] [Commented] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673884#comment-15673884 ] Herman van Hovell commented on SPARK-18489: --- Why is a UDF a problem? This will

[jira] [Commented] (SPARK-18491) Spark uses mutable classes for date/time types mapping

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673780#comment-15673780 ] Herman van Hovell commented on SPARK-18491: --- That is true. It is currently not tr

[jira] [Commented] (SPARK-18491) Spark uses mutable classes for date/time types mapping

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673701#comment-15673701 ] Herman van Hovell commented on SPARK-18491: --- Good library/API is also all about

[jira] [Commented] (SPARK-18491) Spark uses mutable classes for date/time types mapping

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673693#comment-15673693 ] Herman van Hovell commented on SPARK-18491: --- The only problem you can possibly

[jira] [Commented] (SPARK-18491) Spark uses mutable classes for date/time types mapping

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673639#comment-15673639 ] Herman van Hovell commented on SPARK-18491: --- There is absolutely no chance o

[jira] [Commented] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673541#comment-15673541 ] Herman van Hovell commented on SPARK-18489: --- It should handle all implicit stri

[jira] [Comment Edited] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673528#comment-15673528 ] Herman van Hovell edited comment on SPARK-18489 at 11/17/16 12:03 PM: -

[jira] [Updated] (SPARK-18490) duplicate nodename extrainfo of ShuffleExchange

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18490: -- Description: {noformat} override def nodeName: String = { val extraInfo = coordin

[jira] [Closed] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-18489. - Resolution: Duplicate > Implicit type conversion during comparision between Integer type

[jira] [Commented] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673509#comment-15673509 ] Herman van Hovell commented on SPARK-18489: --- In these cases we cast everything t

[jira] [Commented] (SPARK-18490) duplicate nodename extrainfo of ShuffleExchange

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673511#comment-15673511 ] Herman van Hovell commented on SPARK-18490: --- Why is this a bug? > duplicate no

[jira] [Updated] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18489: -- Description: Suppose I have a dataframe with schema: {noformat} root |-- _c0: integer

[jira] [Closed] (SPARK-18473) Correctness issue in INNER join result with window functions

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-18473. - Resolution: Fixed Assignee: Xiao Li Fixed by gatorsmile's PR for SPARK-17981/SPARK-

[jira] [Commented] (SPARK-18473) Correctness issue in INNER join result with window functions

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15672223#comment-15672223 ] Herman van Hovell commented on SPARK-18473: --- This is probably caused by SPARK-1

[jira] [Commented] (SPARK-18473) Correctness issue in INNER join result with window functions

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671609#comment-15671609 ] Herman van Hovell commented on SPARK-18473: --- This has been fixed in spark 2.0.2

[jira] [Comment Edited] (SPARK-18473) Correctness issue in INNER join result with window functions

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671609#comment-15671609 ] Herman van Hovell edited comment on SPARK-18473 at 11/16/16 8:59 PM: --

[jira] [Closed] (SPARK-16795) Spark's HiveThriftServer should be able to use multiple sqlContexts

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-16795. - Resolution: Duplicate > Spark's HiveThriftServer should be able to use multiple sqlContex

[jira] [Commented] (SPARK-16795) Spark's HiveThriftServer should be able to use multiple sqlContexts

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671462#comment-15671462 ] Herman van Hovell commented on SPARK-16795: --- Spark uses one Hive client per spa

[jira] [Resolved] (SPARK-16865) A file-based end-to-end SQL query suite

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-16865. --- Resolution: Fixed Assignee: Peter Lee Fix Version/s: 2.0.1

[jira] [Updated] (SPARK-16951) Alternative implementation of NOT IN to Anti-join

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-16951: -- Issue Type: Sub-task (was: Improvement) Parent: SPARK-18455 > Alternative impl

[jira] [Resolved] (SPARK-17268) Break Optimizer.scala apart

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-17268. --- Resolution: Fixed Fix Version/s: 2.1.0 > Break Optimizer.scala apart > ---

[jira] [Commented] (SPARK-17450) spark sql rownumber OOM

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671411#comment-15671411 ] Herman van Hovell commented on SPARK-17450: --- [~cenyuhai] did you have any luck

[jira] [Closed] (SPARK-17662) Dedup UDAF

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-17662. - Resolution: Not A Problem > Dedup UDAF > -- > > Key: SPARK-17662

[jira] [Commented] (SPARK-17662) Dedup UDAF

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671401#comment-15671401 ] Herman van Hovell commented on SPARK-17662: --- This is more of a question for the

[jira] [Commented] (SPARK-18172) AnalysisException in first/last during aggregation

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671361#comment-15671361 ] Herman van Hovell commented on SPARK-18172: --- This is different from SPARK-18300

[jira] [Resolved] (SPARK-18172) AnalysisException in first/last during aggregation

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18172. --- Resolution: Fixed Fix Version/s: 2.0.2 Target Version/s: (was: 2.1.

[jira] [Updated] (SPARK-17786) [SPARK 2.0] Sorting algorithm gives higher skewness of output

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-17786: -- Target Version/s: 2.1.0 > [SPARK 2.0] Sorting algorithm gives higher skewness of output

[jira] [Updated] (SPARK-17788) RangePartitioner results in few very large tasks and many small to empty tasks

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-17788: -- Target Version/s: 2.1.0 > RangePartitioner results in few very large tasks and many sma

[jira] [Commented] (SPARK-17932) Failed to run SQL "show table extended like table_name" in Spark2.0.0

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671331#comment-15671331 ] Herman van Hovell commented on SPARK-17932: --- This is currently not implemented

[jira] [Updated] (SPARK-17897) not isnotnull is converted to the always false condition isnotnull && not isnotnull

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-17897: -- Target Version/s: 2.1.0 > not isnotnull is converted to the always false condition isno

[jira] [Comment Edited] (SPARK-17977) DataFrameReader and DataStreamReader should have an ancestor class

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671316#comment-15671316 ] Herman van Hovell edited comment on SPARK-17977 at 11/16/16 7:07 PM: --

[jira] [Commented] (SPARK-17977) DataFrameReader and DataStreamReader should have an ancestor class

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671316#comment-15671316 ] Herman van Hovell commented on SPARK-17977: --- [~aassudani] want to open a PR for

[jira] [Commented] (SPARK-18458) core dumped running Spark SQL on large data volume (100TB)

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671314#comment-15671314 ] Herman van Hovell commented on SPARK-18458: --- Nice find! > core dumped running

[jira] [Resolved] (SPARK-7712) Window Function Improvements

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-7712. -- Resolution: Fixed Fix Version/s: 2.0.0 > Window Function Improvements > -

[jira] [Commented] (SPARK-18098) Broadcast creates 1 instance / core, not 1 instance / executor

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671049#comment-15671049 ] Herman van Hovell commented on SPARK-18098: --- I think this is caused by how you

[jira] [Updated] (SPARK-18118) SpecificSafeProjection.apply of Java Object from Dataset to JavaRDD Grows Beyond 64 KB

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18118: -- Target Version/s: 2.1.0 > SpecificSafeProjection.apply of Java Object from Dataset to J

[jira] [Commented] (SPARK-18165) Kinesis support in Structured Streaming

2016-11-16 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671010#comment-15671010 ] Herman van Hovell commented on SPARK-18165: --- [~maropu] wrote something for this
