[jira] [Commented] (SPARK-28304) FileFormatWriter introduces an unconditional sort, even when all attributes are constants

2019-07-20 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16889436#comment-16889436 ] Eyal Farago commented on SPARK-28304: - [~joshrosen], thanks for your comment, I think this is a bit

[jira] [Commented] (SPARK-28304) FileFormatWriter introduces an unconditional sort, even when all attributes are constants

2019-07-08 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880402#comment-16880402 ] Eyal Farago commented on SPARK-28304: - cc [~cloud_fan],[~hvanhovell] > FileFormatWriter introduces

[jira] [Updated] (SPARK-28304) FileFormatWriter introduces an unconditional sort, even when all attributes are constants

2019-07-08 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eyal Farago updated SPARK-28304: Summary: FileFormatWriter introduces an unconditional sort, even when all attributes are constants

[jira] [Created] (SPARK-28304) FileFormatWriter introduces an unconditional join, even when all attributes are constants

2019-07-08 Thread Eyal Farago (JIRA)
Eyal Farago created SPARK-28304: --- Summary: FileFormatWriter introduces an unconditional join, even when all attributes are constants Key: SPARK-28304 URL: https://issues.apache.org/jira/browse/SPARK-28304

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2019-05-15 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840498#comment-16840498 ] Eyal Farago commented on SPARK-24437: - [~mgaido], looking at this again I suspect in this case

[jira] [Commented] (SPARK-17556) Executor side broadcast for broadcast joins

2019-03-08 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788206#comment-16788206 ] Eyal Farago commented on SPARK-17556: - why was this abandoned? [~viirya]'s pull request seems

[jira] [Commented] (SPARK-22579) BlockManager.getRemoteValues and BlockManager.getRemoteBytes should be implemented using streaming

2018-12-28 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730301#comment-16730301 ] Eyal Farago commented on SPARK-22579: - Glanced over this on my cell, seems like spark-25905 only

[jira] [Commented] (SPARK-17403) Fatal Error: Scan cached strings

2018-11-10 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682591#comment-16682591 ] Eyal Farago commented on SPARK-17403: - [~paul_lysak], [~hvanhovell], please notice the exception

[jira] [Updated] (SPARK-17403) Fatal Error: Scan cached strings

2018-11-10 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eyal Farago updated SPARK-17403: Description: The process creates views from JDBC (SQL server) source and combines them to create

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-11-08 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679840#comment-16679840 ] Eyal Farago commented on SPARK-24437: - [~dvogelbacher], I think what you actually want is somewhat

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-11-08 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679560#comment-16679560 ] Eyal Farago commented on SPARK-24437: - [~dvogelbacher], what about the _checkpoint_ approach?

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-11-05 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675719#comment-16675719 ] Eyal Farago commented on SPARK-24437: - [~dvogelbacher] this is a bit puzzling, spark will usually

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-11-05 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675347#comment-16675347 ] Eyal Farago commented on SPARK-24437: - [~mgaido], I think we agree on most of the details :) I do

[jira] [Commented] (SPARK-25548) In the PruneFileSourcePartitions optimizer, replace the nonPartitionOps field with true in the And(partitionOps, nonPartitionOps) to make the partition can be pruned

2018-11-03 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673957#comment-16673957 ] Eyal Farago commented on SPARK-25548: - [~eaton], I think there are two possible approaches to handle

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-11-03 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673949#comment-16673949 ] Eyal Farago commented on SPARK-24437: - [~dvogelbacher], haven't looked too deep into this, but

[jira] [Commented] (SPARK-24410) Missing optimization for Union on bucketed tables

2018-09-14 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614683#comment-16614683 ] Eyal Farago commented on SPARK-24410: - [~viirya], I see that the PR is now closed (postponed) due to

[jira] [Commented] (SPARK-24410) Missing optimization for Union on bucketed tables

2018-09-14 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614678#comment-16614678 ] Eyal Farago commented on SPARK-24410: - [~viirya], I've opened SPARK-25203 because of your answer on

[jira] [Commented] (SPARK-25203) spark sql, union all does not propagate child partitioning (when possible)

2018-08-22 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589356#comment-16589356 ] Eyal Farago commented on SPARK-25203: - seems I was wrong regarding the resulting distribution,

[jira] [Commented] (SPARK-25203) spark sql, union all does not propagate child partitioning (when possible)

2018-08-22 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589332#comment-16589332 ] Eyal Farago commented on SPARK-25203: - CC: [~hvanhovell], [~cloud_fan] > spark sql, union all does

[jira] [Created] (SPARK-25203) spark sql, union all does not propagate child partitioning (when possible)

2018-08-22 Thread Eyal Farago (JIRA)
Eyal Farago created SPARK-25203: --- Summary: spark sql, union all does not propagate child partitioning (when possible) Key: SPARK-25203 URL: https://issues.apache.org/jira/browse/SPARK-25203 Project:

[jira] [Commented] (SPARK-24410) Missing optimization for Union on bucketed tables

2018-08-13 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578789#comment-16578789 ] Eyal Farago commented on SPARK-24410: - [~viirya], my bad :) seems there are two distinct issues

[jira] [Commented] (SPARK-25103) CompletionIterator may delay GC of completed resources

2018-08-13 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578519#comment-16578519 ] Eyal Farago commented on SPARK-25103: - CC: [~cloud_fan], [~hvanhovell] > CompletionIterator may

[jira] [Created] (SPARK-25103) CompletionIterator may delay GC of completed resources

2018-08-13 Thread Eyal Farago (JIRA)
Eyal Farago created SPARK-25103: --- Summary: CompletionIterator may delay GC of completed resources Key: SPARK-25103 URL: https://issues.apache.org/jira/browse/SPARK-25103 Project: Spark Issue

[jira] [Commented] (SPARK-24410) Missing optimization for Union on bucketed tables

2018-08-13 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578489#comment-16578489 ] Eyal Farago commented on SPARK-24410: - [~viirya], I think your conclusion about co-partitioning is

[jira] [Commented] (SPARK-22713) OOM caused by the memory contention and memory leak in TaskMemoryManager

2018-08-13 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578322#comment-16578322 ] Eyal Farago commented on SPARK-22713: - [~jerrylead], can you please test this? > OOM caused by the

[jira] [Commented] (SPARK-22286) OutOfMemoryError caused by memory leak and large serializer batch size in ExternalAppendOnlyMap

2018-05-20 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16481961#comment-16481961 ] Eyal Farago commented on SPARK-22286: - [~jerrylead], don't you think actual root cause here is the

[jira] [Commented] (SPARK-22713) OOM caused by the memory contention and memory leak in TaskMemoryManager

2018-05-20 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16481881#comment-16481881 ] Eyal Farago commented on SPARK-22713: - [~jerrylead], what's the relation to SPARK-22286? does solving

[jira] [Commented] (SPARK-22713) OOM caused by the memory contention and memory leak in TaskMemoryManager

2018-05-18 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16481468#comment-16481468 ] Eyal Farago commented on SPARK-22713: - [~jerrylead], excellent investigation and description of the

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2018-04-26 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453625#comment-16453625 ] Eyal Farago commented on SPARK-21109: - I've also encountered this issue in two separate ways, in one

[jira] [Commented] (SPARK-19870) Repeatable deadlock on BlockInfoManager and TorrentBroadcast

2018-02-09 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358668#comment-16358668 ] Eyal Farago commented on SPARK-19870: - I'll remember to share relevant future logs Re. The exception

[jira] [Commented] (SPARK-19870) Repeatable deadlock on BlockInfoManager and TorrentBroadcast

2018-02-09 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358458#comment-16358458 ] Eyal Farago commented on SPARK-19870: - [~irashid], I'm afraid I don't have a documentation of which

[jira] [Commented] (SPARK-19870) Repeatable deadlock on BlockInfoManager and TorrentBroadcast

2018-02-08 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356992#comment-16356992 ] Eyal Farago commented on SPARK-19870: - [~irashid], attached a sample executor log > Repeatable

[jira] [Updated] (SPARK-19870) Repeatable deadlock on BlockInfoManager and TorrentBroadcast

2018-02-08 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eyal Farago updated SPARK-19870: Attachment: cs.executor.log > Repeatable deadlock on BlockInfoManager and TorrentBroadcast >

[jira] [Commented] (SPARK-19870) Repeatable deadlock on BlockInfoManager and TorrentBroadcast

2018-02-07 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16355174#comment-16355174 ] Eyal Farago commented on SPARK-19870: - [~irashid], went through executors' logs and found no errors.

[jira] [Commented] (SPARK-19870) Repeatable deadlock on BlockInfoManager and TorrentBroadcast

2018-02-06 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16353651#comment-16353651 ] Eyal Farago commented on SPARK-19870: - we've also seen something very similar (stack traces attached)

[jira] [Commented] (SPARK-18067) SortMergeJoin adds shuffle if join predicates have non partitioned columns

2018-02-05 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352594#comment-16352594 ] Eyal Farago commented on SPARK-18067: - [~tejasp], this issue + associated pull-request seems to relax

[jira] [Commented] (SPARK-21867) Support async spilling in UnsafeShuffleWriter

2017-12-12 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287461#comment-16287461 ] Eyal Farago commented on SPARK-21867: - [~ericvandenbergfb], looks good, a few questions though: 1.

[jira] [Commented] (SPARK-22579) BlockManager.getRemoteValues and BlockManager.getRemoteBytes should be implemented using streaming

2017-11-26 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16266102#comment-16266102 ] Eyal Farago commented on SPARK-22579: - [~srowen], I couldn't see a place where this data is stored

[jira] [Commented] (SPARK-22579) BlockManager.getRemoteValues and BlockManager.getRemoteBytes should be implemented using streaming

2017-11-24 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16265045#comment-16265045 ] Eyal Farago commented on SPARK-22579: - [~jerryshao], SPARK-22062 seems to solve the memory footprint

[jira] [Commented] (SPARK-22579) BlockManager.getRemoteValues and BlockManager.getRemoteBytes should be implemented using streaming

2017-11-22 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262120#comment-16262120 ] Eyal Farago commented on SPARK-22579: - CC: [~hvanhovell] (we've discussed this privately),

[jira] [Created] (SPARK-22579) BlockManager.getRemoteValues and BlockManager.getRemoteBytes should be implemented using streaming

2017-11-21 Thread Eyal Farago (JIRA)
Eyal Farago created SPARK-22579: --- Summary: BlockManager.getRemoteValues and BlockManager.getRemoteBytes should be implemented using streaming Key: SPARK-22579 URL: https://issues.apache.org/jira/browse/SPARK-22579

[jira] [Commented] (SPARK-21907) NullPointerException in UnsafeExternalSorter.spill()

2017-09-10 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160366#comment-16160366 ] Eyal Farago commented on SPARK-21907: - opened PR: https://github.com/apache/spark/pull/19181 >

[jira] [Commented] (SPARK-21907) NullPointerException in UnsafeExternalSorter.spill()

2017-09-09 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160062#comment-16160062 ] Eyal Farago commented on SPARK-21907: - [~juliuszsompolski], I've followed the stack trace you've

[jira] [Commented] (SPARK-3151) DiskStore attempts to map any size BlockId without checking MappedByteBuffer limit

2017-08-08 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118347#comment-16118347 ] Eyal Farago commented on SPARK-3151: created PR 18855: https://github.com/apache/spark/pull/18855 >

[jira] [Commented] (SPARK-3151) DiskStore attempts to map any size BlockId without checking MappedByteBuffer limit

2017-07-05 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074392#comment-16074392 ] Eyal Farago commented on SPARK-3151: I just hit this with spark 2.1 when processing a disk persisted

[jira] [Commented] (SPARK-19870) Repeatable deadlock on BlockInfoManager and TorrentBroadcast

2017-04-08 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961954#comment-15961954 ] Eyal Farago commented on SPARK-19870: - [~stevenruppert] logs from the hung executor might shed some

[jira] [Commented] (SPARK-19870) Repeatable deadlock on BlockInfoManager and TorrentBroadcast

2017-04-06 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15959169#comment-15959169 ] Eyal Farago commented on SPARK-19870: - Steven, I meant traces produced by logging. Sean, this looks

[jira] [Commented] (SPARK-19870) Repeatable deadlock on BlockInfoManager and TorrentBroadcast

2017-04-06 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958894#comment-15958894 ] Eyal Farago commented on SPARK-19870: - [~stevenruppert] can you attach traces? [~joshrosen], a glance

[jira] [Commented] (SPARK-18736) CreateMap allows non-unique keys

2017-02-02 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851078#comment-15851078 ] Eyal Farago commented on SPARK-18736: - Spark-8601 is making (slow) progress, it's actually in final

[jira] [Commented] (SPARK-18736) CreateMap allows non-unique keys

2016-12-06 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726232#comment-15726232 ] Eyal Farago commented on SPARK-18736: - @shuai Lin, I already have a pr in progress that addresses the

[jira] [Created] (SPARK-18736) [SQL] CreateMap allow non-unique keys

2016-12-05 Thread Eyal Farago (JIRA)
Eyal Farago created SPARK-18736: --- Summary: [SQL] CreateMap allow non-unique keys Key: SPARK-18736 URL: https://issues.apache.org/jira/browse/SPARK-18736 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-14804) Graph vertexRDD/EdgeRDD checkpoint results ClassCastException:

2016-10-10 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561452#comment-15561452 ] Eyal Farago commented on SPARK-14804: - I think this relates to SPARK-12431, is it possible to 'merge'

[jira] [Commented] (SPARK-12431) add local checkpointing to GraphX

2016-10-10 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561437#comment-15561437 ] Eyal Farago commented on SPARK-12431: - I think this heavily relates to SPARK-14804, assuming this is

[jira] [Created] (SPARK-16839) CleanupAliases may leave redundant aliases at end of analysis state

2016-08-01 Thread Eyal Farago (JIRA)
Eyal Farago created SPARK-16839: --- Summary: CleanupAliases may leave redundant aliases at end of analysis state Key: SPARK-16839 URL: https://issues.apache.org/jira/browse/SPARK-16839 Project: Spark

[jira] [Commented] (SPARK-16791) casting structs fails on Timestamp fields (interpreted mode only)

2016-07-29 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398916#comment-15398916 ] Eyal Farago commented on SPARK-16791: - created pull request

[jira] [Created] (SPARK-16791) casting structs fails on Timestamp fields (interpreted mode only)

2016-07-29 Thread Eyal Farago (JIRA)
Eyal Farago created SPARK-16791: --- Summary: casting structs fails on Timestamp fields (interpreted mode only) Key: SPARK-16791 URL: https://issues.apache.org/jira/browse/SPARK-16791 Project: Spark