[jira] [Commented] (SPARK-25109) spark python should retry reading another datanode if the first one fails to connect

2018-08-14 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580692#comment-16580692 ] Hyukjin Kwon commented on SPARK-25109: -- It would be helpful if we could narrow down this problem. >

[jira] [Commented] (SPARK-25109) spark python should retry reading another datanode if the first one fails to connect

2018-08-14 Thread Yuanbo Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580689#comment-16580689 ] Yuanbo Liu commented on SPARK-25109: [~hyukjin.kwon] Thanks for your comments. Not sure about that,

[jira] [Commented] (SPARK-25120) EventLogListener may miss driver SparkListenerBlockManagerAdded event

2018-08-14 Thread deshanxiao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580670#comment-16580670 ] deshanxiao commented on SPARK-25120: Sure, I find the tab "Executors" in HistoryServer sometimes miss

[jira] [Commented] (SPARK-25120) EventLogListener may miss driver SparkListenerBlockManagerAdded event

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580658#comment-16580658 ] Apache Spark commented on SPARK-25120: -- User 'deshanxiao' has created a pull request for this

[jira] [Assigned] (SPARK-25120) EventLogListener may miss driver SparkListenerBlockManagerAdded event

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25120: Assignee: (was: Apache Spark) > EventLogListener may miss driver

[jira] [Assigned] (SPARK-25120) EventLogListener may miss driver SparkListenerBlockManagerAdded event

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25120: Assignee: Apache Spark > EventLogListener may miss driver SparkListenerBlockManagerAdded

[jira] [Commented] (SPARK-25120) EventLogListener may miss driver SparkListenerBlockManagerAdded event

2018-08-14 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580648#comment-16580648 ] Hyukjin Kwon commented on SPARK-25120: -- Can you describe it in a bit more detail? When does it happen?

[jira] [Commented] (SPARK-25109) spark python should retry reading another datanode if the first one fails to connect

2018-08-14 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580647#comment-16580647 ] Hyukjin Kwon commented on SPARK-25109: -- Does the same thing happen in Scala API too? > spark

[jira] [Commented] (SPARK-24771) Upgrade AVRO version from 1.7.7 to 1.8

2018-08-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580644#comment-16580644 ] Wenchen Fan commented on SPARK-24771: - I've sent the email, let's wait for the feedback. > Upgrade

[jira] [Created] (SPARK-25120) EventLogListener may miss driver SparkListenerBlockManagerAdded event

2018-08-14 Thread deshanxiao (JIRA)
deshanxiao created SPARK-25120: -- Summary: EventLogListener may miss driver SparkListenerBlockManagerAdded event Key: SPARK-25120 URL: https://issues.apache.org/jira/browse/SPARK-25120 Project: Spark

[jira] [Updated] (SPARK-25083) remove the type erasure hack in data source scan

2018-08-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-25083: Target Version/s: 3.0.0 > remove the type erasure hack in data source scan >

[jira] [Commented] (SPARK-25083) remove the type erasure hack in data source scan

2018-08-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580638#comment-16580638 ] Wenchen Fan commented on SPARK-25083: - I've set the target version as 3.0. It's a code refactor and

[jira] [Resolved] (SPARK-23874) Upgrade apache/arrow to 0.10.0

2018-08-14 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-23874. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21939

[jira] [Resolved] (SPARK-25115) Eliminate extra memory copy done when a ByteBuf is used that is backed by > 1 ByteBuffer.

2018-08-14 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai resolved SPARK-25115. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 22105

[jira] [Assigned] (SPARK-25115) Eliminate extra memory copy done when a ByteBuf is used that is backed by > 1 ByteBuffer.

2018-08-14 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai reassigned SPARK-25115: --- Assignee: Norman Maurer > Eliminate extra memory copy done when a ByteBuf is used that is

[jira] [Resolved] (SPARK-25113) Add logging to CodeGenerator when any generated method's bytecode size goes above HugeMethodLimit

2018-08-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-25113. - Resolution: Fixed Assignee: Kris Mok Fix Version/s: 2.4.0 > Add logging to

[jira] [Updated] (SPARK-25119) stages in wrong order within job page DAG chart

2018-08-14 Thread Yunjian Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yunjian Zhang updated SPARK-25119: -- Attachment: Screen Shot 2018-08-14 at 3.35.34 PM.png > stages in wrong order within job page

[jira] [Updated] (SPARK-25119) stages in wrong order within job page DAG chart

2018-08-14 Thread Yunjian Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yunjian Zhang updated SPARK-25119: -- Description: multiple stages for same job are shown with wrong order in DAG

[jira] [Created] (SPARK-25119) stages in wrong order within job page DAG chart

2018-08-14 Thread Yunjian Zhang (JIRA)
Yunjian Zhang created SPARK-25119: - Summary: stages in wrong order within job page DAG chart Key: SPARK-25119 URL: https://issues.apache.org/jira/browse/SPARK-25119 Project: Spark Issue

[jira] [Commented] (SPARK-20384) supporting value classes over primitives in DataSets

2018-08-14 Thread Minh Thai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580498#comment-16580498 ] Minh Thai commented on SPARK-20384: --- _(from my comment in SPARK-17368)_ I think the main problem is

[jira] [Commented] (SPARK-25092) Add RewriteExceptAll and RewriteIntersectAll in the list of nonExcludableRules

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580464#comment-16580464 ] Apache Spark commented on SPARK-25092: -- User 'dilipbiswal' has created a pull request for this

[jira] [Updated] (SPARK-25118) Need a solution to persist Spark application console outputs when running in shell/yarn client mode

2018-08-14 Thread Ankur Gupta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Gupta updated SPARK-25118: Description: We execute Spark applications in YARN Client mode a lot of the time. When we do so the

[jira] [Created] (SPARK-25118) Need a solution to persist Spark application console outputs when running in shell/yarn client mode

2018-08-14 Thread Ankur Gupta (JIRA)
Ankur Gupta created SPARK-25118: --- Summary: Need a solution to persist Spark application console outputs when running in shell/yarn client mode Key: SPARK-25118 URL: https://issues.apache.org/jira/browse/SPARK-25118

[jira] [Commented] (SPARK-16406) Reference resolution for large number of columns should be faster

2018-08-14 Thread antonkulaga (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580444#comment-16580444 ] antonkulaga commented on SPARK-16406: - Are you going to backport it to 2.3.2 as well? > Reference

[jira] [Commented] (SPARK-23337) withWatermark raises an exception on struct objects

2018-08-14 Thread Aydin Kocas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580440#comment-16580440 ] Aydin Kocas commented on SPARK-23337: - [~marmbrus] Can you give a hint how to do it with

[jira] [Commented] (SPARK-24938) Understand usage of netty's onheap memory use, even with offheap pools

2018-08-14 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580299#comment-16580299 ] Imran Rashid commented on SPARK-24938: -- {quote} So perhaps the fix here is not to use the default

[jira] [Updated] (SPARK-24838) Support uncorrelated IN/EXISTS subqueries for more operators

2018-08-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24838: Target Version/s: 2.4.0 > Support uncorrelated IN/EXISTS subqueries for more operators >

[jira] [Commented] (SPARK-24838) Support uncorrelated IN/EXISTS subqueries for more operators

2018-08-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580298#comment-16580298 ] Xiao Li commented on SPARK-24838: - [~maurits] Any update? Also cc [~liwen] [~hvanhovell] > Support

[jira] [Updated] (SPARK-24838) Support uncorrelated IN/EXISTS subqueries for more operators

2018-08-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24838: Labels: (was: spree) > Support uncorrelated IN/EXISTS subqueries for more operators >

[jira] [Commented] (SPARK-25105) Importing all of pyspark.sql.functions should bring PandasUDFType in as well

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580285#comment-16580285 ] Apache Spark commented on SPARK-25105: -- User 'kevinyu98' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25105) Importing all of pyspark.sql.functions should bring PandasUDFType in as well

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25105: Assignee: (was: Apache Spark) > Importing all of pyspark.sql.functions should bring

[jira] [Assigned] (SPARK-25105) Importing all of pyspark.sql.functions should bring PandasUDFType in as well

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25105: Assignee: Apache Spark > Importing all of pyspark.sql.functions should bring

[jira] [Commented] (SPARK-21375) Add date and timestamp support to ArrowConverters for toPandas() collection

2018-08-14 Thread Eric Wohlstadter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580275#comment-16580275 ] Eric Wohlstadter commented on SPARK-21375: -- [~bryanc] Hi Brian, I'm using the Spark-Arrow

[jira] [Commented] (SPARK-24938) Understand usage of netty's onheap memory use, even with offheap pools

2018-08-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580260#comment-16580260 ] Marcelo Vanzin commented on SPARK-24938: I think that unless there's a measurable performance

[jira] [Commented] (SPARK-22236) CSV I/O: does not respect RFC 4180

2018-08-14 Thread Joe Pallas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580229#comment-16580229 ] Joe Pallas commented on SPARK-22236: {quote}people with preexisting datasets exported by Spark would

[jira] [Commented] (SPARK-24938) Understand usage of netty's onheap memory use, even with offheap pools

2018-08-14 Thread Nihar Sheth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580225#comment-16580225 ] Nihar Sheth commented on SPARK-24938: - His comment for that change is "The header is a very small

[jira] [Assigned] (SPARK-25043) spark-sql should print the appId and master on startup

2018-08-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves reassigned SPARK-25043: - Assignee: Alessandro Bellina > spark-sql should print the appId and master on startup

[jira] [Commented] (SPARK-24938) Understand usage of netty's onheap memory use, even with offheap pools

2018-08-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580213#comment-16580213 ] Marcelo Vanzin commented on SPARK-24938: BTW the change from buffer() to heapBuffer() was made

[jira] [Resolved] (SPARK-25043) spark-sql should print the appId and master on startup

2018-08-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-25043. --- Resolution: Fixed Fix Version/s: 2.4.0 > spark-sql should print the appId and master

[jira] [Commented] (SPARK-25117) Add EXEPT ALL and INTERSECT ALL support in R.

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580208#comment-16580208 ] Apache Spark commented on SPARK-25117: -- User 'dilipbiswal' has created a pull request for this

[jira] [Assigned] (SPARK-25117) Add EXEPT ALL and INTERSECT ALL support in R.

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25117: Assignee: (was: Apache Spark) > Add EXEPT ALL and INTERSECT ALL support in R. >

[jira] [Assigned] (SPARK-25117) Add EXEPT ALL and INTERSECT ALL support in R.

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25117: Assignee: Apache Spark > Add EXEPT ALL and INTERSECT ALL support in R. >

[jira] [Assigned] (SPARK-25116) Fix the "exit code 1" error when terminating Kafka tests

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25116: Assignee: Shixiong Zhu (was: Apache Spark) > Fix the "exit code 1" error when

[jira] [Commented] (SPARK-25116) Fix the "exit code 1" error when terminating Kafka tests

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580202#comment-16580202 ] Apache Spark commented on SPARK-25116: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25116) Fix the "exit code 1" error when terminating Kafka tests

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25116: Assignee: Apache Spark (was: Shixiong Zhu) > Fix the "exit code 1" error when

[jira] [Commented] (SPARK-24938) Understand usage of netty's onheap memory use, even with offheap pools

2018-08-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580195#comment-16580195 ] Marcelo Vanzin commented on SPARK-24938: The line you mention is this, right? {code}

[jira] [Resolved] (SPARK-25088) Rest Server default & doc updates

2018-08-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25088. --- Resolution: Fixed Fix Version/s: 2.4.0 Resolved by 

[jira] [Created] (SPARK-25117) Add EXEPT ALL and INTERSECT ALL support in R.

2018-08-14 Thread Dilip Biswal (JIRA)
Dilip Biswal created SPARK-25117: Summary: Add EXEPT ALL and INTERSECT ALL support in R. Key: SPARK-25117 URL: https://issues.apache.org/jira/browse/SPARK-25117 Project: Spark Issue Type:

[jira] [Created] (SPARK-25116) Fix the "exit code 1" error when terminating Kafka tests

2018-08-14 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-25116: Summary: Fix the "exit code 1" error when terminating Kafka tests Key: SPARK-25116 URL: https://issues.apache.org/jira/browse/SPARK-25116 Project: Spark

[jira] [Updated] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-25114: Target Version/s: 2.3.2, 2.4.0 (was: 2.4.0) > RecordBinaryComparator may return wrong result when

[jira] [Commented] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580153#comment-16580153 ] Xiao Li commented on SPARK-25114: - [~jerryshao] Another blocker for 2.3 > RecordBinaryComparator may

[jira] [Commented] (SPARK-24938) Understand usage of netty's onheap memory use, even with offheap pools

2018-08-14 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580140#comment-16580140 ] Imran Rashid commented on SPARK-24938: -- Cool, sounds like the info we need for making this change,

[jira] [Resolved] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-25051. - Resolution: Fixed Assignee: Marco Gaido Fix Version/s: 2.3.3 > where clause on dataset

[jira] [Updated] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-25051: Fix Version/s: (was: 2.3.3) 2.3.2 > where clause on dataset gives

[jira] [Commented] (SPARK-24938) Understand usage of netty's onheap memory use, even with offheap pools

2018-08-14 Thread Nihar Sheth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580113#comment-16580113 ] Nihar Sheth commented on SPARK-24938: - Your expectation is correct, the offheap pools remained at 16

[jira] [Assigned] (SPARK-25115) Eliminate extra memory copy done when a ByteBuf is used that is backed by > 1 ByteBuffer.

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25115: Assignee: Apache Spark > Eliminate extra memory copy done when a ByteBuf is used

[jira] [Commented] (SPARK-25115) Eliminate extra memory copy done when a ByteBuf is used that is backed by > 1 ByteBuffer.

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580106#comment-16580106 ] Apache Spark commented on SPARK-25115: -- User 'normanmaurer' has created a pull request for this

[jira] [Assigned] (SPARK-25115) Eliminate extra memory copy done when a ByteBuf is used that is backed by > 1 ByteBuffer.

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25115: Assignee: (was: Apache Spark) > Eliminate extra memory copy done when a ByteBuf

[jira] [Commented] (SPARK-25115) Eliminate extra memory copy done when a ByteBuf is used that is backed by > 1 ByteBuffer.

2018-08-14 Thread Norman Maurer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580087#comment-16580087 ] Norman Maurer commented on SPARK-25115: --- I opened the following PR to fix it:  

[jira] [Created] (SPARK-25115) Eliminate extra memory copy done when a ByteBuf is used that is backed by > 1 ByteBuffer.

2018-08-14 Thread Norman Maurer (JIRA)
Norman Maurer created SPARK-25115: - Summary: Eliminate extra memory copy done when a ByteBuf is used that is backed by > 1 ByteBuffer. Key: SPARK-25115 URL: https://issues.apache.org/jira/browse/SPARK-25115

[jira] [Commented] (SPARK-24787) Events being dropped at an alarming rate due to hsync being slow for eventLogging

2018-08-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580064#comment-16580064 ] Marcelo Vanzin commented on SPARK-24787: In that case it might be good to only use hsync in

[jira] [Commented] (SPARK-24771) Upgrade AVRO version from 1.7.7 to 1.8

2018-08-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580062#comment-16580062 ] Marcelo Vanzin commented on SPARK-24771: Asking is a good start. But I have anecdotal evidence

[jira] [Commented] (SPARK-22236) CSV I/O: does not respect RFC 4180

2018-08-14 Thread Ondrej Kokes (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579974#comment-16579974 ] Ondrej Kokes commented on SPARK-22236: -- Multiline=true by default would cause some slowdown, but

[jira] [Updated] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-25051: Component/s: (was: Spark Core) > where clause on dataset gives AnalysisException >

[jira] [Commented] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579967#comment-16579967 ] Xiao Li commented on SPARK-25051: - [~mgaido] This breaks the backport rule. We are unable to remove

[jira] [Commented] (SPARK-24561) User-defined window functions with pandas udf (bounded window)

2018-08-14 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579965#comment-16579965 ] Li Jin commented on SPARK-24561: I am looking into this. Early investigation: 

[jira] [Commented] (SPARK-25083) remove the type erasure hack in data source scan

2018-08-14 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579961#comment-16579961 ] Ryan Blue commented on SPARK-25083: --- [~cloud_fan], what release is this targeting? > remove the type

[jira] [Updated] (SPARK-24721) Failed to use PythonUDF with literal inputs in filter with data sources

2018-08-14 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Jin updated SPARK-24721: --- Component/s: SQL > Failed to use PythonUDF with literal inputs in filter with data sources >

[jira] [Updated] (SPARK-24721) Failed to use PythonUDF with literal inputs in filter with data sources

2018-08-14 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Jin updated SPARK-24721: --- Issue Type: Bug (was: Sub-task) Parent: (was: SPARK-22216) > Failed to use PythonUDF with

[jira] [Comment Edited] (SPARK-24721) Failed to use PythonUDF with literal inputs in filter with data sources

2018-08-14 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579956#comment-16579956 ] Li Jin edited comment on SPARK-24721 at 8/14/18 3:26 PM: - Updated Jira title to

[jira] [Commented] (SPARK-24721) Failed to use PythonUDF with literal inputs in filter with data sources

2018-08-14 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579956#comment-16579956 ] Li Jin commented on SPARK-24721: Updates Jira title to reflect the actual issue > Failed to use

[jira] [Updated] (SPARK-24721) Failed to use PythonUDF with literal inputs in filter with data sources

2018-08-14 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Jin updated SPARK-24721: --- Summary: Failed to use PythonUDF with literal inputs in filter with data sources (was: Failed to call

[jira] [Commented] (SPARK-24941) Add RDDBarrier.coalesce() function

2018-08-14 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579930#comment-16579930 ] Jiang Xingbo commented on SPARK-24941: -- Shall we add something like `spark.default.parallelism`? It

[jira] [Commented] (SPARK-24721) Failed to call PythonUDF whose input is the output of another PythonUDF

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579883#comment-16579883 ] Apache Spark commented on SPARK-24721: -- User 'icexelloss' has created a pull request for this

[jira] [Assigned] (SPARK-24721) Failed to call PythonUDF whose input is the output of another PythonUDF

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24721: Assignee: (was: Apache Spark) > Failed to call PythonUDF whose input is the output

[jira] [Assigned] (SPARK-24721) Failed to call PythonUDF whose input is the output of another PythonUDF

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24721: Assignee: Apache Spark > Failed to call PythonUDF whose input is the output of another

[jira] [Assigned] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25114: Assignee: (was: Apache Spark) > RecordBinaryComparator may return wrong result when

[jira] [Commented] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579880#comment-16579880 ] Apache Spark commented on SPARK-25114: -- User 'jiangxb1987' has created a pull request for this

[jira] [Assigned] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25114: Assignee: Apache Spark > RecordBinaryComparator may return wrong result when subtraction

[jira] [Commented] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579879#comment-16579879 ] Jiang Xingbo commented on SPARK-25114: -- I created https://github.com/apache/spark/pull/22101 for

[jira] [Updated] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-25114: - Labels: correctness (was: ) > RecordBinaryComparator may return wrong result when subtraction

[jira] [Updated] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-25114: - Priority: Blocker (was: Major) > RecordBinaryComparator may return wrong result when

[jira] [Created] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-25114: Summary: RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE Key: SPARK-25114 URL:

[jira] [Commented] (SPARK-24787) Events being dropped at an alarming rate due to hsync being slow for eventLogging

2018-08-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579854#comment-16579854 ] Thomas Graves commented on SPARK-24787: --- Yes it was caused by hsync, hsync has to go to the

[jira] [Commented] (SPARK-24918) Executor Plugin API

2018-08-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579851#comment-16579851 ] Thomas Graves commented on SPARK-24918: --- Personally I like the explicit config on better

[jira] [Updated] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-25051: -- Priority: Blocker (was: Major) > where clause on dataset gives AnalysisException >

[jira] [Resolved] (SPARK-23938) High-order function: map_zip_with(map, map, function) → map

2018-08-14 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-23938. --- Resolution: Fixed Assignee: Marek Novotny Fix Version/s: 2.4.0 Issue

[jira] [Commented] (SPARK-24411) Adding native Java tests for `isInCollection`

2018-08-14 Thread Aleksei Izmalkin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579650#comment-16579650 ] Aleksei Izmalkin commented on SPARK-24411: -- I will work on this issue. > Adding native Java

[jira] [Commented] (SPARK-25113) Add logging to CodeGenerator when any generated method's bytecode size goes above HugeMethodLimit

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579645#comment-16579645 ] Apache Spark commented on SPARK-25113: -- User 'rednaxelafx' has created a pull request for this

[jira] [Assigned] (SPARK-25113) Add logging to CodeGenerator when any generated method's bytecode size goes above HugeMethodLimit

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25113: Assignee: (was: Apache Spark) > Add logging to CodeGenerator when any generated

[jira] [Assigned] (SPARK-25113) Add logging to CodeGenerator when any generated method's bytecode size goes above HugeMethodLimit

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25113: Assignee: Apache Spark > Add logging to CodeGenerator when any generated method's

[jira] [Created] (SPARK-25113) Add logging to CodeGenerator when any generated method's bytecode size goes above HugeMethodLimit

2018-08-14 Thread Kris Mok (JIRA)
Kris Mok created SPARK-25113: Summary: Add logging to CodeGenerator when any generated method's bytecode size goes above HugeMethodLimit Key: SPARK-25113 URL: https://issues.apache.org/jira/browse/SPARK-25113

[jira] [Commented] (SPARK-25102) Write Spark version information to Parquet file footers

2018-08-14 Thread Nikita Poberezkin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579622#comment-16579622 ] Nikita Poberezkin commented on SPARK-25102: --- I will work on this issue > Write Spark version

[jira] [Assigned] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25051: Assignee: Apache Spark > where clause on dataset gives AnalysisException >

[jira] [Commented] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579526#comment-16579526 ] Apache Spark commented on SPARK-25051: -- User 'mgaido91' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25051: Assignee: (was: Apache Spark) > where clause on dataset gives AnalysisException >

[jira] [Commented] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579514#comment-16579514 ] Marco Gaido commented on SPARK-25051: - This was caused by the introduction of AnalysisBarrier. I

[jira] [Commented] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579444#comment-16579444 ] Marco Gaido commented on SPARK-25051: - cc [~jerryshao] shall we set it as a blocker for 2.3.2? >

[jira] [Updated] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25051: Labels: correctness (was: ) > where clause on dataset gives AnalysisException >

[jira] [Commented] (SPARK-25068) High-order function: exists(array, function) → boolean

2018-08-14 Thread Marek Novotny (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579386#comment-16579386 ] Marek Novotny commented on SPARK-25068: --- That's a good point. Thanks for your answer! >
