[jira] [Assigned] (SPARK-13610) Create a Transformer to disassemble vectors in DataFrames

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13610: Assignee: Apache Spark > Create a Transformer to disassemble vectors in DataFrames

[jira] [Assigned] (SPARK-13610) Create a Transformer to disassemble vectors in DataFrames

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13610: Assignee: (was: Apache Spark) > Create a Transformer to disassemble vectors in DataFra

[jira] [Commented] (SPARK-13610) Create a Transformer to disassemble vectors in DataFrames

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803878#comment-15803878 ] Apache Spark commented on SPARK-13610: -- User 'leonfl' has created a pull request for

[jira] [Created] (SPARK-19101) Spark Beeline catch a exeception when run command " load data inpath '/data/test/test.csv' overwrite into table db.test partition(area='021')"

2017-01-05 Thread Xiaochen Ouyang (JIRA)
Xiaochen Ouyang created SPARK-19101: --- Summary: Spark Beeline catch a exeception when run command " load data inpath '/data/test/test.csv' overwrite into table db.test partition(area='021')" Key: SPARK-19101 UR

[jira] [Updated] (SPARK-19083) sbin/start-history-server.sh scripts use of $@ without ""

2017-01-05 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zuotingbing updated SPARK-19083: Affects Version/s: 2.1.0 > sbin/start-history-server.sh scripts use of $@ without ""

[jira] [Commented] (SPARK-19099) Wrong time display on Spark History Server web UI

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803839#comment-15803839 ] Apache Spark commented on SPARK-19099: -- User '351zyf' has created a pull request for

[jira] [Assigned] (SPARK-19083) sbin/start-history-server.sh scripts use of $@ without ""

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19083: Assignee: (was: Apache Spark) > sbin/start-history-server.sh scripts use of $@ without

[jira] [Updated] (SPARK-19099) Wrong time display on Spark History Server web UI

2017-01-05 Thread JohnsonZhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JohnsonZhang updated SPARK-19099: - External issue URL: https://github.com/apache/spark/pull/16485 > Wrong time display on Spark Hist

[jira] [Assigned] (SPARK-19099) Wrong time display on Spark History Server web UI

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19099: Assignee: Apache Spark > Wrong time display on Spark History Server web UI

[jira] [Assigned] (SPARK-19099) Wrong time display on Spark History Server web UI

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19099: Assignee: (was: Apache Spark) > Wrong time display on Spark History Server web UI

[jira] [Assigned] (SPARK-19083) sbin/start-history-server.sh scripts use of $@ without ""

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19083: Assignee: Apache Spark > sbin/start-history-server.sh scripts use of $@ without ""

[jira] [Commented] (SPARK-19083) sbin/start-history-server.sh scripts use of $@ without ""

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803840#comment-15803840 ] Apache Spark commented on SPARK-19083: -- User 'zuotingbing' has created a pull reques

[jira] [Updated] (SPARK-19100) Schedule tasks in descending order of estimated input size / estimated task duration

2017-01-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-19100: --- Description: Say that you're scheduling a reduce phase and based on the map output sizes you have id

[jira] [Updated] (SPARK-19083) sbin/start-history-server.sh scripts use of $@ without ""

2017-01-05 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zuotingbing updated SPARK-19083: Attachment: (was: 0001-SPARK-19083.patch) > sbin/start-history-server.sh scripts use of $@ with

[jira] [Created] (SPARK-19100) Schedule tasks in descending order of estimated input size / estimated task duration

2017-01-05 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-19100: -- Summary: Schedule tasks in descending order of estimated input size / estimated task duration Key: SPARK-19100 URL: https://issues.apache.org/jira/browse/SPARK-19100 Proj

[jira] [Created] (SPARK-19099) Wrong time display on Spark History Server web UI

2017-01-05 Thread JohnsonZhang (JIRA)
JohnsonZhang created SPARK-19099: Summary: Wrong time display on Spark History Server web UI Key: SPARK-19099 URL: https://issues.apache.org/jira/browse/SPARK-19099 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16815) Dataset[List[T]] leads to ArrayStoreException

2017-01-05 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803811#comment-15803811 ] Wenchen Fan commented on SPARK-16815: - #16240 has been merged, can you try it again t

[jira] [Issue Comment Deleted] (SPARK-16815) Dataset[List[T]] leads to ArrayStoreException

2017-01-05 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-16815: Comment: was deleted (was: #16240 has been merged, can you try it again to see if it's fixed?) > D

[jira] [Commented] (SPARK-16815) Dataset[List[T]] leads to ArrayStoreException

2017-01-05 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803808#comment-15803808 ] Wenchen Fan commented on SPARK-16815: - #16240 has been merged, can you try it again t

[jira] [Resolved] (SPARK-16792) Dataset containing a Case Class with a List type causes a CompileException (converting sequence to list)

2017-01-05 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-16792. - Resolution: Fixed Assignee: Michal Šenkýř Fix Version/s: 2.2.0 > Dataset containi

[jira] [Assigned] (SPARK-18847) PageRank gives incorrect results for graphs with sinks

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18847: Assignee: (was: Apache Spark) > PageRank gives incorrect results for graphs with sinks

[jira] [Commented] (SPARK-18847) PageRank gives incorrect results for graphs with sinks

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803773#comment-15803773 ] Apache Spark commented on SPARK-18847: -- User 'aray' has created a pull request for t

[jira] [Assigned] (SPARK-18847) PageRank gives incorrect results for graphs with sinks

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18847: Assignee: Apache Spark > PageRank gives incorrect results for graphs with sinks

[jira] [Issue Comment Deleted] (SPARK-19083) sbin/start-history-server.sh scripts use of $@ without ""

2017-01-05 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zuotingbing updated SPARK-19083: Comment: was deleted (was: Hi Dongjoon Hyun. I forked spark from apache/spark several days ago and

[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-05 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803754#comment-15803754 ] Saisai Shao commented on SPARK-19090: - "spark.executor.cores" has nothing to do with

[jira] [Commented] (SPARK-19083) sbin/start-history-server.sh scripts use of $@ without ""

2017-01-05 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803750#comment-15803750 ] zuotingbing commented on SPARK-19083: - Hi Dongjoon Hyun. I forked spark from apache/s

[jira] [Commented] (SPARK-19093) Cached tables are not used in SubqueryExpression

2017-01-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803732#comment-15803732 ] Xiao Li commented on SPARK-19093: - {noformat} /** Replaces segments of the given logica

[jira] [Commented] (SPARK-15880) PREGEL Based Semi-Clustering Algorithm Implementation using Spark GraphX API

2017-01-05 Thread Lee Dongjin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803730#comment-15803730 ] Lee Dongjin commented on SPARK-15880: - Hello. It seems like this issue has been aband

[jira] [Updated] (SPARK-19093) Cached tables are not used in SubqueryExpression

2017-01-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-19093: --- Summary: Cached tables are not used in SubqueryExpression (was: LeftAntiJoin doesn't seem to resolve

[jira] [Commented] (SPARK-19038) Can't find keytab file when using Hive catalog

2017-01-05 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803722#comment-15803722 ] Saisai Shao commented on SPARK-19038: - Please see the comment I made in Github(https

[jira] [Commented] (SPARK-19093) LeftAntiJoin doesn't seem to resolve cached tables on right side

2017-01-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803716#comment-15803716 ] Xiao Li commented on SPARK-19093: - Yes. It is not related to LeftAntiJoin. It is in Subqu

[jira] [Commented] (SPARK-18871) New test cases for IN/NOT IN subquery

2017-01-05 Thread kevin yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803712#comment-15803712 ] kevin yu commented on SPARK-18871: -- Thanks. Will submit soon. > New test cases for IN/N

[jira] [Commented] (SPARK-19093) LeftAntiJoin doesn't seem to resolve cached tables on right side

2017-01-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803709#comment-15803709 ] Josh Rosen commented on SPARK-19093: I'm a bit too busy with other work to tackle thi

[jira] [Updated] (SPARK-19093) LeftAntiJoin doesn't seem to resolve cached tables on right side

2017-01-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-19093: Affects Version/s: 2.0.2 > LeftAntiJoin doesn't seem to resolve cached tables on right side

[jira] [Commented] (SPARK-19093) LeftAntiJoin doesn't seem to resolve cached tables on right side

2017-01-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803703#comment-15803703 ] Xiao Li commented on SPARK-19093: - Yes. The relation in `PlanExpression` is not replaced

[jira] [Commented] (SPARK-18871) New test cases for IN/NOT IN subquery

2017-01-05 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803681#comment-15803681 ] Reynold Xin commented on SPARK-18871: - Yea let's just reuse this pr, since they are a

[jira] [Commented] (SPARK-17975) EMLDAOptimizer fails with ClassCastException on YARN

2017-01-05 Thread Ilya Matiach (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803673#comment-15803673 ] Ilya Matiach commented on SPARK-17975: -- Thank you for sending the dataset, I'm worki

[jira] [Commented] (SPARK-19083) sbin/start-history-server.sh scripts use of $@ without ""

2017-01-05 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803669#comment-15803669 ] zuotingbing commented on SPARK-19083: - Hi Dongjoon Hyun. OK, I will create a PR. Thank

[jira] [Updated] (SPARK-16848) Check schema validation for user-specified schema in jdbc and table APIs

2017-01-05 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16848: - Description: Currently, Both APIs below: {code} spark.read.schema(StructType(Nil)).jdbc(...) {c

[jira] [Commented] (SPARK-11569) StringIndexer transform fails when column contains nulls

2017-01-05 Thread Ilya Matiach (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803656#comment-15803656 ] Ilya Matiach commented on SPARK-11569: -- @jliwork @srowen are you currently working o

[jira] [Updated] (SPARK-16848) Check schema validation for user-specified schema in jdbc and table APIs

2017-01-05 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16848: - Summary: Check schema validation for user-specified schema in jdbc and table APIs (was: Make jdb

[jira] [Commented] (SPARK-19093) LeftAntiJoin doesn't seem to resolve cached tables on right side

2017-01-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803647#comment-15803647 ] Xiao Li commented on SPARK-19093: - Let me try to do more investigation in this. > LeftA

[jira] [Commented] (SPARK-19068) Large number of executors causing a ton of ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerExecutorMetricsUpdate(41,

2017-01-05 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803628#comment-15803628 ] JESSE CHEN commented on SPARK-19068: Well, though it does not affect the correctness

[jira] [Commented] (SPARK-19083) sbin/start-history-server.sh scripts use of $@ without ""

2017-01-05 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803626#comment-15803626 ] Dongjoon Hyun commented on SPARK-19083: --- Hi, [~zuo.tingbing9]. Could you create a P

[jira] [Commented] (SPARK-19093) LeftAntiJoin doesn't seem to resolve cached tables on right side

2017-01-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803623#comment-15803623 ] Josh Rosen commented on SPARK-19093: I'm not sure whether that's the case because I s

[jira] [Updated] (SPARK-19083) sbin/start-history-server.sh scripts use of $@ without ""

2017-01-05 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-19083: -- Fix Version/s: (was: 2.0.2) (was: 2.1.0) > sbin/start-history-server

[jira] [Assigned] (SPARK-19038) Can't find keytab file when using Hive catalog

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19038: Assignee: Apache Spark > Can't find keytab file when using Hive catalog

[jira] [Commented] (SPARK-19038) Can't find keytab file when using Hive catalog

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803617#comment-15803617 ] Apache Spark commented on SPARK-19038: -- User 'parente' has created a pull request fo

[jira] [Assigned] (SPARK-19038) Can't find keytab file when using Hive catalog

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19038: Assignee: (was: Apache Spark) > Can't find keytab file when using Hive catalog

[jira] [Commented] (SPARK-18871) New test cases for IN/NOT IN subquery

2017-01-05 Thread kevin yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803410#comment-15803410 ] kevin yu commented on SPARK-18871: -- [~hvanhovell][~smilegator][~rxin][~nsyca][~dongjoon]

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2017-01-05 Thread Subhankar Dey Sarkar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803407#comment-15803407 ] Subhankar Dey Sarkar commented on SPARK-2620: - I am using spark 2 and just use

[jira] [Commented] (SPARK-19086) Improper scoping of name resolution of columns in HAVING clause

2017-01-05 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803381#comment-15803381 ] Nattavut Sutyanyong commented on SPARK-19086: - I wore a conservative lens whe

[jira] [Commented] (SPARK-19098) Shuffled data leak/size doubling in ConnectedComponents/Pregel iterations

2017-01-05 Thread Steven Ruppert (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803383#comment-15803383 ] Steven Ruppert commented on SPARK-19098: Possibly related is https://issues.apach

[jira] [Updated] (SPARK-19098) Shuffled data leak/size doubling in ConnectedComponents/Pregel iterations

2017-01-05 Thread Steven Ruppert (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Ruppert updated SPARK-19098: --- Attachment: doubling-season.png Screenshot of the spark UI for the job, showing the doubling

[jira] [Commented] (SPARK-19093) LeftAntiJoin doesn't seem to resolve cached tables on right side

2017-01-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803374#comment-15803374 ] Xiao Li commented on SPARK-19093: - uh, this might be related to the subquery resolution.

[jira] [Created] (SPARK-19098) Shuffled data leak/size doubling in ConnectedComponents/Pregel iterations

2017-01-05 Thread Steven Ruppert (JIRA)
Steven Ruppert created SPARK-19098: -- Summary: Shuffled data leak/size doubling in ConnectedComponents/Pregel iterations Key: SPARK-19098 URL: https://issues.apache.org/jira/browse/SPARK-19098 Project

[jira] [Updated] (SPARK-18871) New test cases for IN/NOT IN subquery

2017-01-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-18871: Fix Version/s: 2.2.0 > New test cases for IN/NOT IN subquery

[jira] [Resolved] (SPARK-18871) New test cases for IN/NOT IN subquery

2017-01-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-18871. - Resolution: Resolved > New test cases for IN/NOT IN subquery

[jira] [Updated] (SPARK-18871) New test cases for IN/NOT IN subquery

2017-01-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-18871: Assignee: kevin yu > New test cases for IN/NOT IN subquery

[jira] [Commented] (SPARK-18871) New test cases for IN/NOT IN subquery

2017-01-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803352#comment-15803352 ] Xiao Li commented on SPARK-18871: - [~kevinyu98] > New test cases for IN/NOT IN subquery

[jira] [Updated] (SPARK-18871) New test cases for IN/NOT IN subquery

2017-01-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-18871: Summary: New test cases for IN/NOT IN subquery (was: New test cases for IN subquery) > New test cases for

[jira] [Commented] (SPARK-16180) Task hang on fetching blocks (cached RDD)

2017-01-05 Thread Weizhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803301#comment-15803301 ] Weizhong commented on SPARK-16180: -- Hi, we also hit this issue on Spark 1.6. From the e

[jira] [Commented] (SPARK-18997) Recommended upgrade libthrift to 0.9.3

2017-01-05 Thread meiyoula (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803270#comment-15803270 ] meiyoula commented on SPARK-18997: -- Sorry, I need help > Recommended upgrade libthrift

[jira] [Updated] (SPARK-19083) sbin/start-history-server.sh scripts use of $@ without ""

2017-01-05 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zuotingbing updated SPARK-19083: Attachment: 0001-SPARK-19083.patch > sbin/start-history-server.sh scripts use of $@ without ""

[jira] [Created] (SPARK-19097) virtualenv example failed with conda due to ImportError: No module named ruamel.yaml.comments

2017-01-05 Thread Yesha Vora (JIRA)
Yesha Vora created SPARK-19097: -- Summary: virtualenv example failed with conda due to ImportError: No module named ruamel.yaml.comments Key: SPARK-19097 URL: https://issues.apache.org/jira/browse/SPARK-19097

[jira] [Updated] (SPARK-19096) Kmeans.py application fails with virtualenv and due to parse error

2017-01-05 Thread Yesha Vora (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated SPARK-19096: --- Issue Type: Sub-task (was: Bug) Parent: SPARK-13587 > Kmeans.py application fails with virtu

[jira] [Commented] (SPARK-19096) Kmeans.py application fails with virtualenv and due to parse error

2017-01-05 Thread Yesha Vora (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803223#comment-15803223 ] Yesha Vora commented on SPARK-19096: This is a valid bug. Thus reopening it and linking

[jira] [Reopened] (SPARK-19096) Kmeans.py application fails with virtualenv and due to parse error

2017-01-05 Thread Yesha Vora (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora reopened SPARK-19096: > Kmeans.py application fails with virtualenv and due to parse error

[jira] [Updated] (SPARK-19095) virtualenv example does not work in yarn cluster mode

2017-01-05 Thread Yesha Vora (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated SPARK-19095: --- Issue Type: Sub-task (was: Bug) Parent: SPARK-13587 > virtualenv example does not work in ya

[jira] [Reopened] (SPARK-19095) virtualenv example does not work in yarn cluster mode

2017-01-05 Thread Yesha Vora (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora reopened SPARK-19095: This is a valid bug. Thus reopening it and linking it with SPARK-13587. > virtualenv example does not wo

[jira] [Closed] (SPARK-19096) Kmeans.py application fails with virtualenv and due to parse error

2017-01-05 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang closed SPARK-19096. -- Resolution: Invalid Will do it in SPARK-13587 > Kmeans.py application fails with virtualenv and due to

[jira] [Updated] (SPARK-19096) Kmeans.py application fails with virtualenv and due to parse error

2017-01-05 Thread Yesha Vora (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated SPARK-19096: --- Description: Spark version : 2 Steps: * Install virtualenv ( pip install virtualenv) * create require

[jira] [Updated] (SPARK-19095) virtualenv example does not work in yarn cluster mode

2017-01-05 Thread Yesha Vora (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated SPARK-19095: --- Description: Spark version: 2 Steps: * install virtualenv on all nodes * create requirement1.txt with

[jira] [Created] (SPARK-19096) Kmeans.py application fails with virtualenv and due to parse error

2017-01-05 Thread Yesha Vora (JIRA)
Yesha Vora created SPARK-19096: -- Summary: Kmeans.py application fails with virtualenv and due to parse error Key: SPARK-19096 URL: https://issues.apache.org/jira/browse/SPARK-19096 Project: Spark

[jira] [Resolved] (SPARK-19095) virtualenv example does not work in yarn cluster mode

2017-01-05 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang resolved SPARK-19095. Resolution: Invalid Will do it in SPARK-13587 > virtualenv example does not work in yarn cluster m

[jira] [Created] (SPARK-19095) virtualenv example does not work in yarn cluster mode

2017-01-05 Thread Yesha Vora (JIRA)
Yesha Vora created SPARK-19095: -- Summary: virtualenv example does not work in yarn cluster mode Key: SPARK-19095 URL: https://issues.apache.org/jira/browse/SPARK-19095 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-18885) unify CREATE TABLE syntax for data source and hive serde tables

2017-01-05 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-18885. -- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16296 [https://github.com/

[jira] [Commented] (SPARK-19091) createDataset(sc.parallelize(x: Seq)) should be equivalent to createDataset(x: Seq)

2017-01-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803117#comment-15803117 ] Josh Rosen commented on SPARK-19091: Given above comment, maybe my original JIRA here

[jira] [Comment Edited] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase

2017-01-05 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803108#comment-15803108 ] Nattavut Sutyanyong edited comment on SPARK-18874 at 1/6/17 1:18 AM: --

[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase

2017-01-05 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803108#comment-15803108 ] Nattavut Sutyanyong commented on SPARK-18874: - Please try accessing the doc f

[jira] [Commented] (SPARK-19091) createDataset(sc.parallelize(x: Seq)) should be equivalent to createDataset(x: Seq)

2017-01-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803105#comment-15803105 ] Josh Rosen commented on SPARK-19091: This is a pretty easy change but it does impact

[jira] [Commented] (SPARK-19081) spark sql use HIVE UDF throw exception when return a Map value

2017-01-05 Thread Davy Song (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803099#comment-15803099 ] Davy Song commented on SPARK-19081: --- 17/01/06 09:12:10 INFO SparkContext: Running Spark

[jira] [Commented] (SPARK-19094) Plumb through logging/error messages from the JVM to Jupyter PySpark

2017-01-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803086#comment-15803086 ] holdenk commented on SPARK-19094: - I've got something basic working for this, but thinkin

[jira] [Created] (SPARK-19094) Plumb through logging/error messages from the JVM to Jupyter PySpark

2017-01-05 Thread holdenk (JIRA)
holdenk created SPARK-19094: --- Summary: Plumb through logging/error messages from the JVM to Jupyter PySpark Key: SPARK-19094 URL: https://issues.apache.org/jira/browse/SPARK-19094 Project: Spark I

[jira] [Updated] (SPARK-19091) createDataset(sc.parallelize(x: Seq)) should be equivalent to createDataset(x: Seq)

2017-01-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-19091: --- Description: It turns out that spark.createDataset(sc.parallelize(x: Seq)) and spark.createaDataSet(x

[jira] [Issue Comment Deleted] (SPARK-19091) createDataset(sc.parallelize(x: Seq)) should be equivalent to createDataset(x: Seq)

2017-01-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-19091: --- Comment: was deleted (was: Upon closer inspection, I think the right approach here might be to simpl

[jira] [Updated] (SPARK-19091) createDataset(sc.parallelize(x: Seq)) should be equivalent to createDataset(x: Seq)

2017-01-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-19091: --- Summary: createDataset(sc.parallelize(x: Seq)) should be equivalent to createDataset(x: Seq) (was: I

[jira] [Updated] (SPARK-19091) Implement more accurate statistics for LogicalRDD when child is a mapped ParallelCollectionRDD

2017-01-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-19091: --- Description: The Catalyst optimizer uses LogicalRDD to represent scans from existing RDDs. In gene

[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase

2017-01-05 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803057#comment-15803057 ] Reynold Xin commented on SPARK-18874: - Thanks. Where is the doc? > First phase: Def

[jira] [Commented] (SPARK-18278) Support native submission of spark jobs to a kubernetes cluster

2017-01-05 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803050#comment-15803050 ] Matt Cheah commented on SPARK-18278: I refactored the scheduler code as a thought exp

[jira] [Created] (SPARK-19093) LeftAntiJoin doesn't seem to resolve cached tables on right side

2017-01-05 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-19093: -- Summary: LeftAntiJoin doesn't seem to resolve cached tables on right side Key: SPARK-19093 URL: https://issues.apache.org/jira/browse/SPARK-19093 Project: Spark

[jira] [Assigned] (SPARK-19092) Save() API of DataFrameWriter should not scan all the saved files

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19092: Assignee: (was: Apache Spark) > Save() API of DataFrameWriter should not scan all the

[jira] [Assigned] (SPARK-19092) Save() API of DataFrameWriter should not scan all the saved files

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19092: Assignee: Apache Spark > Save() API of DataFrameWriter should not scan all the saved files

[jira] [Commented] (SPARK-19092) Save() API of DataFrameWriter should not scan all the saved files

2017-01-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802960#comment-15802960 ] Apache Spark commented on SPARK-19092: -- User 'gatorsmile' has created a pull request

[jira] [Updated] (SPARK-19092) Save() API of DataFrameWriter should not scan all the saved files

2017-01-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-19092: Description: `DataFrameWriter`'s save() API is performing an unnecessary full filesystem scan for the saved

[jira] [Created] (SPARK-19092) Save() API of DataFrameWriter should not scan all the saved files

2017-01-05 Thread Xiao Li (JIRA)
Xiao Li created SPARK-19092: --- Summary: Save() API of DataFrameWriter should not scan all the saved files Key: SPARK-19092 URL: https://issues.apache.org/jira/browse/SPARK-19092 Project: Spark Issu

[jira] [Commented] (SPARK-19091) Implement more accurate statistics for LogicalRDD when child is a mapped ParallelCollectionRDD

2017-01-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802938#comment-15802938 ] Josh Rosen commented on SPARK-19091: Upon closer inspection, I think the right approa

[jira] [Commented] (SPARK-18630) PySpark ML memory leak

2017-01-05 Thread Sue Ann Hong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802922#comment-15802922 ] Sue Ann Hong commented on SPARK-18630: -- will take a look at this > PySpark ML memor

[jira] [Updated] (SPARK-19091) Implement more accurate statistics for LogicalRDD when child is a mapped ParallelCollectionRDD

2017-01-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-19091: --- Description: The Catalyst optimizer uses LogicalRDD to represent scans from existing RDDs. In genera

[jira] [Commented] (SPARK-19086) Improper scoping of name resolution of columns in HAVING clause

2017-01-05 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802885#comment-15802885 ] Herman van Hovell commented on SPARK-19086: --- We alternate column resolution bet
