[jira] [Commented] (SPARK-13209) transitive closure on a dataframe

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814239#comment-15814239 ] Hyukjin Kwon commented on SPARK-13209: -- It seems (at least at the current master) the plans are too

[jira] [Resolved] (SPARK-12940) Partition field in Spark SQL WHERE clause causing Exception

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-12940. -- Resolution: Cannot Reproduce As it can’t be reproduced against master as reported, I am

[jira] [Commented] (SPARK-18641) Show databases NullPointerException while Sentry turned on

2017-01-09 Thread zhangqw (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814210#comment-15814210 ] zhangqw commented on SPARK-18641: - Yes, it seems Sentry does not fully support Spark. I'm now using only HDFS

[jira] [Updated] (SPARK-18641) Show databases NullPointerException while Sentry turned on

2017-01-09 Thread zhangqw (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangqw updated SPARK-18641: Description: I've traced into source code, and it seems that of Sentry not set when spark sql started a

[jira] [Commented] (SPARK-19146) Drop more elements when stageData.taskData.size > retainedTasks to reduce the number of times on call drop

2017-01-09 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814207#comment-15814207 ] Yuming Wang commented on SPARK-19146: - The activated tasks more and more and then

[jira] [Resolved] (SPARK-12911) Cacheing a dataframe causes array comparisons to fail (in filter / where) after 1.6

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-12911. -- Resolution: Cannot Reproduce I can't reproduce this issue at the current master. I am

[jira] [Commented] (SPARK-12809) Spark SQL UDF does not work with struct input parameters

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814182#comment-15814182 ] Hyukjin Kwon commented on SPARK-12809: -- Is this a duplicate of SPARK-12823? > Spark SQL UDF does

[jira] [Resolved] (SPARK-12754) Data type mismatch on two array values when using filter/where

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-12754. -- Resolution: Cannot Reproduce I am resolving this as {{Cannot Reproduce}} because this was

[jira] [Commented] (SPARK-19145) Timestamp to String casting is slowing the query significantly

2017-01-09 Thread gagan taneja (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814165#comment-15814165 ] gagan taneja commented on SPARK-19145: -- I should be able to work on a proposal for the fix >

[jira] [Updated] (SPARK-19146) Drop more elements when stageData.taskData.size > retainedTasks to reduce the number of times on call drop

2017-01-09 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-19146: Attachment: can-not-consume-taskEnd-events.jpg > Drop more elements when stageData.taskData.size >

[jira] [Commented] (SPARK-19146) Drop more elements when stageData.taskData.size > retainedTasks to reduce the number of times on call drop

2017-01-09 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814152#comment-15814152 ] Yuming Wang commented on SPARK-19146: - I will create a PR later > Drop more elements when

[jira] [Created] (SPARK-19146) Drop more elements when stageData.taskData.size > retainedTasks to reduce the number of times on call drop

2017-01-09 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-19146: --- Summary: Drop more elements when stageData.taskData.size > retainedTasks to reduce the number of times on call drop Key: SPARK-19146 URL:

[jira] [Created] (SPARK-19145) Timestamp to String casting is slowing the query significantly

2017-01-09 Thread gagan taneja (JIRA)
gagan taneja created SPARK-19145: Summary: Timestamp to String casting is slowing the query significantly Key: SPARK-19145 URL: https://issues.apache.org/jira/browse/SPARK-19145 Project: Spark

[jira] [Updated] (SPARK-14272) Evaluate GaussianMixtureModel with LogLikelihood

2017-01-09 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-14272: Shepherd: Yanbo Liang > Evaluate GaussianMixtureModel with LogLikelihood >

[jira] [Resolved] (SPARK-12586) Wrong answer with registerTempTable and union sql query

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-12586. -- Resolution: Not A Problem I just ran the code you attached and it prints the following: {code}

[jira] [Resolved] (SPARK-12484) DataFrame withColumn() does not work in Java

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-12484. -- Resolution: Invalid The API works as expected and they are being tested. I don't think just

[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814001#comment-15814001 ] Saisai Shao commented on SPARK-19090: - Are you using SparkConf API to set configuration in

[jira] [Commented] (SPARK-14272) Evaluate GaussianMixtureModel with LogLikelihood

2017-01-09 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813980#comment-15813980 ] Yanbo Liang commented on SPARK-14272: - [~podongfeng] SPARK-17847 has been merged, please move this

[jira] [Resolved] (SPARK-17847) Reduce shuffled data size of GaussianMixture & copy the implementation from mllib to ml

2017-01-09 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-17847. - Resolution: Fixed Fix Version/s: 2.2.0 > Reduce shuffled data size of GaussianMixture &

[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813973#comment-15813973 ] Saisai Shao commented on SPARK-19090: - Spark shell is a real spark *application*. The underlying

[jira] [Comment Edited] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813962#comment-15813962 ] nirav patel edited comment on SPARK-19090 at 1/10/17 5:39 AM: -- Oh right, I

[jira] [Commented] (SPARK-19144) Add test for GaussianMixture with distributed decompositions

2017-01-09 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813971#comment-15813971 ] Yanbo Liang commented on SPARK-19144: - cc [~sethah] > Add test for GaussianMixture with distributed

[jira] [Comment Edited] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813962#comment-15813962 ] nirav patel edited comment on SPARK-19090 at 1/10/17 5:38 AM: -- Oh right, I

[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813969#comment-15813969 ] nirav patel commented on SPARK-19090: - Also you are just invoking spark-shell here and not submitting

[jira] [Commented] (SPARK-12076) countDistinct behaves inconsistently

2017-01-09 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813966#comment-15813966 ] Herman van Hovell commented on SPARK-12076: --- What is the problem? That the plans are different?

[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813962#comment-15813962 ] nirav patel commented on SPARK-19090: - Oh right, I have that set exclusively. I corrected my comment.

[jira] [Resolved] (SPARK-12307) ParquetFormat options should be exposed through the DataFrameReader/Writer options API

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-12307. -- Resolution: Duplicate I believe we can configure this now, for example,

[jira] [Comment Edited] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813689#comment-15813689 ] nirav patel edited comment on SPARK-19090 at 1/10/17 5:33 AM: -- [~jerryshao]

[jira] [Commented] (SPARK-12264) Add a typeTag or scalaTypeTag method to DataType

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813946#comment-15813946 ] Hyukjin Kwon commented on SPARK-12264: -- (I just simply changed the title to {quote} add a typeTag

[jira] [Updated] (SPARK-12264) Add a typeTag or scalaTypeTag method to DataType.

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-12264: - Summary: Add a typeTag or scalaTypeTag method to DataType. (was: Could DataType provide a

[jira] [Updated] (SPARK-12264) Add a typeTag or scalaTypeTag method to DataType

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-12264: - Summary: Add a typeTag or scalaTypeTag method to DataType (was: Add a typeTag or scalaTypeTag

[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813941#comment-15813941 ] Saisai Shao commented on SPARK-19090: - {code} ./bin/spark-shell --master yarn-client --conf

[jira] [Created] (SPARK-19144) Add test for GaussianMixture with distributed decompositions

2017-01-09 Thread Yanbo Liang (JIRA)
Yanbo Liang created SPARK-19144: --- Summary: Add test for GaussianMixture with distributed decompositions Key: SPARK-19144 URL: https://issues.apache.org/jira/browse/SPARK-19144 Project: Spark

[jira] [Commented] (SPARK-12076) countDistinct behaves inconsistently

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813923#comment-15813923 ] Hyukjin Kwon commented on SPARK-12076: -- Could I ask to narrow down the problem or self-contained

[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813918#comment-15813918 ] nirav patel commented on SPARK-19090: - I am using oozie spark-action to submit job. I set all spark

[jira] [Resolved] (SPARK-9502) ArrayTypes incorrect for DataFrames Java API

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-9502. - Resolution: Not A Problem Now it seems to throw a different exception, as below: {code}

[jira] [Commented] (SPARK-9435) Java UDFs don't work with GROUP BY expressions

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813832#comment-15813832 ] Hyukjin Kwon commented on SPARK-9435: - This still happens in the current master - {code} val df =

[jira] [Created] (SPARK-19143) API in Spark for distributing new delegation tokens (Improve delegation token handling in secure clusters)

2017-01-09 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-19143: - Summary: API in Spark for distributing new delegation tokens (Improve delegation token handling in secure clusters) Key: SPARK-19143 URL:

[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813801#comment-15813801 ] Saisai Shao commented on SPARK-19090: - I also tested with Spark 1.5.0, I don't see an issue here, the

[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813782#comment-15813782 ] Saisai Shao commented on SPARK-19090: - I tested with Spark 2.0 and latest master (2.2.0-SNAPSHOT),

[jira] [Resolved] (SPARK-5511) [SQL] Possible optimisations for predicate pushdowns from Spark SQL to Parquet

2017-01-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-5511. - Resolution: Invalid To my knowledge, for 1., unless we are going to rewrite the Parquet filter

[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813723#comment-15813723 ] nirav patel commented on SPARK-19090: - 1.5.2 > Dynamic Resource Allocation not respecting

[jira] [Commented] (SPARK-11569) StringIndexer transform fails when column contains nulls

2017-01-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813712#comment-15813712 ] Joseph K. Bradley commented on SPARK-11569: --- Hi all, I'm sorry for not following up on this,

[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813704#comment-15813704 ] Saisai Shao commented on SPARK-19090: - Thanks for your elaboration, would you please tell which

[jira] [Commented] (SPARK-15034) Use the value of spark.sql.warehouse.dir as the warehouse location instead of using hive.metastore.warehouse.dir

2017-01-09 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813703#comment-15813703 ] nirav patel commented on SPARK-15034: - Is this documented on spark 2.x documents? I don't see it

[jira] [Comment Edited] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813689#comment-15813689 ] nirav patel edited comment on SPARK-19090 at 1/10/17 3:15 AM: -- [~jerryshao]

[jira] [Comment Edited] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813689#comment-15813689 ] nirav patel edited comment on SPARK-19090 at 1/10/17 3:14 AM: -- [~jerryshao]

[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813689#comment-15813689 ] nirav patel commented on SPARK-19090: - [~jerryshao] "spark.executor.cores" is to tell spark AM to

[jira] [Commented] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813632#comment-15813632 ] Nan Zhu commented on SPARK-18905: - [~zsxwing] If you agree on the conclusion above, I will file a PR >

[jira] [Closed] (SPARK-19078) hashingTF,ChiSqSelector,IDF,StandardScaler,PCA transform avoid extra vector conversion

2017-01-09 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng closed SPARK-19078. Resolution: Duplicate > hashingTF,ChiSqSelector,IDF,StandardScaler,PCA transform avoid extra

[jira] [Commented] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813560#comment-15813560 ] Nan Zhu commented on SPARK-18905: - eat my words... when we have queued up batches, we do need

[jira] [Commented] (SPARK-18959) invalid resource statistics for standalone cluster

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813510#comment-15813510 ] Apache Spark commented on SPARK-18959: -- User 'hustfxj' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813459#comment-15813459 ] Nan Zhu edited comment on SPARK-18905 at 1/10/17 1:16 AM: -- yeah, but the

[jira] [Commented] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813459#comment-15813459 ] Nan Zhu commented on SPARK-18905: - yeah, but the downTime including all batches from "checkpoint time" to

[jira] [Commented] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813455#comment-15813455 ] Shixiong Zhu commented on SPARK-18905: -- [~CodingCat] I think `pendingTime` is the jobs that have

[jira] [Comment Edited] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813434#comment-15813434 ] Nan Zhu edited comment on SPARK-18905 at 1/10/17 1:05 AM: -- Hi, [~zsxwing]

[jira] [Commented] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813434#comment-15813434 ] Nan Zhu commented on SPARK-18905: - Hi, [~zsxwing] Thanks for the reply, After testing in our

[jira] [Commented] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813405#comment-15813405 ] Shixiong Zhu commented on SPARK-18905: -- Sorry for the late reply. Yeah, good catch. However, even if

[jira] [Commented] (SPARK-19110) DistributedLDAModel returns different logPrior for original and loaded model

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813315#comment-15813315 ] Apache Spark commented on SPARK-19110: -- User 'wangmiao1981' has created a pull request for this

[jira] [Assigned] (SPARK-19142) spark.kmeans should take seed, initSteps, and tol as parameters

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19142: Assignee: (was: Apache Spark) > spark.kmeans should take seed, initSteps, and tol as

[jira] [Assigned] (SPARK-19142) spark.kmeans should take seed, initSteps, and tol as parameters

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19142: Assignee: Apache Spark > spark.kmeans should take seed, initSteps, and tol as parameters

[jira] [Commented] (SPARK-19142) spark.kmeans should take seed, initSteps, and tol as parameters

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813258#comment-15813258 ] Apache Spark commented on SPARK-19142: -- User 'wangmiao1981' has created a pull request for this

[jira] [Created] (SPARK-19142) spark.kmeans should take seed, initSteps, and tol as parameters

2017-01-09 Thread Miao Wang (JIRA)
Miao Wang created SPARK-19142: - Summary: spark.kmeans should take seed, initSteps, and tol as parameters Key: SPARK-19142 URL: https://issues.apache.org/jira/browse/SPARK-19142 Project: Spark

[jira] [Updated] (SPARK-19137) Garbage left in source tree after SQL tests are run

2017-01-09 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-19137: -- Component/s: Structured Streaming > Garbage left in source tree after SQL tests are run >

[jira] [Assigned] (SPARK-19137) Garbage left in source tree after SQL tests are run

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19137: Assignee: Apache Spark > Garbage left in source tree after SQL tests are run >

[jira] [Assigned] (SPARK-19137) Garbage left in source tree after SQL tests are run

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19137: Assignee: (was: Apache Spark) > Garbage left in source tree after SQL tests are run >

[jira] [Commented] (SPARK-19137) Garbage left in source tree after SQL tests are run

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813230#comment-15813230 ] Apache Spark commented on SPARK-19137: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2017-01-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813205#comment-15813205 ] Shixiong Zhu commented on SPARK-17463: -- [~sunil.rangwani] could you provide a simple reproducer? I ran

[jira] [Commented] (SPARK-19137) Garbage left in source tree after SQL tests are run

2017-01-09 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813192#comment-15813192 ] Dongjoon Hyun commented on SPARK-19137: --- Hi, [~vanzin]. I'll make a PR for this. > Garbage left in

[jira] [Resolved] (SPARK-18866) Codegen fails with cryptic error if regexp_replace() output column is not aliased

2017-01-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-18866. Resolution: Duplicate Fix Version/s: 2.2.0 2.1.1 > Codegen fails with

[jira] [Commented] (SPARK-18866) Codegen fails with cryptic error if regexp_replace() output column is not aliased

2017-01-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813172#comment-15813172 ] Josh Rosen commented on SPARK-18866: Yep, that's it. This should be fixed by Burak's patch. >

[jira] [Resolved] (SPARK-18952) regex strings not properly escaped in codegen for aggregations

2017-01-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-18952. Resolution: Fixed Assignee: Burak Yavuz Fix Version/s: 2.2.0

[jira] [Resolved] (SPARK-19138) Python: new HiveContext will use a stopped SparkContext

2017-01-09 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue resolved SPARK-19138. --- Resolution: Duplicate > Python: new HiveContext will use a stopped SparkContext >

[jira] [Updated] (SPARK-19141) VectorAssembler metadata causing memory issues

2017-01-09 Thread Antonia Oprescu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antonia Oprescu updated SPARK-19141: Description: VectorAssembler produces unnecessary metadata that overflows the Java heap in

[jira] [Created] (SPARK-19141) VectorAssembler metadata causing memory issues

2017-01-09 Thread Antonia Oprescu (JIRA)
Antonia Oprescu created SPARK-19141: --- Summary: VectorAssembler metadata causing memory issues Key: SPARK-19141 URL: https://issues.apache.org/jira/browse/SPARK-19141 Project: Spark Issue

[jira] [Assigned] (SPARK-19138) Python: new HiveContext will use a stopped SparkContext

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19138: Assignee: (was: Apache Spark) > Python: new HiveContext will use a stopped

[jira] [Assigned] (SPARK-19139) AES-based authentication mechanism for Spark

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19139: Assignee: Apache Spark > AES-based authentication mechanism for Spark >

[jira] [Assigned] (SPARK-19138) Python: new HiveContext will use a stopped SparkContext

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19138: Assignee: Apache Spark > Python: new HiveContext will use a stopped SparkContext >

[jira] [Assigned] (SPARK-19139) AES-based authentication mechanism for Spark

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19139: Assignee: (was: Apache Spark) > AES-based authentication mechanism for Spark >

[jira] [Commented] (SPARK-19139) AES-based authentication mechanism for Spark

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812988#comment-15812988 ] Apache Spark commented on SPARK-19139: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19140) Allow update mode for non-aggregation streaming queries

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19140: Assignee: Apache Spark (was: Shixiong Zhu) > Allow update mode for non-aggregation

[jira] [Assigned] (SPARK-19140) Allow update mode for non-aggregation streaming queries

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19140: Assignee: Shixiong Zhu (was: Apache Spark) > Allow update mode for non-aggregation

[jira] [Commented] (SPARK-19140) Allow update mode for non-aggregation streaming queries

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812987#comment-15812987 ] Apache Spark commented on SPARK-19140: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-19138) Python: new HiveContext will use a stopped SparkContext

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812984#comment-15812984 ] Apache Spark commented on SPARK-19138: -- User 'rdblue' has created a pull request for this issue:

[jira] [Created] (SPARK-19140) Allow update mode for non-aggregation streaming queries

2017-01-09 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-19140: Summary: Allow update mode for non-aggregation streaming queries Key: SPARK-19140 URL: https://issues.apache.org/jira/browse/SPARK-19140 Project: Spark

[jira] [Created] (SPARK-19139) AES-based authentication mechanism for Spark

2017-01-09 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-19139: -- Summary: AES-based authentication mechanism for Spark Key: SPARK-19139 URL: https://issues.apache.org/jira/browse/SPARK-19139 Project: Spark Issue Type:

[jira] [Commented] (SPARK-19137) Garbage left in source tree after SQL tests are run

2017-01-09 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812915#comment-15812915 ] Dongjoon Hyun commented on SPARK-19137: --- +1 > Garbage left in source tree after SQL tests are run

[jira] [Created] (SPARK-19138) Python: new HiveContext will use a stopped SparkContext

2017-01-09 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-19138: - Summary: Python: new HiveContext will use a stopped SparkContext Key: SPARK-19138 URL: https://issues.apache.org/jira/browse/SPARK-19138 Project: Spark Issue

[jira] [Created] (SPARK-19137) Garbage left in source tree after SQL tests are run

2017-01-09 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-19137: -- Summary: Garbage left in source tree after SQL tests are run Key: SPARK-19137 URL: https://issues.apache.org/jira/browse/SPARK-19137 Project: Spark

[jira] [Commented] (SPARK-18113) Sending AskPermissionToCommitOutput failed, driver enter into task deadloop

2017-01-09 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812819#comment-15812819 ] Andrew Ash commented on SPARK-18113: I've done some more diagnosis on an example I saw, and think

[jira] [Commented] (SPARK-3877) The exit code of spark-submit is still 0 when an yarn application fails

2017-01-09 Thread Joshua Caplan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812752#comment-15812752 ] Joshua Caplan commented on SPARK-3877: -- I think you have created a race condition with this fix which

[jira] [Updated] (SPARK-19123) KeyProviderException when reading Azure Blobs from Apache Spark

2017-01-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-19123: -- Target Version/s: (was: 2.0.0) Labels: (was: newbie) > KeyProviderException when

[jira] [Commented] (SPARK-18952) regex strings not properly escaped in codegen for aggregations

2017-01-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812702#comment-15812702 ] Apache Spark commented on SPARK-18952: -- User 'brkyvz' has created a pull request for this issue:

[jira] [Commented] (SPARK-19125) Streaming Duration by Count

2017-01-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812690#comment-15812690 ] Sean Owen commented on SPARK-19125: --- Yes, I don't think a distributed system, even, is a great

[jira] [Resolved] (SPARK-19020) Cardinality estimation of aggregate operator

2017-01-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-19020. - Resolution: Fixed Assignee: Zhenhua Wang Fix Version/s: 2.2.0 > Cardinality

[jira] [Commented] (SPARK-18917) Dataframe - Time Out Issues / Taking long time in append mode on object stores

2017-01-09 Thread Anbu Cheeralan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812608#comment-15812608 ] Anbu Cheeralan commented on SPARK-18917: I agree the Hadoop fix will reduce the recursive calls.

[jira] [Updated] (SPARK-18665) Spark ThriftServer jobs where are canceled are still “STARTED”

2017-01-09 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cen yuhai updated SPARK-18665: -- Affects Version/s: 2.1.0 > Spark ThriftServer jobs where are canceled are still “STARTED” >

[jira] [Updated] (SPARK-18665) Spark ThriftServer jobs where are canceled are still “STARTED”

2017-01-09 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cen yuhai updated SPARK-18665: -- Affects Version/s: 2.0.2 > Spark ThriftServer jobs where are canceled are still “STARTED” >

[jira] [Commented] (SPARK-13450) SortMergeJoin will OOM when join rows have lot of same keys

2017-01-09 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812550#comment-15812550 ] Zhan Zhang commented on SPARK-13450: ExternalAppendOnlyMap estimate the size of the data saved. In

[jira] [Comment Edited] (SPARK-13450) SortMergeJoin will OOM when join rows have lot of same keys

2017-01-09 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812508#comment-15812508 ] Tejas Patil edited comment on SPARK-13450 at 1/9/17 6:48 PM: - I have seen
