[jira] [Updated] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-07 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-17835: Description: SPARK-14077 copied the {{NaiveBayes}} implementation from mllib to ml and left mllib

[jira] [Assigned] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17835: Assignee: Apache Spark > Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

[jira] [Commented] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557405#comment-15557405 ] Apache Spark commented on SPARK-17835: -- User 'yanboliang' has created a pull request

[jira] [Assigned] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17835: Assignee: (was: Apache Spark) > Optimize NaiveBayes mllib wrapper to eliminate extra p

[jira] [Updated] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-07 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-17835: Description: SPARK-14077 copied the {{NaiveBayes}} implementation from mllib to ml and left mllib

[jira] [Created] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-07 Thread Yanbo Liang (JIRA)
Yanbo Liang created SPARK-17835: --- Summary: Optimize NaiveBayes mllib wrapper to eliminate extra pass on data Key: SPARK-17835 URL: https://issues.apache.org/jira/browse/SPARK-17835 Project: Spark

[jira] [Updated] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-07 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-17835: Issue Type: Improvement (was: Bug) > Optimize NaiveBayes mllib wrapper to eliminate extra pass on

[jira] [Updated] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-07 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-17835: Description: SPARK-14077 copied the {{NaiveBayes}} implementation from mllib to ml and left ml as

[jira] [Commented] (SPARK-10502) tidy up the exception message text to be less verbose/"User friendly"

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557353#comment-15557353 ] Xiao Li commented on SPARK-10502: - In 2.0, we introduced a new Parser. Thus, this becomes

[jira] [Closed] (SPARK-10502) tidy up the exception message text to be less verbose/"User friendly"

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10502. --- Resolution: Won't Fix > tidy up the exception message text to be less verbose/"User friendly" > -

[jira] [Closed] (SPARK-10318) Getting issue in spark connectivity with cassandra

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10318. --- Resolution: Fixed > Getting issue in spark connectivity with cassandra >

[jira] [Comment Edited] (SPARK-10221) RowReaderFactory does not work with blobs

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557326#comment-15557326 ] Xiao Li edited comment on SPARK-10221 at 10/8/16 6:09 AM: -- This

[jira] [Closed] (SPARK-10221) RowReaderFactory does not work with blobs

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10221. --- Resolution: Won't Fix > RowReaderFactory does not work with blobs > -

[jira] [Commented] (SPARK-10221) RowReaderFactory does not work with blobs

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557326#comment-15557326 ] Xiao Li commented on SPARK-10221: - This should be the bug in the connector. Thus, close i

[jira] [Closed] (SPARK-8377) Identifiers caseness information should be available at any time

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-8377. -- Resolution: Fixed Please reopen it, if you still hit this issue. Thanks! > Identifiers caseness information sho

[jira] [Commented] (SPARK-8377) Identifiers caseness information should be available at any time

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557316#comment-15557316 ] Xiao Li commented on SPARK-8377: We can use backticks when users need to enforce the case.

[jira] [Resolved] (SPARK-9685) "Unsupported dataType: char(X)" in Hive

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-9685. Resolution: Fixed > "Unsupported dataType: char(X)" in Hive > --- > >

[jira] [Commented] (SPARK-9685) "Unsupported dataType: char(X)" in Hive

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557308#comment-15557308 ] Xiao Li commented on SPARK-9685: This has been resolved by another PR (SPARK-11628). Close

[jira] [Closed] (SPARK-10174) refactor out project, filter, ordering generator from SparkPlan

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10174. --- Resolution: Won't Fix Close it since the PR has been closed. > refactor out project, filter, ordering genera

[jira] [Closed] (SPARK-10154) remove the no-longer-necessary CatalystScan

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-10154. --- Resolution: Won't Fix Keep it based on the PR discussion > remove the no-longer-necessary CatalystScan > ---

[jira] [Closed] (SPARK-8436) Inconsistent behavior when converting a Timestamp column to Integer/Long and then convert back to Timestamp

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-8436. -- Resolution: Won't Fix > Inconsistent behavior when converting a Timestamp column to Integer/Long and > then con

[jira] [Commented] (SPARK-8436) Inconsistent behavior when converting a Timestamp column to Integer/Long and then convert back to Timestamp

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557282#comment-15557282 ] Xiao Li commented on SPARK-8436: After reading the PR description, this is not valid now.

[jira] [Commented] (SPARK-17825) Expose log likelihood of EM algorithm in mllib

2016-10-07 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557281#comment-15557281 ] Yanbo Liang commented on SPARK-17825: - Sure. You can definitely contribute on this is

[jira] [Closed] (SPARK-14229) PySpark DataFrame.rdd's can't be saved to an arbitrary Hadoop OutputFormat

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-14229. --- Resolution: Won't Fix I don't think this is really a bug - if you want to save from dataframes there is the

[jira] [Commented] (SPARK-13585) addPyFile behavior change between 1.6 and before

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557248#comment-15557248 ] holdenk commented on SPARK-13585: - What is the use case for overwriting the old pyFile? T

[jira] [Commented] (SPARK-13606) Error from python worker: /usr/local/bin/python2.7: undefined symbol: _PyCodec_LookupTextEncoding

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557244#comment-15557244 ] holdenk commented on SPARK-13606: - Are you still experiencing this? > Error from python

[jira] [Commented] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557242#comment-15557242 ] holdenk commented on SPARK-13534: - For people following along arrow is in the middle of v

[jira] [Closed] (SPARK-13368) PySpark JavaModel fails to extract params from Spark side automatically

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-13368. --- Resolution: Fixed > PySpark JavaModel fails to extract params from Spark side automatically > ---

[jira] [Commented] (SPARK-13368) PySpark JavaModel fails to extract params from Spark side automatically

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557239#comment-15557239 ] holdenk commented on SPARK-13368: - It seems that we don't have this in the example anymor

[jira] [Updated] (SPARK-9965) Scala, Python SQLContext input methods' deprecation statuses do not match

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-9965: --- Component/s: (was: SQL) > Scala, Python SQLContext input methods' deprecation statuses do not match >

[jira] [Commented] (SPARK-9938) Constant folding in binaryComparison

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557235#comment-15557235 ] Xiao Li commented on SPARK-9938: After reading the PR discussion, I think we can first clo

[jira] [Closed] (SPARK-9938) Constant folding in binaryComparison

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-9938. -- Resolution: Won't Fix > Constant folding in binaryComparison > > >

[jira] [Commented] (SPARK-13303) Spark fails with pandas import error when pandas is not explicitly imported by user

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557230#comment-15557230 ] holdenk commented on SPARK-13303: - What about if we added a requirements file? We have on

[jira] [Commented] (SPARK-11722) Rdds could be different between orginal one and save-out-then-read-in one

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557226#comment-15557226 ] holdenk commented on SPARK-11722: - Is this still an issue you are experiencing and if so

[jira] [Commented] (SPARK-12776) Implement Python API for Datasets

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557224#comment-15557224 ] holdenk commented on SPARK-12776: - Just re-opening discussion here - the migration to dat

[jira] [Commented] (SPARK-9842) Push down Spark SQL UDF to datasource UDF

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557222#comment-15557222 ] Xiao Li commented on SPARK-9842: cc [~tsuresh] > Push down Spark SQL UDF to datasource UD

[jira] [Commented] (SPARK-12100) bug in spark/python/pyspark/rdd.py portable_hash()

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557219#comment-15557219 ] holdenk commented on SPARK-12100: - Just noting related progress in https://github.com/apa

[jira] [Commented] (SPARK-11874) DistributedCache for PySpark

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557217#comment-15557217 ] holdenk commented on SPARK-11874: - I think this is not intended to be supported, although

[jira] [Closed] (SPARK-12774) DataFrame.mapPartitions apply function operates on Pandas DataFrame instead of a generator or rows

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-12774. --- Resolution: Won't Fix In some ways yes avoiding unnecessary iteration can be good, but allowing Spark to spil

[jira] [Commented] (SPARK-9732) remove the unsafe -> safe conversion

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557203#comment-15557203 ] Xiao Li commented on SPARK-9732: Should we close this? > remove the unsafe -> safe conver

[jira] [Commented] (SPARK-11571) Twitter Api for PySpark

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557205#comment-15557205 ] holdenk commented on SPARK-11571: - Is there anything you are looking to do with this API?

[jira] [Commented] (SPARK-3600) RDD[Double] doesn't use primitive arrays for caching

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557195#comment-15557195 ] holdenk commented on SPARK-3600: Is this something we still want to work on or does `Datas

[jira] [Commented] (SPARK-3513) Provide a utility for running a function once on each executor

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557193#comment-15557193 ] holdenk commented on SPARK-3513: This seems closely related to SPARK-650 and SPARK-636 as

[jira] [Closed] (SPARK-9764) Spark SQL uses table metadata inconsistently

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-9764. -- Resolution: Fixed > Spark SQL uses table metadata inconsistently >

[jira] [Commented] (SPARK-9764) Spark SQL uses table metadata inconsistently

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557188#comment-15557188 ] Xiao Li commented on SPARK-9764: This should be resolved in the latest branch. Please chec

[jira] [Comment Edited] (SPARK-9764) Spark SQL uses table metadata inconsistently

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557188#comment-15557188 ] Xiao Li edited comment on SPARK-9764 at 10/8/16 4:38 AM: - This sho

[jira] [Commented] (SPARK-3348) Support user-defined SparkListeners properly

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557186#comment-15557186 ] holdenk commented on SPARK-3348: Is there still interest in seeing this happen? Should we

[jira] [Commented] (SPARK-9342) Spark SQL views don't work

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557184#comment-15557184 ] Xiao Li commented on SPARK-9342: This should be resolved since Spark 2.0. Please check it.

[jira] [Resolved] (SPARK-9342) Spark SQL views don't work

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-9342. Resolution: Fixed > Spark SQL views don't work > -- > > Key: SPARK-9

[jira] [Closed] (SPARK-3312) Add a groupByKey which returns a special GroupBy object like in pandas

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-3312. -- Resolution: Won't Fix > Add a groupByKey which returns a special GroupBy object like in pandas > ---

[jira] [Commented] (SPARK-17825) Expose log likelihood of EM algorithm in mllib

2016-10-07 Thread Lei Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557181#comment-15557181 ] Lei Wang commented on SPARK-17825: -- That's good. May I take part in this job? By the way

[jira] [Commented] (SPARK-3312) Add a groupByKey which returns a special GroupBy object like in pandas

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557182#comment-15557182 ] holdenk commented on SPARK-3312: I'm going to go ahead and close this, now that `Datasets`

[jira] [Commented] (SPARK-3132) Avoid serialization for Array[Byte] in TorrentBroadcast

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557178#comment-15557178 ] holdenk commented on SPARK-3132: Is there any progress on this or would it be ok for me to

[jira] [Closed] (SPARK-8957) Backport Hive 1.X support to Branch 1.4

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-8957. -- Resolution: Won't Fix Thanks. Close it now > Backport Hive 1.X support to Branch 1.4 >

[jira] [Commented] (SPARK-2722) Mechanism for escaping spark configs is not consistent

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557158#comment-15557158 ] holdenk commented on SPARK-2722: I think at this point trying to change the escaping of th

[jira] [Closed] (SPARK-1792) Missing Spark-Shell Configure Options

2016-10-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-1792. -- Resolution: Fixed > Missing Spark-Shell Configure Options > - > >

[jira] [Commented] (SPARK-2032) Add an RDD.samplePartitions method for partition-level sampling

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557153#comment-15557153 ] holdenk commented on SPARK-2032: I'm assuming since there hasn't been any activity for awh

[jira] [Commented] (SPARK-1865) Improve behavior of cleanup of disk state

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557149#comment-15557149 ] holdenk commented on SPARK-1865: So ALS specifically has a work around for this with clean

[jira] [Commented] (SPARK-1792) Missing Spark-Shell Configure Options

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557146#comment-15557146 ] holdenk commented on SPARK-1792: It feels like we've already got a pretty good mechanism f

[jira] [Closed] (SPARK-9309) Support DecimalType and TimestampType in UnsafeRowConverter

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-9309. -- Resolution: Won't Fix Based on the above discussion, it sounds this JIRA is not needed. Close it now > Support

[jira] [Updated] (SPARK-17825) Expose log likelihood of EM algorithm in mllib

2016-10-07 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-17825: Component/s: (was: MLlib) ML > Expose log likelihood of EM algorithm in mllib

[jira] [Commented] (SPARK-17825) Expose log likelihood of EM algorithm in mllib

2016-10-07 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557137#comment-15557137 ] Yanbo Liang commented on SPARK-17825: - [~is03wlei] This task depends on copying the G

[jira] [Commented] (SPARK-8957) Backport Hive 1.X support to Branch 1.4

2016-10-07 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557139#comment-15557139 ] Michael Armbrust commented on SPARK-8957: - Yeah, close it. > Backport Hive 1.X s

[jira] [Commented] (SPARK-1762) Add functionality to pin RDDs in cache

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557133#comment-15557133 ] holdenk commented on SPARK-1762: Is this something we are still interested in? I could see

[jira] [Updated] (SPARK-6802) User Defined Aggregate Function Refactoring

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-6802: --- Component/s: (was: SQL) > User Defined Aggregate Function Refactoring > --

[jira] [Updated] (SPARK-9189) Takes locality and the sum of partition length into account when partition is instance of HadoopPartition in operator coalesce

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-9189: --- Component/s: (was: SQL) Spark Core > Takes locality and the sum of partition length into

[jira] [Closed] (SPARK-9194) fix case-insensitive bug for aggregation expression which is not PartialAggregate

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-9194. -- Resolution: Won't Fix > fix case-insensitive bug for aggregation expression which is not > PartialAggregate > -

[jira] [Resolved] (SPARK-8624) DataFrameReader doesn't respect MERGE_SCHEMA setting for Parquet

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-8624. Resolution: Won't Fix > DataFrameReader doesn't respect MERGE_SCHEMA setting for Parquet > -

[jira] [Commented] (SPARK-8957) Backport Hive 1.X support to Branch 1.4

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557099#comment-15557099 ] Xiao Li commented on SPARK-8957: Is this still needed? Maybe we should close it? > Backpo

[jira] [Commented] (SPARK-10161) Support Pyspark shell over Mesos Cluster Mode

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557069#comment-15557069 ] holdenk commented on SPARK-10161: - That being said - I'm not sure I see the value of this

[jira] [Commented] (SPARK-10161) Support Pyspark shell over Mesos Cluster Mode

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557068#comment-15557068 ] holdenk commented on SPARK-10161: - I think this is an issue across cluster modes, maybe

[jira] [Commented] (SPARK-17219) QuantileDiscretizer does strange things with NaN values

2016-10-07 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557034#comment-15557034 ] Vincent commented on SPARK-17219: - [~josephkb] [~srowen] [~timhunter] let me know what I

[jira] [Commented] (SPARK-17219) QuantileDiscretizer does strange things with NaN values

2016-10-07 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557021#comment-15557021 ] Vincent commented on SPARK-17219: - in this PR (https://github.com/apache/spark/pull/14858)

[jira] [Updated] (SPARK-9487) Use the same num. worker threads in Scala/Python unit tests

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-9487: --- Labels: starter (was: ) > Use the same num. worker threads in Scala/Python unit tests > -

[jira] [Commented] (SPARK-9487) Use the same num. worker threads in Scala/Python unit tests

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557018#comment-15557018 ] holdenk commented on SPARK-9487: This will maybe break some tests in the process but it wo

[jira] [Commented] (SPARK-17782) Kafka 010 test is flaky

2016-10-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557013#comment-15557013 ] Apache Spark commented on SPARK-17782: -- User 'koeninger' has created a pull request

[jira] [Closed] (SPARK-8760) allow moving and symlinking binaries

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-8760. -- Resolution: Fixed This is a "partially fixed" but I think fixed is a close enough description. We don't use rea

[jira] [Closed] (SPARK-8757) Check missing and add user guide for MLlib Python API

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-8757. -- Resolution: Fixed All sub issues fixed, and well past 1.5 release. > Check missing and add user guide for MLlib

[jira] [Commented] (SPARK-8842) Spark SQL - Insert into table Issue

2016-10-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556986#comment-15556986 ] Xiao Li commented on SPARK-8842: Could you retry it in the latest master branch? Thanks!

[jira] [Commented] (SPARK-11272) Support importing and exporting event logs from HistoryServer web portal

2016-10-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556866#comment-15556866 ] Apache Spark commented on SPARK-11272: -- User 'ajbozarth' has created a pull request

[jira] [Commented] (SPARK-14503) spark.ml API for FPGrowth

2016-10-07 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556862#comment-15556862 ] yuhao yang commented on SPARK-14503: Yes, let me just send what I got. > spark.ml AP

[jira] [Assigned] (SPARK-17819) Specified database in JDBC URL is ignored when connecting to thriftserver

2016-10-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17819: Assignee: (was: Apache Spark) > Specified database in JDBC URL is ignored when connect

[jira] [Commented] (SPARK-17819) Specified database in JDBC URL is ignored when connecting to thriftserver

2016-10-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556857#comment-15556857 ] Apache Spark commented on SPARK-17819: -- User 'dongjoon-hyun' has created a pull requ

[jira] [Assigned] (SPARK-17819) Specified database in JDBC URL is ignored when connecting to thriftserver

2016-10-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17819: Assignee: Apache Spark > Specified database in JDBC URL is ignored when connecting to thri

[jira] [Closed] (SPARK-8719) Adding Python support for 1-sample, 2-sided Kolmogorov Smirnov Test

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-8719. -- Resolution: Duplicate > Adding Python support for 1-sample, 2-sided Kolmogorov Smirnov Test > --

[jira] [Updated] (SPARK-8605) Exclude files in StreamingContext. textFileStream(directory)

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-8605: --- Component/s: (was: PySpark) Streaming > Exclude files in StreamingContext. textFileStream

[jira] [Commented] (SPARK-8605) Exclude files in StreamingContext. textFileStream(directory)

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556785#comment-15556785 ] holdenk commented on SPARK-8605: This is semi-documented (namely only atomic moves are sup

[jira] [Commented] (SPARK-7177) Create standard way to wrap Spark CLI scripts for external projects

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556780#comment-15556780 ] holdenk commented on SPARK-7177: I've run into similar challenges when working on Sparklin

[jira] [Commented] (SPARK-7941) Cache Cleanup Failure when job is killed by Spark

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556775#comment-15556775 ] holdenk commented on SPARK-7941: Are you still experiencing this issue [~cqnguyen] or woul

[jira] [Updated] (SPARK-8780) Move Python doctest code example from models to algorithms

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-8780: --- Labels: starter (was: ) > Move Python doctest code example from models to algorithms > --

[jira] [Assigned] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2016-10-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17647: Assignee: Apache Spark > SQL LIKE does not handle backslashes correctly >

[jira] [Assigned] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2016-10-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17647: Assignee: (was: Apache Spark) > SQL LIKE does not handle backslashes correctly > -

[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2016-10-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556767#comment-15556767 ] Apache Spark commented on SPARK-17647: -- User 'jodersky' has created a pull request f

[jira] [Commented] (SPARK-8780) Move Python doctest code example from models to algorithms

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556763#comment-15556763 ] holdenk commented on SPARK-8780: Is this something we still want to do? This could be a gr

[jira] [Commented] (SPARK-6831) Document how to use external data sources

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556758#comment-15556758 ] holdenk commented on SPARK-6831: Is this something we are planning to do at all? It doesn'

[jira] [Closed] (SPARK-6780) Add saveAsTextFileByKey method for PySpark

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-6780. -- Resolution: Won't Fix Since SPARK-3533 is WON'T FIX this one should be too. > Add saveAsTextFileByKey method for

[jira] [Closed] (SPARK-7613) Serialization fails in pyspark for lambdas referencing class data members

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-7613. -- Resolution: Won't Fix I believe this is expected behaviour and the current best practice is simply to make a lo

[jira] [Commented] (SPARK-7638) Python API for pmml.export

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556732#comment-15556732 ] holdenk commented on SPARK-7638: Do we still want to do this or focus on adding PMML expor

[jira] [Commented] (SPARK-6174) Improve doc: Python ALS, MatrixFactorizationModel

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556720#comment-15556720 ] holdenk commented on SPARK-6174: I think Bryan did a good job of this I'd be in favour of

[jira] [Commented] (SPARK-5981) pyspark ML models should support predict/transform on vector within map

2016-10-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556714#comment-15556714 ] holdenk commented on SPARK-5981: I'm not sure porting the models to Python sounds like a g
