[jira] [Commented] (SPARK-17163) Decide on unified multinomial and binary logistic regression interfaces

2016-08-24 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436335#comment-15436335 ] DB Tsai commented on SPARK-17163: - I voted for merging into one interface as well. Since binary LOR can

[jira] [Commented] (SPARK-17163) Decide on unified multinomial and binary logistic regression interfaces

2016-08-24 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436326#comment-15436326 ] Seth Hendrickson commented on SPARK-17163: -- Good catch, thanks! > Decide on unified multinomial

[jira] [Commented] (SPARK-17201) Investigate numerical instability for MLOR without regularization

2016-08-24 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436318#comment-15436318 ] DB Tsai commented on SPARK-17201: - This makes sense. Let's keep an eye on this, and figure out the

[jira] [Issue Comment Deleted] (SPARK-17232) Expecting same behavior after loading a dataframe with dots in column name

2016-08-24 Thread Jagadeesan A S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jagadeesan A S updated SPARK-17232: --- Comment: was deleted (was: I'm not able to reproduce the issue. {code:xml} scala>

[jira] [Commented] (SPARK-17232) Expecting same behavior after loading a dataframe with dots in column name

2016-08-24 Thread Jagadeesan A S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436289#comment-15436289 ] Jagadeesan A S commented on SPARK-17232: I'm not able to reproduce the issue. {code:xml} scala>

[jira] [Comment Edited] (SPARK-17232) Expecting same behavior after loading a dataframe with dots in column name

2016-08-24 Thread Jagadeesan A S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436285#comment-15436285 ] Jagadeesan A S edited comment on SPARK-17232 at 8/25/16 5:01 AM: - I'm not

[jira] [Commented] (SPARK-17232) Expecting same behavior after loading a dataframe with dots in column name

2016-08-24 Thread Jagadeesan A S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436285#comment-15436285 ] Jagadeesan A S commented on SPARK-17232: I'm not able to reproduce the issue. {code:xml} scala>

[jira] [Updated] (SPARK-17190) Removal of HiveSharedState

2016-08-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-17190: Assignee: Xiao Li > Removal of HiveSharedState > -- > >

[jira] [Resolved] (SPARK-17190) Removal of HiveSharedState

2016-08-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17190. - Issue resolved by pull request 14757 [https://github.com/apache/spark/pull/14757] > Removal of

[jira] [Updated] (SPARK-14381) Review spark.ml parity for feature transformers

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-14381: Fix Version/s: (was: 2.1.0) > Review spark.ml parity for feature transformers >

[jira] [Commented] (SPARK-14381) Review spark.ml parity for feature transformers

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436268#comment-15436268 ] Yanbo Liang commented on SPARK-14381: - Resolved this, thanks for working on it. > Review spark.ml

[jira] [Resolved] (SPARK-14381) Review spark.ml parity for feature transformers

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-14381. - Resolution: Done Assignee: Xusen Yin Fix Version/s: 2.1.0 > Review spark.ml

[jira] [Comment Edited] (SPARK-14378) Review spark.ml parity for regression, except trees

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436264#comment-15436264 ] Yanbo Liang edited comment on SPARK-14378 at 8/25/16 4:30 AM: -- *

[jira] [Comment Edited] (SPARK-14378) Review spark.ml parity for regression, except trees

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436264#comment-15436264 ] Yanbo Liang edited comment on SPARK-14378 at 8/25/16 4:29 AM: -- *

[jira] [Comment Edited] (SPARK-14378) Review spark.ml parity for regression, except trees

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436264#comment-15436264 ] Yanbo Liang edited comment on SPARK-14378 at 8/25/16 4:26 AM: -- *

[jira] [Assigned] (SPARK-14378) Review spark.ml parity for regression, except trees

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang reassigned SPARK-14378: --- Assignee: Yanbo Liang > Review spark.ml parity for regression, except trees >

[jira] [Comment Edited] (SPARK-14378) Review spark.ml parity for regression, except trees

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436264#comment-15436264 ] Yanbo Liang edited comment on SPARK-14378 at 8/25/16 4:25 AM: -- *

[jira] [Resolved] (SPARK-17228) Not infer/propagate non-deterministic constraints

2016-08-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17228. - Resolution: Fixed Assignee: Sameer Agarwal Fix Version/s: 2.1.0

[jira] [Commented] (SPARK-14378) Review spark.ml parity for regression, except trees

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436264#comment-15436264 ] Yanbo Liang commented on SPARK-14378: - *

[jira] [Updated] (SPARK-17066) dateFormat should be used when writing dataframes as csv files

2016-08-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17066: Fix Version/s: 2.1.0 2.0.1 > dateFormat should be used when writing dataframes

[jira] [Updated] (SPARK-16597) DataFrame DateType is written as an int(Days since epoch) by csv writer

2016-08-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16597: Fix Version/s: 2.1.0 2.0.1 > DataFrame DateType is written as an int(Days since

[jira] [Resolved] (SPARK-16216) CSV data source does not write date and timestamp correctly

2016-08-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-16216. - Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 > CSV data source does

[jira] [Updated] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-17204: --- Description: We use the OFF_HEAP storage level extensively with great success. We've tried

[jira] [Commented] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436222#comment-15436222 ] Saisai Shao commented on SPARK-17204: - Yes, I could reproduce this issue, but not constantly,

[jira] [Comment Edited] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436218#comment-15436218 ] Michael Allman edited comment on SPARK-17204 at 8/25/16 3:36 AM: - I would

[jira] [Comment Edited] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436218#comment-15436218 ] Michael Allman edited comment on SPARK-17204 at 8/25/16 3:37 AM: - I would

[jira] [Commented] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436218#comment-15436218 ] Michael Allman commented on SPARK-17204: I would think that, but `sc.range(0, 0)` throws the

[jira] [Updated] (SPARK-17233) Shuffle file will be left over the capacity when dynamic schedule is enabled in a long running case.

2016-08-24 Thread carlmartin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] carlmartin updated SPARK-17233: --- Description: When I execute some sql statement periodically in the long running thriftserver, I

[jira] [Updated] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-17204: --- Description: We use the OFF_HEAP storage level extensively with great success. We've tried

[jira] [Commented] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436213#comment-15436213 ] Saisai Shao commented on SPARK-17204: - I think to reflect the issue {{sc.range(0, 0)}} should be

[jira] [Updated] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-17204: --- Description: We use the OFF_HEAP storage level extensively with great success. We've tried

[jira] [Created] (SPARK-17233) Shuffle file will be left over the capacity when dynamic schedule is enabled in a long running case.

2016-08-24 Thread carlmartin (JIRA)
carlmartin created SPARK-17233: -- Summary: Shuffle file will be left over the capacity when dynamic schedule is enabled in a long running case. Key: SPARK-17233 URL: https://issues.apache.org/jira/browse/SPARK-17233

[jira] [Commented] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436205#comment-15436205 ] Saisai Shao commented on SPARK-17204: - No, I tested in yarn cluster, not local mode. > Spark 2.0 off

[jira] [Comment Edited] (SPARK-17163) Decide on unified multinomial and binary logistic regression interfaces

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436191#comment-15436191 ] Yanbo Liang edited comment on SPARK-17163 at 8/25/16 3:22 AM: -- Exposing a

[jira] [Comment Edited] (SPARK-17163) Decide on unified multinomial and binary logistic regression interfaces

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436191#comment-15436191 ] Yanbo Liang edited comment on SPARK-17163 at 8/25/16 3:22 AM: -- Exposing a

[jira] [Commented] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436202#comment-15436202 ] Michael Allman commented on SPARK-17204: Hi [~jerryshao]. I wonder if you're testing in local

[jira] [Comment Edited] (SPARK-17163) Decide on unified multinomial and binary logistic regression interfaces

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436191#comment-15436191 ] Yanbo Liang edited comment on SPARK-17163 at 8/25/16 3:14 AM: -- Exposing a

[jira] [Comment Edited] (SPARK-17163) Decide on unified multinomial and binary logistic regression interfaces

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434412#comment-15434412 ] Yanbo Liang edited comment on SPARK-17163 at 8/25/16 3:12 AM: -- I think it's

[jira] [Commented] (SPARK-17163) Decide on unified multinomial and binary logistic regression interfaces

2016-08-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436191#comment-15436191 ] Yanbo Liang commented on SPARK-17163: - Exposing a {{family}} or similar parameter to control pivoting

[jira] [Commented] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436179#comment-15436179 ] Saisai Shao commented on SPARK-17204: - It works OK in my local test with latest build: {code} val

[jira] [Commented] (SPARK-15382) monotonicallyIncreasingId doesn't work when data is upsampled

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436172#comment-15436172 ] Apache Spark commented on SPARK-15382: -- User 'maropu' has created a pull request for this issue:

[jira] [Commented] (SPARK-15382) monotonicallyIncreasingId doesn't work when data is upsampled

2016-08-24 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436171#comment-15436171 ] Takeshi Yamamuro commented on SPARK-15382: -- Sorry, but the master still has this bug. I made a

[jira] [Commented] (SPARK-17226) Allow defining multiple date formats per column in csv

2016-08-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436158#comment-15436158 ] Hyukjin Kwon commented on SPARK-17226: -- Codes to reproduce and suggestion maybe rather than just

[jira] [Commented] (SPARK-17222) Support multline csv records

2016-08-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436154#comment-15436154 ] Hyukjin Kwon commented on SPARK-17222: -- Here is *related* PR

[jira] [Commented] (SPARK-17227) Allow configuring record delimiter in csv

2016-08-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436150#comment-15436150 ] Hyukjin Kwon commented on SPARK-17227: -- Ah, SPARK-17222 is about multiple-lines but IMHO it might

[jira] [Commented] (SPARK-17227) Allow configuring record delimiter in csv

2016-08-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436147#comment-15436147 ] Hyukjin Kwon commented on SPARK-17227: -- We may have to open a JIRA to deal with multiple-lines

[jira] [Commented] (SPARK-17219) QuantileDiscretizer does strange things with NaN values

2016-08-24 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436143#comment-15436143 ] Vincent commented on SPARK-17219: - for this scenario, we can add a new parameter for QuantileDiscretizer,

[jira] [Commented] (SPARK-17227) Allow configuring record delimiter in csv

2016-08-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436140#comment-15436140 ] Hyukjin Kwon commented on SPARK-17227: -- Also, it would be great if the JIRA has an example and

[jira] [Commented] (SPARK-17227) Allow configuring record delimiter in csv

2016-08-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436139#comment-15436139 ] Hyukjin Kwon commented on SPARK-17227: -- If I remember this correctly, we are not using that

[jira] [Commented] (SPARK-17219) QuantileDiscretizer does strange things with NaN values

2016-08-24 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436136#comment-15436136 ] Vincent commented on SPARK-17219: - for cases where only null and non-null buckets are needed, I guess we

[jira] [Commented] (SPARK-16216) CSV data source does not write date and timestamp correctly

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436124#comment-15436124 ] Apache Spark commented on SPARK-16216: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Commented] (SPARK-17110) Pyspark with locality ANY throw java.io.StreamCorruptedException

2016-08-24 Thread Gen TANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436120#comment-15436120 ] Gen TANG commented on SPARK-17110: -- [~radost...@gmail.com], It seems spark scala doesn't have this bug

[jira] [Commented] (SPARK-17163) Decide on unified multinomial and binary logistic regression interfaces

2016-08-24 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436033#comment-15436033 ] Seth Hendrickson commented on SPARK-17163: -- BTW, I am happy to take care of merging the

[jira] [Commented] (SPARK-17156) Add multiclass logistic regression Scala Example

2016-08-24 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436004#comment-15436004 ] Miao Wang commented on SPARK-17156: --- Two quick comments: 1). Add some comments like in the

[jira] [Commented] (SPARK-17157) Add multiclass logistic regression SparkR Wrapper

2016-08-24 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435996#comment-15435996 ] Miao Wang commented on SPARK-17157: --- Start working on it now. > Add multiclass logistic regression

[jira] [Assigned] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17231: Assignee: (was: Apache Spark) > Avoid building debug or trace log messages unless the

[jira] [Assigned] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17231: Assignee: Apache Spark > Avoid building debug or trace log messages unless the respective

[jira] [Commented] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435958#comment-15435958 ] Apache Spark commented on SPARK-17231: -- User 'mallman' has created a pull request for this issue:

[jira] [Updated] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-17231: --- Description: While debugging the performance of a large GraphX connected components

[jira] [Comment Edited] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435767#comment-15435767 ] Michael Allman edited comment on SPARK-17231 at 8/24/16 11:29 PM: -- I've

[jira] [Updated] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-17231: --- Attachment: master 2.jpg logging_perf_improvements 2.jpg > Avoid building

[jira] [Created] (SPARK-17232) Expecting same behavior after loading a dataframe with dots in column name

2016-08-24 Thread Louis Salin (JIRA)
Louis Salin created SPARK-17232: --- Summary: Expecting same behavior after loading a dataframe with dots in column name Key: SPARK-17232 URL: https://issues.apache.org/jira/browse/SPARK-17232 Project:

[jira] [Updated] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-17231: --- Description: While debugging the performance of a large GraphX connected components

[jira] [Updated] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-17231: --- Description: While debugging the performance of a large GraphX connected components

[jira] [Commented] (SPARK-17123) Performing set operations that combine string and date / timestamp columns may result in generated projection code which doesn't compile

2016-08-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435773#comment-15435773 ] Dongjoon Hyun commented on SPARK-17123: --- Hi, [~joshrosen]. I'll make a PR for this. > Performing

[jira] [Updated] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-17231: --- Description: While debugging the performance of a large GraphX connected components

[jira] [Updated] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-17231: --- Description: While debugging the performance of a large GraphX connected components

[jira] [Commented] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435767#comment-15435767 ] Michael Allman commented on SPARK-17231: Note that in the attached screenshots, all stats are the

[jira] [Updated] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-17231: --- Attachment: logging_perf_improvements.jpg master.jpg > Avoid building debug

[jira] [Updated] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-17231: --- Description: While debugging the performance of a large GraphX connected components

[jira] [Created] (SPARK-17231) Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-24 Thread Michael Allman (JIRA)
Michael Allman created SPARK-17231: -- Summary: Avoid building debug or trace log messages unless the respective log level is enabled Key: SPARK-17231 URL: https://issues.apache.org/jira/browse/SPARK-17231

[jira] [Commented] (SPARK-17211) Broadcast join produces incorrect results

2016-08-24 Thread Himanish Kushary (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435745#comment-15435745 ] Himanish Kushary commented on SPARK-17211: -- I ran the following in a Databricks environment with

[jira] [Assigned] (SPARK-17230) Writing decimal to csv will result empty string if the decimal exceeds (20, 18)

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17230: Assignee: Davies Liu (was: Apache Spark) > Writing decimal to csv will result empty

[jira] [Assigned] (SPARK-17230) Writing decimal to csv will result empty string if the decimal exceeds (20, 18)

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17230: Assignee: Apache Spark (was: Davies Liu) > Writing decimal to csv will result empty

[jira] [Commented] (SPARK-17230) Writing decimal to csv will result empty string if the decimal exceeds (20, 18)

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435725#comment-15435725 ] Apache Spark commented on SPARK-17230: -- User 'davies' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17120) Analyzer incorrectly optimizes plan to empty LocalRelation

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17120: Assignee: Apache Spark > Analyzer incorrectly optimizes plan to empty LocalRelation >

[jira] [Assigned] (SPARK-17120) Analyzer incorrectly optimizes plan to empty LocalRelation

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17120: Assignee: (was: Apache Spark) > Analyzer incorrectly optimizes plan to empty

[jira] [Commented] (SPARK-17226) Allow defining multiple date formats per column in csv

2016-08-24 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435709#comment-15435709 ] Robert Kruszewski commented on SPARK-17226: --- Anything in particular you have in mind? I should

[jira] [Assigned] (SPARK-17099) Incorrect result when HAVING clause is added to group by query

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17099: Assignee: (was: Apache Spark) > Incorrect result when HAVING clause is added to group

[jira] [Commented] (SPARK-17120) Analyzer incorrectly optimizes plan to empty LocalRelation

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435714#comment-15435714 ] Apache Spark commented on SPARK-17120: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Commented] (SPARK-17099) Incorrect result when HAVING clause is added to group by query

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435711#comment-15435711 ] Apache Spark commented on SPARK-17099: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17099) Incorrect result when HAVING clause is added to group by query

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17099: Assignee: Apache Spark > Incorrect result when HAVING clause is added to group by query >

[jira] [Updated] (SPARK-17226) Allow defining multiple date formats per column in csv

2016-08-24 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski updated SPARK-17226: -- Description: Useful to have fallbacks in case of messy input and different columns can

[jira] [Updated] (SPARK-17225) Support multiple null values in csv files

2016-08-24 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski updated SPARK-17225: -- Component/s: SQL > Support multiple null values in csv files >

[jira] [Updated] (SPARK-17224) Support skipping multiple header rows in csv

2016-08-24 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski updated SPARK-17224: -- Component/s: SQL > Support skipping multiple header rows in csv >

[jira] [Updated] (SPARK-17225) Support multiple null values in csv files

2016-08-24 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski updated SPARK-17225: -- Affects Version/s: 2.0.0 > Support multiple null values in csv files >

[jira] [Updated] (SPARK-17226) Allow defining multiple date formats per column in csv

2016-08-24 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski updated SPARK-17226: -- Component/s: SQL > Allow defining multiple date formats per column in csv >

[jira] [Updated] (SPARK-17224) Support skipping multiple header rows in csv

2016-08-24 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski updated SPARK-17224: -- Affects Version/s: 2.0.0 > Support skipping multiple header rows in csv >

[jira] [Updated] (SPARK-17226) Allow defining multiple date formats per column in csv

2016-08-24 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski updated SPARK-17226: -- Affects Version/s: 2.0.0 > Allow defining multiple date formats per column in csv >

[jira] [Updated] (SPARK-17227) Allow configuring record delimiter in csv

2016-08-24 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski updated SPARK-17227: -- Affects Version/s: 2.0.0 > Allow configuring record delimiter in csv >

[jira] [Updated] (SPARK-17227) Allow configuring record delimiter in csv

2016-08-24 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski updated SPARK-17227: -- Component/s: SQL > Allow configuring record delimiter in csv >

[jira] [Updated] (SPARK-17222) Support multline csv records

2016-08-24 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski updated SPARK-17222: -- Affects Version/s: 2.0.0 > Support multline csv records >

[jira] [Updated] (SPARK-17222) Support multline csv records

2016-08-24 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski updated SPARK-17222: -- Component/s: SQL > Support multline csv records > > >

[jira] [Updated] (SPARK-16334) SQL query on parquet table java.lang.ArrayIndexOutOfBoundsException

2016-08-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16334: -- Labels: (was: sql) Fix Version/s: (was: 2.0.1) (was: 2.1.0)

[jira] [Assigned] (SPARK-17229) Postgres JDBC dialect should not widen float and short types during reads

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17229: Assignee: Apache Spark (was: Josh Rosen) > Postgres JDBC dialect should not widen float

[jira] [Commented] (SPARK-17229) Postgres JDBC dialect should not widen float and short types during reads

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435665#comment-15435665 ] Apache Spark commented on SPARK-17229: -- User 'JoshRosen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17229) Postgres JDBC dialect should not widen float and short types during reads

2016-08-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17229: Assignee: Josh Rosen (was: Apache Spark) > Postgres JDBC dialect should not widen float

[jira] [Created] (SPARK-17230) Writing decimal to csv will result empty string if the decimal exceeds (20, 18)

2016-08-24 Thread Davies Liu (JIRA)
Davies Liu created SPARK-17230: -- Summary: Writing decimal to csv will result empty string if the decimal exceeds (20, 18) Key: SPARK-17230 URL: https://issues.apache.org/jira/browse/SPARK-17230 Project:

[jira] [Commented] (SPARK-17219) QuantileDiscretizer does strange things with NaN values

2016-08-24 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435651#comment-15435651 ] Barry Becker commented on SPARK-17219: -- If the decision is to have an additional null/NaN bucket,

[jira] [Created] (SPARK-17229) Postgres JDBC dialect should not widen float and short types during reads

2016-08-24 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-17229: -- Summary: Postgres JDBC dialect should not widen float and short types during reads Key: SPARK-17229 URL: https://issues.apache.org/jira/browse/SPARK-17229 Project: Spark
