[jira] [Commented] (SPARK-18084) write.partitionBy() does not recognize nested columns that select() can access

2016-10-25 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606484#comment-15606484 ] Nicholas Chammas commented on SPARK-18084: -- cc [~marmbrus] - Dunno if this is actually a bug or

[jira] [Updated] (SPARK-18084) write.partitionBy() does not recognize nested columns that select() can access

2016-10-24 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-18084: - Issue Type: Bug (was: Improvement) > write.partitionBy() does not recognize nested

[jira] [Created] (SPARK-18084) write.partitionBy() does not recognize nested columns that select() can access

2016-10-24 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-18084: Summary: write.partitionBy() does not recognize nested columns that select() can access Key: SPARK-18084 URL: https://issues.apache.org/jira/browse/SPARK-18084

[jira] [Commented] (SPARK-12757) Use reference counting to prevent blocks from being evicted during reads

2016-10-24 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603211#comment-15603211 ] Nicholas Chammas commented on SPARK-12757: -- Just to link back, [~josephkb] is reporting that

[jira] [Closed] (SPARK-17976) Global options to spark-submit should not be position-sensitive

2016-10-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas closed SPARK-17976. Resolution: Not A Problem Ah, makes perfect sense. Would have realized that myself if I

[jira] [Created] (SPARK-17976) Global options to spark-submit should not be position-sensitive

2016-10-17 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-17976: Summary: Global options to spark-submit should not be position-sensitive Key: SPARK-17976 URL: https://issues.apache.org/jira/browse/SPARK-17976 Project:

[jira] [Comment Edited] (SPARK-14742) Redirect spark-ec2 doc to new location

2016-08-31 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15452150#comment-15452150 ] Nicholas Chammas edited comment on SPARK-14742 at 8/31/16 12:50 PM:

[jira] [Commented] (SPARK-14742) Redirect spark-ec2 doc to new location

2016-08-31 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15452150#comment-15452150 ] Nicholas Chammas commented on SPARK-14742: -- Sounds good to me. > Redirect spark-ec2 doc to new

[jira] [Commented] (SPARK-14742) Redirect spark-ec2 doc to new location

2016-08-30 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450626#comment-15450626 ] Nicholas Chammas commented on SPARK-14742: -- {quote} Otherwise the only way to get to this link

[jira] [Commented] (SPARK-14742) Redirect spark-ec2 doc to new location

2016-08-30 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450602#comment-15450602 ] Nicholas Chammas commented on SPARK-14742: -- http://spark.apache.org/docs/latest/ec2-scripts.html

[jira] [Updated] (SPARK-17220) Upgrade Py4J to 0.10.3

2016-08-26 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-17220: - Component/s: PySpark > Upgrade Py4J to 0.10.3

[jira] [Commented] (SPARK-14241) Output of monotonically_increasing_id lacks stable relation with rows of DataFrame

2016-08-25 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437616#comment-15437616 ] Nicholas Chammas commented on SPARK-14241: -- [~marmbrus] - Would it be tough to make this

[jira] [Commented] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428988#comment-15428988 ] Nicholas Chammas commented on SPARK-17025: -- {quote} We'd need to figure out a good design for

[jira] [Comment Edited] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417788#comment-15417788 ] Nicholas Chammas edited comment on SPARK-17025 at 8/11/16 7:33 PM: --- cc

[jira] [Comment Edited] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417788#comment-15417788 ] Nicholas Chammas edited comment on SPARK-17025 at 8/11/16 7:27 PM: --- cc

[jira] [Commented] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417788#comment-15417788 ] Nicholas Chammas commented on SPARK-17025: -- cc [~josephkb] [~mengxr] > Cannot persist PySpark

[jira] [Created] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-11 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-17025: Summary: Cannot persist PySpark ML Pipeline model that includes custom Transformer Key: SPARK-17025 URL: https://issues.apache.org/jira/browse/SPARK-17025

[jira] [Commented] (SPARK-16921) RDD/DataFrame persist() and cache() should return Python context managers

2016-08-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414067#comment-15414067 ] Nicholas Chammas commented on SPARK-16921: -- [~holdenk] - Probably won't be able to do it myself

[jira] [Created] (SPARK-16921) RDD/DataFrame persist() and cache() should return Python context managers

2016-08-05 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-16921: Summary: RDD/DataFrame persist() and cache() should return Python context managers Key: SPARK-16921 URL: https://issues.apache.org/jira/browse/SPARK-16921

[jira] [Closed] (SPARK-7505) Update PySpark DataFrame docs: encourage __getitem__, mark as experimental, etc.

2016-08-05 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas closed SPARK-7505. --- Resolution: Invalid Closing this as invalid as I believe these issues are no longer

[jira] [Commented] (SPARK-5312) Use sbt to detect new or changed public classes in PRs

2016-08-05 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409767#comment-15409767 ] Nicholas Chammas commented on SPARK-5312: - [~boyork] - Shall we close this? It doesn't look like

[jira] [Comment Edited] (SPARK-7146) Should ML sharedParams be a public API?

2016-08-02 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405300#comment-15405300 ] Nicholas Chammas edited comment on SPARK-7146 at 8/3/16 4:45 AM: - A quick

[jira] [Commented] (SPARK-7146) Should ML sharedParams be a public API?

2016-08-02 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405300#comment-15405300 ] Nicholas Chammas commented on SPARK-7146: - A quick update from a PySpark user: I am using

[jira] [Commented] (SPARK-16782) Use Sphinx autodoc to eliminate duplication of Python docstrings

2016-08-01 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402515#comment-15402515 ] Nicholas Chammas commented on SPARK-16782: -- Poking around a bit more, it seems like a possible

[jira] [Commented] (SPARK-16782) Use Sphinx autodoc to eliminate duplication of Python docstrings

2016-08-01 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402477#comment-15402477 ] Nicholas Chammas commented on SPARK-16782: -- Hmm never mind. I think I've misunderstood the

[jira] [Closed] (SPARK-16782) Use Sphinx autodoc to eliminate duplication of Python docstrings

2016-08-01 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas closed SPARK-16782. Resolution: Invalid > Use Sphinx autodoc to eliminate duplication of Python docstrings >

[jira] [Commented] (SPARK-12157) Support numpy types as return values of Python UDFs

2016-07-31 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401198#comment-15401198 ] Nicholas Chammas commented on SPARK-12157: -- OK. I've raised the issue of documenting this in

[jira] [Commented] (SPARK-16824) Add API docs for VectorUDT

2016-07-31 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401197#comment-15401197 ] Nicholas Chammas commented on SPARK-16824: -- cc [~josephkb] [~mengxr] - Should this type be

[jira] [Created] (SPARK-16824) Add API docs for VectorUDT

2016-07-31 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-16824: Summary: Add API docs for VectorUDT Key: SPARK-16824 URL: https://issues.apache.org/jira/browse/SPARK-16824 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-12157) Support numpy types as return values of Python UDFs

2016-07-31 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401179#comment-15401179 ] Nicholas Chammas commented on SPARK-12157: -- Thanks for the pointer, Maciej. It appears that

[jira] [Commented] (SPARK-12157) Support numpy types as return values of Python UDFs

2016-07-29 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399743#comment-15399743 ] Nicholas Chammas commented on SPARK-12157: -- It appears that it's not possible to have a UDF that

[jira] [Commented] (SPARK-12157) Support numpy types as return values of Python UDFs

2016-07-28 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398267#comment-15398267 ] Nicholas Chammas commented on SPARK-12157: -- I'm looking to define a UDF in PySpark that returns

[jira] [Commented] (SPARK-16782) Use Sphinx autodoc to eliminate duplication of Python docstrings

2016-07-28 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398059#comment-15398059 ] Nicholas Chammas commented on SPARK-16782: -- [~davies] [~joshrosen] - I can take this on if the

[jira] [Created] (SPARK-16782) Use Sphinx autodoc to eliminate duplication of Python docstrings

2016-07-28 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-16782: Summary: Use Sphinx autodoc to eliminate duplication of Python docstrings Key: SPARK-16782 URL: https://issues.apache.org/jira/browse/SPARK-16782 Project:

[jira] [Updated] (SPARK-16772) Correct API doc references to PySpark classes + formatting fixes

2016-07-28 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16772: - Summary: Correct API doc references to PySpark classes + formatting fixes (was: Correct

[jira] [Updated] (SPARK-16772) Correct API doc references to PySpark classes

2016-07-28 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16772: - Summary: Correct API doc references to PySpark classes (was: Correct API doc references

[jira] [Created] (SPARK-16772) Correct API doc references to DataType + other minor doc tweaks

2016-07-28 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-16772: Summary: Correct API doc references to DataType + other minor doc tweaks Key: SPARK-16772 URL: https://issues.apache.org/jira/browse/SPARK-16772 Project:

[jira] [Commented] (SPARK-7481) Add spark-cloud module to pull in aws+azure object store FS accessors; test integration

2016-07-22 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389589#comment-15389589 ] Nicholas Chammas commented on SPARK-7481: - [~ste...@apache.org] - Some relevant reading for you

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-07-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384535#comment-15384535 ] Nicholas Chammas commented on SPARK-12661: -- OK, sounds good to me. > Drop Python 2.6 support in

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-07-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384513#comment-15384513 ] Nicholas Chammas commented on SPARK-12661: -- Yes, I mean communicating our intention to drop

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-07-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384343#comment-15384343 ] Nicholas Chammas commented on SPARK-12661: -- To clarify what I mean by drop vs. deprecate,

[jira] [Comment Edited] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-07-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384343#comment-15384343 ] Nicholas Chammas edited comment on SPARK-12661 at 7/19/16 3:23 PM: --- To

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-07-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384339#comment-15384339 ] Nicholas Chammas commented on SPARK-12661: -- Just double-checking on something: Is it OK to drop

[jira] [Closed] (SPARK-16427) Expand documentation on the various RDD storage levels

2016-07-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas closed SPARK-16427. Resolution: Invalid > Expand documentation on the various RDD storage levels >

[jira] [Commented] (SPARK-16427) Expand documentation on the various RDD storage levels

2016-07-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371321#comment-15371321 ] Nicholas Chammas commented on SPARK-16427: -- Oh nevermind, this information is all available

[jira] [Commented] (SPARK-16427) Expand documentation on the various RDD storage levels

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366711#comment-15366711 ] Nicholas Chammas commented on SPARK-16427: -- My first question about this would be, how many

[jira] [Updated] (SPARK-3181) Add Robust Regression Algorithm with Huber Estimator

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-3181: Component/s: (was: MLilb) MLlib > Add Robust Regression Algorithm with

[jira] [Updated] (SPARK-16156) RowMatrıx Covariance

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16156: - Component/s: (was: MLilb) MLlib > RowMatrıx Covariance >

[jira] [Updated] (SPARK-16074) Expose VectorUDT/MatrixUDT in a public API

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16074: - Component/s: (was: MLilb) MLlib > Expose VectorUDT/MatrixUDT in a

[jira] [Updated] (SPARK-16377) Spark MLlib: MultilayerPerceptronClassifier - error while training

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16377: - Component/s: (was: MLilb) MLlib > Spark MLlib:

[jira] [Updated] (SPARK-16290) text type features column for classification

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16290: - Component/s: (was: MLilb) MLlib > text type features column for

[jira] [Updated] (SPARK-16232) Getting error by making columns using DataFrame

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16232: - Component/s: (was: MLilb) MLlib > Getting error by making columns

[jira] [Created] (SPARK-16427) Expand documentation on the various RDD storage levels

2016-07-07 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-16427: Summary: Expand documentation on the various RDD storage levels Key: SPARK-16427 URL: https://issues.apache.org/jira/browse/SPARK-16427 Project: Spark

[jira] [Commented] (SPARK-15760) Documentation missing for package-related config options

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366702#comment-15366702 ] Nicholas Chammas commented on SPARK-15760: -- Updating component since it seems we are using

[jira] [Updated] (SPARK-15441) dataset outer join seems to return incorrect result

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-15441: - Component/s: (was: sq;) SQL > dataset outer join seems to return

[jira] [Updated] (SPARK-15772) Improve Scala API docs

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-15772: - Component/s: (was: docs) > Improve Scala API docs

[jira] [Updated] (SPARK-15760) Documentation missing for package-related config options

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-15760: - Component/s: (was: docs) Documentation > Documentation missing for

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2016-06-30 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357535#comment-15357535 ] Nicholas Chammas commented on SPARK-9999: - {quote} Python itself has no compile time type safety.

[jira] [Commented] (SPARK-11744) bin/pyspark --version doesn't return version and exit

2016-06-23 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15347297#comment-15347297 ] Nicholas Chammas commented on SPARK-11744: -- This is not the appropriate place to ask random

[jira] [Commented] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2016-05-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292599#comment-15292599 ] Nicholas Chammas commented on SPARK-3821: - You can deploy Spark today on Docker just fine. It's

[jira] [Commented] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2016-05-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292198#comment-15292198 ] Nicholas Chammas commented on SPARK-3821: - Not sure if there is renewed interest, but at this

[jira] [Commented] (SPARK-15072) Remove SparkSession.withHiveSupport

2016-05-16 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285112#comment-15285112 ] Nicholas Chammas commented on SPARK-15072: -- Brief note from [~yhuai] on the motivation behind

[jira] [Commented] (SPARK-10899) Support JDBC pushdown for additional commands

2016-05-12 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282239#comment-15282239 ] Nicholas Chammas commented on SPARK-10899: -- Is {{COUNT}} also something that can be pushed down?

[jira] [Comment Edited] (SPARK-7506) pyspark.sql.types.StructType.fromJson() is incorrectly named

2016-05-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280606#comment-15280606 ] Nicholas Chammas edited comment on SPARK-7506 at 5/11/16 6:51 PM: --

[jira] [Commented] (SPARK-7506) pyspark.sql.types.StructType.fromJson() is incorrectly named

2016-05-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280606#comment-15280606 ] Nicholas Chammas commented on SPARK-7506: - [~davies] - Would you be interested in a PR that adds

[jira] [Updated] (SPARK-15256) Clarify the docstring for DataFrameReader.jdbc()

2016-05-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-15256: - Description: The doc for the {{properties}} parameter [currently

[jira] [Updated] (SPARK-15256) Clarify the docstring for DataFrameReader.jdbc()

2016-05-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-15256: - Summary: Clarify the docstring for DataFrameReader.jdbc() (was: Correct the docstring

[jira] [Created] (SPARK-15256) Correct the docstring for DataFrameReader.jdbc()

2016-05-10 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-15256: Summary: Correct the docstring for DataFrameReader.jdbc() Key: SPARK-15256 URL: https://issues.apache.org/jira/browse/SPARK-15256 Project: Spark

[jira] [Comment Edited] (SPARK-15193) samplingRatio should default to 1.0 across the board

2016-05-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15278393#comment-15278393 ] Nicholas Chammas edited comment on SPARK-15193 at 5/10/16 4:27 PM: ---

[jira] [Commented] (SPARK-15193) samplingRatio should default to 1.0 across the board

2016-05-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15278393#comment-15278393 ] Nicholas Chammas commented on SPARK-15193: -- Nope, a sampling ratio of 1.0 and None mean

[jira] [Created] (SPARK-15238) Clarify Python 3 support in docs

2016-05-09 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-15238: Summary: Clarify Python 3 support in docs Key: SPARK-15238 URL: https://issues.apache.org/jira/browse/SPARK-15238 Project: Spark Issue Type:

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-05-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277450#comment-15277450 ] Nicholas Chammas commented on SPARK-12661: -- [~davies] / [~joshrosen] - Has this been settled on?

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-05-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277448#comment-15277448 ] Nicholas Chammas commented on SPARK-12661: -- [~shivaram] - Can you confirm that spark-ec2 will

[jira] [Commented] (SPARK-15204) Nullable is not correct for Aggregator

2016-05-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275394#comment-15275394 ] Nicholas Chammas commented on SPARK-15204: -- Loosely related: SPARK-15191 > Nullable is not

[jira] [Updated] (SPARK-15191) createDataFrame() should mark fields that are known not to be null as not nullable

2016-05-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-15191: - Affects Version/s: 1.6.1 > createDataFrame() should mark fields that are known not to be

[jira] [Commented] (SPARK-15191) createDataFrame() should mark fields that are known not to be null as not nullable

2016-05-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15274785#comment-15274785 ] Nicholas Chammas commented on SPARK-15191: -- [~yhuai] - This loosely relates to the discussion in

[jira] [Commented] (SPARK-15193) samplingRatio should default to 1.0 across the board

2016-05-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15274783#comment-15274783 ] Nicholas Chammas commented on SPARK-15193: -- [~yhuai] - What do you think of this proposed

[jira] [Created] (SPARK-15193) samplingRatio should default to 1.0 across the board

2016-05-06 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-15193: Summary: samplingRatio should default to 1.0 across the board Key: SPARK-15193 URL: https://issues.apache.org/jira/browse/SPARK-15193 Project: Spark

[jira] [Created] (SPARK-15191) createDataFrame() should mark fields that are known not to be null as not nullable

2016-05-06 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-15191: Summary: createDataFrame() should mark fields that are known not to be null as not nullable Key: SPARK-15191 URL: https://issues.apache.org/jira/browse/SPARK-15191

[jira] [Commented] (SPARK-13740) add null check for _verify_type in types.py

2016-05-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15274738#comment-15274738 ] Nicholas Chammas commented on SPARK-13740: -- I noticed the PR only modifies PySpark. Are similar

[jira] [Commented] (SPARK-11319) PySpark silently accepts null values in non-nullable DataFrame fields.

2016-05-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15274728#comment-15274728 ] Nicholas Chammas commented on SPARK-11319: -- [~marmbrus] / [~yhuai] - Does SPARK-13740 resolve

[jira] [Commented] (SPARK-14932) Allow DataFrame.replace() to replace values with None

2016-04-26 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259266#comment-15259266 ] Nicholas Chammas commented on SPARK-14932: -- [~marmbrus] - Not sure if you're a good person to

[jira] [Created] (SPARK-14932) Allow DataFrame.replace() to replace values with None

2016-04-26 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-14932: Summary: Allow DataFrame.replace() to replace values with None Key: SPARK-14932 URL: https://issues.apache.org/jira/browse/SPARK-14932 Project: Spark

[jira] [Created] (SPARK-14742) Redirect spark-ec2 doc to new location

2016-04-19 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-14742: Summary: Redirect spark-ec2 doc to new location Key: SPARK-14742 URL: https://issues.apache.org/jira/browse/SPARK-14742 Project: Spark Issue Type:

[jira] [Commented] (SPARK-8327) Ganglia failed to start while starting standalone on EC 2 spark with spark-ec2

2016-04-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249144#comment-15249144 ] Nicholas Chammas commented on SPARK-8327: - [~vvladymyrov] - Is this still an issue? If so, I

[jira] [Commented] (SPARK-6527) sc.binaryFiles can not access files on s3

2016-04-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249141#comment-15249141 ] Nicholas Chammas commented on SPARK-6527: - Did the s3a suggestion work? If not, did anybody file

[jira] [Comment Edited] (SPARK-6527) sc.binaryFiles can not access files on s3

2016-04-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249141#comment-15249141 ] Nicholas Chammas edited comment on SPARK-6527 at 4/20/16 2:27 AM: -- Did

[jira] [Closed] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2016-04-05 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas closed SPARK-3821. --- Resolution: Won't Fix I'm resolving this as "Won't Fix" due to lack of interest, both on my

[jira] [Commented] (SPARK-3533) Add saveAsTextFileByKey() method to RDDs

2016-03-28 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214577#comment-15214577 ] Nicholas Chammas commented on SPARK-3533: - I've added 2 workarounds to this issue to the

[jira] [Updated] (SPARK-3533) Add saveAsTextFileByKey() method to RDDs

2016-03-28 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-3533: Description: Users often have a single RDD of key-value pairs that they want to save to

[jira] [Commented] (SPARK-7481) Add Hadoop 2.6+ profile to pull in object store FS accessors

2016-03-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197451#comment-15197451 ] Nicholas Chammas commented on SPARK-7481: - (Sorry Steve; can't comment on your proposal since I

[jira] [Commented] (SPARK-7505) Update PySpark DataFrame docs: encourage __getitem__, mark as experimental, etc.

2016-03-05 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181776#comment-15181776 ] Nicholas Chammas commented on SPARK-7505: - I believe items 1, 3, and 4 still apply. They're minor

[jira] [Commented] (SPARK-13596) Move misc top-level build files into appropriate subdirs

2016-03-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15180072#comment-15180072 ] Nicholas Chammas commented on SPARK-13596: -- Looks like {{tox.ini}} is only used by {{pep8}}, so

[jira] [Commented] (SPARK-7481) Add Hadoop 2.6+ profile to pull in object store FS accessors

2016-03-02 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176559#comment-15176559 ] Nicholas Chammas commented on SPARK-7481: - I'm not comfortable working with Maven so I can't

[jira] [Commented] (SPARK-7481) Add Hadoop 2.6+ profile to pull in object store FS accessors

2016-03-02 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176551#comment-15176551 ] Nicholas Chammas commented on SPARK-7481: - {quote} One issue here that hadoop 2.6's hadoop-aws

[jira] [Commented] (SPARK-7481) Add Hadoop 2.6+ profile to pull in object store FS accessors

2016-03-01 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174438#comment-15174438 ] Nicholas Chammas commented on SPARK-7481: - Many people seem to be downgrading to use Spark built

[jira] [Commented] (SPARK-5189) Reorganize EC2 scripts so that nodes can be provisioned independent of Spark master

2016-01-27 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15119220#comment-15119220 ] Nicholas Chammas commented on SPARK-5189: - FWIW, I found this issue to be practically unsolvable

[jira] [Commented] (SPARK-12824) Failure to maintain consistent RDD references in pyspark

2016-01-14 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098887#comment-15098887 ] Nicholas Chammas commented on SPARK-12824: -- Ah, good catch. This appears to be a known behavior

[jira] [Commented] (SPARK-12824) Failure to maintain consistent RDD references in pyspark

2016-01-14 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098336#comment-15098336 ] Nicholas Chammas commented on SPARK-12824: -- I can reproduce this issue. Here's a more concise

[jira] [Comment Edited] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2015-12-18 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14203280#comment-14203280 ] Nicholas Chammas edited comment on SPARK-3821 at 12/18/15 9:08 PM: ---
