[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636508#comment-15636508 ] Nicholas Chammas commented on SPARK-18128: -- [~prabinb] - See [this

[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636497#comment-15636497 ] Nicholas Chammas commented on SPARK-18128: -- For the record: A PyPI admin is looking into the

[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636500#comment-15636500 ] Nicholas Chammas commented on SPARK-18128: -- [~holdenk] - Shall we make this issue a subtask of

[jira] [Commented] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634455#comment-15634455 ] Nicholas Chammas commented on SPARK-18254: -- So it was specifically some broken

[jira] [Comment Edited] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634391#comment-15634391 ] Nicholas Chammas edited comment on SPARK-18254 at 11/3/16 9:58 PM: --- If

[jira] [Commented] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634428#comment-15634428 ] Nicholas Chammas commented on SPARK-18254: -- Just tried it. Seems like the fix is only available

[jira] [Commented] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634391#comment-15634391 ] Nicholas Chammas commented on SPARK-18254: -- If I try branch-2.1 on

[jira] [Comment Edited] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634391#comment-15634391 ] Nicholas Chammas edited comment on SPARK-18254 at 11/3/16 9:46 PM: --- If

[jira] [Commented] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15633744#comment-15633744 ] Nicholas Chammas commented on SPARK-18254: -- Interestingly, if I add {{names_cleaned.persist()}}

[jira] [Comment Edited] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15633446#comment-15633446 ] Nicholas Chammas edited comment on SPARK-18254 at 11/3/16 4:57 PM: ---

[jira] [Commented] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15633446#comment-15633446 ] Nicholas Chammas commented on SPARK-18254: -- Yes, if I don't alias the columns and/or update

[jira] [Updated] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-18254: - Description: Dunno if I'm misinterpreting something here, but this seems like a bug in

[jira] [Commented] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15633424#comment-15633424 ] Nicholas Chammas commented on SPARK-18254: -- Yep, it works fine if the column names haven't been

[jira] [Updated] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-18254: - Description: Dunno if I'm misinterpreting something here, but this seems like a bug in

[jira] [Updated] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-18254: - Description: Dunno if I'm misinterpreting something here, but this seems like a bug in

[jira] [Updated] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-18254: - Description: Dunno if I'm misinterpreting something here, but this seems like a bug in

[jira] [Updated] (SPARK-18254) UDFs don't see aliased column names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-18254: - Summary: UDFs don't see aliased column names (was: UDFs don't see aliased column names;

[jira] [Commented] (SPARK-18254) UDFs don't see aliased column names; somehow they get the original names

2016-11-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15633220#comment-15633220 ] Nicholas Chammas commented on SPARK-18254: -- [~marmbrus] / [~hvanhovell]: Is there a workaround

[jira] [Created] (SPARK-18254) UDFs don't see aliased column names; somehow they get the original names

2016-11-03 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-18254: Summary: UDFs don't see aliased column names; somehow they get the original names Key: SPARK-18254 URL: https://issues.apache.org/jira/browse/SPARK-18254

[jira] [Commented] (SPARK-16726) Improve `Union/Intersect/Except` error messages on incompatible types

2016-11-02 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15630597#comment-15630597 ] Nicholas Chammas commented on SPARK-16726: -- I just hit this error in 2.0.1 and it was this JIRA

[jira] [Commented] (SPARK-14900) spark.ml classification metrics should include accuracy

2016-10-29 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15618637#comment-15618637 ] Nicholas Chammas commented on SPARK-14900: -- I don't know if this belongs in a separate issue, or

[jira] [Commented] (SPARK-18084) write.partitionBy() does not recognize nested columns that select() can access

2016-10-25 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606484#comment-15606484 ] Nicholas Chammas commented on SPARK-18084: -- cc [~marmbrus] - Dunno if this is actually bug or

[jira] [Updated] (SPARK-18084) write.partitionBy() does not recognize nested columns that select() can access

2016-10-24 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-18084: - Issue Type: Bug (was: Improvement) > write.partitionBy() does not recognize nested

[jira] [Created] (SPARK-18084) write.partitionBy() does not recognize nested columns that select() can access

2016-10-24 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-18084: Summary: write.partitionBy() does not recognize nested columns that select() can access Key: SPARK-18084 URL: https://issues.apache.org/jira/browse/SPARK-18084

[jira] [Commented] (SPARK-12757) Use reference counting to prevent blocks from being evicted during reads

2016-10-24 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603211#comment-15603211 ] Nicholas Chammas commented on SPARK-12757: -- Just to link back, [~josephkb] is reporting that

[jira] [Closed] (SPARK-17976) Global options to spark-submit should not be position-sensitive

2016-10-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas closed SPARK-17976. Resolution: Not A Problem Ah, makes perfect sense. Would have realized that myself if I

[jira] [Created] (SPARK-17976) Global options to spark-submit should not be position-sensitive

2016-10-17 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-17976: Summary: Global options to spark-submit should not be position-sensitive Key: SPARK-17976 URL: https://issues.apache.org/jira/browse/SPARK-17976 Project:

[jira] [Comment Edited] (SPARK-14742) Redirect spark-ec2 doc to new location

2016-08-31 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15452150#comment-15452150 ] Nicholas Chammas edited comment on SPARK-14742 at 8/31/16 12:50 PM:

[jira] [Commented] (SPARK-14742) Redirect spark-ec2 doc to new location

2016-08-31 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15452150#comment-15452150 ] Nicholas Chammas commented on SPARK-14742: -- Sounds good to me. > Redirect spark-ec2 doc to new

[jira] [Commented] (SPARK-14742) Redirect spark-ec2 doc to new location

2016-08-30 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450626#comment-15450626 ] Nicholas Chammas commented on SPARK-14742: -- {quote} Otherwise the only way to get to this link

[jira] [Commented] (SPARK-14742) Redirect spark-ec2 doc to new location

2016-08-30 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450602#comment-15450602 ] Nicholas Chammas commented on SPARK-14742: -- http://spark.apache.org/docs/latest/ec2-scripts.html

[jira] [Updated] (SPARK-17220) Upgrade Py4J to 0.10.3

2016-08-26 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-17220: - Component/s: PySpark > Upgrade Py4J to 0.10.3 > -- > >

[jira] [Commented] (SPARK-14241) Output of monotonically_increasing_id lacks stable relation with rows of DataFrame

2016-08-25 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437616#comment-15437616 ] Nicholas Chammas commented on SPARK-14241: -- [~marmbrus] - Would it be tough to make this

[jira] [Commented] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428988#comment-15428988 ] Nicholas Chammas commented on SPARK-17025: -- {quote} We'd need to figure out a good design for

[jira] [Comment Edited] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417788#comment-15417788 ] Nicholas Chammas edited comment on SPARK-17025 at 8/11/16 7:33 PM: --- cc

[jira] [Comment Edited] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417788#comment-15417788 ] Nicholas Chammas edited comment on SPARK-17025 at 8/11/16 7:27 PM: --- cc

[jira] [Commented] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417788#comment-15417788 ] Nicholas Chammas commented on SPARK-17025: -- cc [~josephkb] [~mengxr] > Cannot persist PySpark

[jira] [Created] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-11 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-17025: Summary: Cannot persist PySpark ML Pipeline model that includes custom Transformer Key: SPARK-17025 URL: https://issues.apache.org/jira/browse/SPARK-17025

[jira] [Commented] (SPARK-16921) RDD/DataFrame persist() and cache() should return Python context managers

2016-08-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414067#comment-15414067 ] Nicholas Chammas commented on SPARK-16921: -- [~holdenk] - Probably won't be able to do it myself

[jira] [Created] (SPARK-16921) RDD/DataFrame persist() and cache() should return Python context managers

2016-08-05 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-16921: Summary: RDD/DataFrame persist() and cache() should return Python context managers Key: SPARK-16921 URL: https://issues.apache.org/jira/browse/SPARK-16921

[jira] [Closed] (SPARK-7505) Update PySpark DataFrame docs: encourage __getitem__, mark as experimental, etc.

2016-08-05 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas closed SPARK-7505. --- Resolution: Invalid Closing this as invalid as I believe these issues are no longer

[jira] [Commented] (SPARK-5312) Use sbt to detect new or changed public classes in PRs

2016-08-05 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409767#comment-15409767 ] Nicholas Chammas commented on SPARK-5312: - [~boyork] - Shall we close this? It doesn't look like

[jira] [Comment Edited] (SPARK-7146) Should ML sharedParams be a public API?

2016-08-02 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405300#comment-15405300 ] Nicholas Chammas edited comment on SPARK-7146 at 8/3/16 4:45 AM: - A quick

[jira] [Commented] (SPARK-7146) Should ML sharedParams be a public API?

2016-08-02 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405300#comment-15405300 ] Nicholas Chammas commented on SPARK-7146: - A quick update from a PySpark user: I am using

[jira] [Commented] (SPARK-16782) Use Sphinx autodoc to eliminate duplication of Python docstrings

2016-08-01 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402515#comment-15402515 ] Nicholas Chammas commented on SPARK-16782: -- Poking around a bit more, it seems like a possible

[jira] [Commented] (SPARK-16782) Use Sphinx autodoc to eliminate duplication of Python docstrings

2016-08-01 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402477#comment-15402477 ] Nicholas Chammas commented on SPARK-16782: -- Hmm never mind. I think I've misunderstood the

[jira] [Closed] (SPARK-16782) Use Sphinx autodoc to eliminate duplication of Python docstrings

2016-08-01 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas closed SPARK-16782. Resolution: Invalid > Use Sphinx autodoc to eliminate duplication of Python docstrings >

[jira] [Commented] (SPARK-12157) Support numpy types as return values of Python UDFs

2016-07-31 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401198#comment-15401198 ] Nicholas Chammas commented on SPARK-12157: -- OK. I've raised the issue of documenting this in

[jira] [Commented] (SPARK-16824) Add API docs for VectorUDT

2016-07-31 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401197#comment-15401197 ] Nicholas Chammas commented on SPARK-16824: -- cc [~josephkb] [~mengxr] - Should this type be

[jira] [Created] (SPARK-16824) Add API docs for VectorUDT

2016-07-31 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-16824: Summary: Add API docs for VectorUDT Key: SPARK-16824 URL: https://issues.apache.org/jira/browse/SPARK-16824 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-12157) Support numpy types as return values of Python UDFs

2016-07-31 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401179#comment-15401179 ] Nicholas Chammas commented on SPARK-12157: -- Thanks for the pointer, Maciej. It appears that

[jira] [Commented] (SPARK-12157) Support numpy types as return values of Python UDFs

2016-07-29 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399743#comment-15399743 ] Nicholas Chammas commented on SPARK-12157: -- It appears that it's not possible to have a UDF that

[jira] [Commented] (SPARK-12157) Support numpy types as return values of Python UDFs

2016-07-28 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398267#comment-15398267 ] Nicholas Chammas commented on SPARK-12157: -- I'm looking to define a UDF in PySpark that returns

[jira] [Commented] (SPARK-16782) Use Sphinx autodoc to eliminate duplication of Python docstrings

2016-07-28 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398059#comment-15398059 ] Nicholas Chammas commented on SPARK-16782: -- [~davies] [~joshrosen] - I can take this on if the

[jira] [Created] (SPARK-16782) Use Sphinx autodoc to eliminate duplication of Python docstrings

2016-07-28 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-16782: Summary: Use Sphinx autodoc to eliminate duplication of Python docstrings Key: SPARK-16782 URL: https://issues.apache.org/jira/browse/SPARK-16782 Project:

[jira] [Updated] (SPARK-16772) Correct API doc references to PySpark classes + formatting fixes

2016-07-28 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16772: - Summary: Correct API doc references to PySpark classes + formatting fixes (was: Correct

[jira] [Updated] (SPARK-16772) Correct API doc references to PySpark classes

2016-07-28 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16772: - Summary: Correct API doc references to PySpark classes (was: Correct API doc references

[jira] [Created] (SPARK-16772) Correct API doc references to DataType + other minor doc tweaks

2016-07-28 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-16772: Summary: Correct API doc references to DataType + other minor doc tweaks Key: SPARK-16772 URL: https://issues.apache.org/jira/browse/SPARK-16772 Project:

[jira] [Commented] (SPARK-7481) Add spark-cloud module to pull in aws+azure object store FS accessors; test integration

2016-07-22 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389589#comment-15389589 ] Nicholas Chammas commented on SPARK-7481: - [~ste...@apache.org] - Some relevant reading for you

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-07-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384535#comment-15384535 ] Nicholas Chammas commented on SPARK-12661: -- OK, sounds good to me. > Drop Python 2.6 support in

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-07-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384513#comment-15384513 ] Nicholas Chammas commented on SPARK-12661: -- Yes, I mean communicating our intention to drop

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-07-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384343#comment-15384343 ] Nicholas Chammas commented on SPARK-12661: -- To clarify what I mean by drop vs. deprecate,

[jira] [Comment Edited] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-07-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384343#comment-15384343 ] Nicholas Chammas edited comment on SPARK-12661 at 7/19/16 3:23 PM: --- To

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-07-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384339#comment-15384339 ] Nicholas Chammas commented on SPARK-12661: -- Just double-checking on something: Is it OK to drop

[jira] [Closed] (SPARK-16427) Expand documentation on the various RDD storage levels

2016-07-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas closed SPARK-16427. Resolution: Invalid > Expand documentation on the various RDD storage levels >

[jira] [Commented] (SPARK-16427) Expand documentation on the various RDD storage levels

2016-07-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371321#comment-15371321 ] Nicholas Chammas commented on SPARK-16427: -- Oh nevermind, this information is all available

[jira] [Commented] (SPARK-16427) Expand documentation on the various RDD storage levels

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366711#comment-15366711 ] Nicholas Chammas commented on SPARK-16427: -- My first question about this would be, how many

[jira] [Updated] (SPARK-3181) Add Robust Regression Algorithm with Huber Estimator

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-3181: Component/s: (was: MLilb) MLlib > Add Robust Regression Algorithm with

[jira] [Updated] (SPARK-16156) RowMatrıx Covariance

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16156: - Component/s: (was: MLilb) MLlib > RowMatrıx Covariance >

[jira] [Updated] (SPARK-16074) Expose VectorUDT/MatrixUDT in a public API

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16074: - Component/s: (was: MLilb) MLlib > Expose VectorUDT/MatrixUDT in a

[jira] [Updated] (SPARK-16377) Spark MLlib: MultilayerPerceptronClassifier - error while training

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16377: - Component/s: (was: MLilb) MLlib > Spark MLlib:

[jira] [Updated] (SPARK-16290) text type features column for classification

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16290: - Component/s: (was: MLilb) MLlib > text type features column for

[jira] [Updated] (SPARK-16232) Getting error by making columns using DataFrame

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16232: - Component/s: (was: MLilb) MLlib > Getting error by making columns

[jira] [Created] (SPARK-16427) Expand documentation on the various RDD storage levels

2016-07-07 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-16427: Summary: Expand documentation on the various RDD storage levels Key: SPARK-16427 URL: https://issues.apache.org/jira/browse/SPARK-16427 Project: Spark

[jira] [Commented] (SPARK-15760) Documentation missing for package-related config options

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366702#comment-15366702 ] Nicholas Chammas commented on SPARK-15760: -- Updating component since it seems we are using

[jira] [Updated] (SPARK-15441) dataset outer join seems to return incorrect result

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-15441: - Component/s: (was: sq;) SQL > dataset outer join seems to return

[jira] [Updated] (SPARK-15772) Improve Scala API docs

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-15772: - Component/s: (was: docs) > Improve Scala API docs > --- > >

[jira] [Updated] (SPARK-15760) Documentation missing for package-related config options

2016-07-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-15760: - Component/s: (was: docs) Documentation > Documentation missing for

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2016-06-30 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357535#comment-15357535 ] Nicholas Chammas commented on SPARK-9999: - {quote} Python itself has no compile time type safety.

[jira] [Commented] (SPARK-11744) bin/pyspark --version doesn't return version and exit

2016-06-23 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15347297#comment-15347297 ] Nicholas Chammas commented on SPARK-11744: -- This is not the appropriate place to ask random

[jira] [Commented] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2016-05-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292599#comment-15292599 ] Nicholas Chammas commented on SPARK-3821: - You can deploy Spark today on Docker just fine. It's

[jira] [Commented] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2016-05-19 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292198#comment-15292198 ] Nicholas Chammas commented on SPARK-3821: - Not sure if there is renewed interest, but at this

[jira] [Commented] (SPARK-15072) Remove SparkSession.withHiveSupport

2016-05-16 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285112#comment-15285112 ] Nicholas Chammas commented on SPARK-15072: -- Brief note from [~yhuai] on the motivation behind

[jira] [Commented] (SPARK-10899) Support JDBC pushdown for additional commands

2016-05-12 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282239#comment-15282239 ] Nicholas Chammas commented on SPARK-10899: -- Is {{COUNT}} also something that can be pushed down?

[jira] [Comment Edited] (SPARK-7506) pyspark.sql.types.StructType.fromJson() is incorrectly named

2016-05-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280606#comment-15280606 ] Nicholas Chammas edited comment on SPARK-7506 at 5/11/16 6:51 PM: --

[jira] [Commented] (SPARK-7506) pyspark.sql.types.StructType.fromJson() is incorrectly named

2016-05-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280606#comment-15280606 ] Nicholas Chammas commented on SPARK-7506: - [~davies] - Would you be interested in a PR that adds

[jira] [Updated] (SPARK-15256) Clarify the docstring for DataFrameReader.jdbc()

2016-05-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-15256: - Description: The doc for the {{properties}} parameter [currently

[jira] [Updated] (SPARK-15256) Clarify the docstring for DataFrameReader.jdbc()

2016-05-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-15256: - Summary: Clarify the docstring for DataFrameReader.jdbc() (was: Correct the docstring

[jira] [Created] (SPARK-15256) Correct the docstring for DataFrameReader.jdbc()

2016-05-10 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-15256: Summary: Correct the docstring for DataFrameReader.jdbc() Key: SPARK-15256 URL: https://issues.apache.org/jira/browse/SPARK-15256 Project: Spark

[jira] [Comment Edited] (SPARK-15193) samplingRatio should default to 1.0 across the board

2016-05-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15278393#comment-15278393 ] Nicholas Chammas edited comment on SPARK-15193 at 5/10/16 4:27 PM: ---

[jira] [Commented] (SPARK-15193) samplingRatio should default to 1.0 across the board

2016-05-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15278393#comment-15278393 ] Nicholas Chammas commented on SPARK-15193: -- Nope, a sampling ratio of 1.0 and None mean

[jira] [Created] (SPARK-15238) Clarify Python 3 support in docs

2016-05-09 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-15238: Summary: Clarify Python 3 support in docs Key: SPARK-15238 URL: https://issues.apache.org/jira/browse/SPARK-15238 Project: Spark Issue Type:

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-05-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277450#comment-15277450 ] Nicholas Chammas commented on SPARK-12661: -- [~davies] / [~joshrosen] - Has this been settled on?

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2016-05-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277448#comment-15277448 ] Nicholas Chammas commented on SPARK-12661: -- [~shivaram] - Can you confirm that spark-ec2 will

[jira] [Commented] (SPARK-15204) Nullable is not correct for Aggregator

2016-05-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275394#comment-15275394 ] Nicholas Chammas commented on SPARK-15204: -- Loosely related: SPARK-15191 > Nullable is not

[jira] [Updated] (SPARK-15191) createDataFrame() should mark fields that are known not to be null as not nullable

2016-05-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-15191: - Affects Version/s: 1.6.1 > createDataFrame() should mark fields that are known not to be

[jira] [Commented] (SPARK-15191) createDataFrame() should mark fields that are known not to be null as not nullable

2016-05-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15274785#comment-15274785 ] Nicholas Chammas commented on SPARK-15191: -- [~yhuai] - This loosely relates to the discussion in

[jira] [Commented] (SPARK-15193) samplingRatio should default to 1.0 across the board

2016-05-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15274783#comment-15274783 ] Nicholas Chammas commented on SPARK-15193: -- [~yhuai] - What do you think of this proposed

[jira] [Created] (SPARK-15193) samplingRatio should default to 1.0 across the board

2016-05-06 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-15193: Summary: samplingRatio should default to 1.0 across the board Key: SPARK-15193 URL: https://issues.apache.org/jira/browse/SPARK-15193 Project: Spark

[jira] [Created] (SPARK-15191) createDataFrame() should mark fields that are known not to be null as not nullable

2016-05-06 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-15191: Summary: createDataFrame() should mark fields that are known not to be null as not nullable Key: SPARK-15191 URL: https://issues.apache.org/jira/browse/SPARK-15191
