[jira] [Commented] (SPARK-6764) Add wheel package support for PySpark
[ https://issues.apache.org/jira/browse/SPARK-6764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603785#comment-14603785 ]

Punya Biswal commented on SPARK-6764:
-

Some packages need to be installed on workers; it's not enough just to put archived versions on the PYTHONPATH. Is there a reason to avoid using pip on the workers?

> Add wheel package support for PySpark
> -
>
> Key: SPARK-6764
> URL: https://issues.apache.org/jira/browse/SPARK-6764
> Project: Spark
> Issue Type: Improvement
> Components: Deploy, PySpark
> Reporter: Takao Magoori
> Priority: Minor
> Labels: newbie
>
> We can do _spark-submit_ with one or more Python packages (.egg, .zip and
> .jar) via the *--py-files* option.
> h4. zip packaging
> Spark puts a zip file in its working directory and adds the absolute path to
> Python's sys.path. When the user program imports it,
> [zipimport|https://docs.python.org/2.7/library/zipimport.html] is
> automatically invoked under the hood. That is, data files and dynamic
> modules (.pyd, .so) cannot be used, since zipimport supports only .py, .pyc
> and .pyo.
> h4. egg packaging
> Spark puts an egg file in its working directory and adds the absolute path to
> Python's sys.path. Unlike zipimport, egg can handle data files and dynamic
> modules as long as the author of the package uses the [pkg_resources
> API|https://pythonhosted.org/setuptools/formats.html#other-technical-considerations]
> properly. But many Python modules do not use the pkg_resources API, which
> causes "ImportError" or "No such file" errors. Moreover, creating eggs of
> dependencies and their further dependencies is a troublesome job.
> h4. wheel packaging
> Supporting the new standard Python package format
> "[wheel|https://wheel.readthedocs.org/en/latest/]" would be nice. With wheel,
> we can do spark-submit with complex dependencies simply as follows.
> 1. Write a requirements.txt file.
> {noformat}
> SQLAlchemy
> MySQL-python
> requests
> simplejson>=3.6.0,<=3.6.5
> pydoop
> {noformat}
> 2. Do wheel packaging with a single command. All dependencies are wheel-ed.
> {noformat}
> $ your_pip_dir/pip wheel --wheel-dir /tmp/wheelhouse --requirement requirements.txt
> {noformat}
> 3. Do spark-submit.
> {noformat}
> your_spark_home/bin/spark-submit --master local[4] --py-files $(find
> /tmp/wheelhouse/ -name "*.whl" -print0 | sed -e 's/\x0/,/g') your_driver.py
> {noformat}
> If your pyspark driver is a package consisting of many modules:
> 1. Write a setup.py for your pyspark driver package.
> {noformat}
> from setuptools import (
>     find_packages,
>     setup,
> )
> setup(
>     name='yourpkg',
>     version='0.0.1',
>     packages=find_packages(),
>     install_requires=[
>         'SQLAlchemy',
>         'MySQL-python',
>         'requests',
>         'simplejson>=3.6.0,<=3.6.5',
>         'pydoop',
>     ],
> )
> {noformat}
> 2. Do wheel packaging with a single command. Your driver package and all
> dependencies are wheel-ed.
> {noformat}
> your_pip_dir/pip wheel --wheel-dir /tmp/wheelhouse your_driver_package/.
> {noformat}
> 3. Do spark-submit.
> {noformat}
> your_spark_home/bin/spark-submit --master local[4] --py-files $(find
> /tmp/wheelhouse/ -name "*.whl" -print0 | sed -e 's/\x0/,/g')
> your_driver_bootstrap.py
> {noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
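The comma-joined {{--py-files}} value that the find/sed pipeline above produces can also be assembled in Python. This is an illustrative sketch, not part of Spark; the helper name and the wheelhouse path are assumptions:

```python
# Sketch: build the comma-separated --py-files argument for spark-submit
# from a wheelhouse directory, equivalent to the find/sed pipeline above.
import glob
import os


def py_files_arg(wheelhouse):
    """Join every .whl file under `wheelhouse` into one comma-separated
    string, suitable for passing to spark-submit's --py-files option."""
    wheels = sorted(glob.glob(os.path.join(wheelhouse, "*.whl")))
    return ",".join(wheels)
```

One could then call, for example, `py_files_arg("/tmp/wheelhouse")` inside a launcher script and interpolate the result into the spark-submit command line; unlike the sed variant, this avoids the trailing separator that `-print0 | sed` leaves behind.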
[jira] [Created] (SPARK-8397) Allow custom configuration for TestHive
Punya Biswal created SPARK-8397:
---

Summary: Allow custom configuration for TestHive
Key: SPARK-8397
URL: https://issues.apache.org/jira/browse/SPARK-8397
Project: Spark
Issue Type: Improvement
Affects Versions: 1.4.0
Reporter: Punya Biswal
Priority: Minor

We encourage people to use {{TestHive}} in unit tests, because it's impossible to create more than one {{HiveContext}} within one process. The current implementation locks people into using a {{local[2]}} {{SparkContext}} underlying their {{HiveContext}}. We should make it possible to override this using a system property, so that people can test against {{local-cluster}} or remote Spark clusters to make their tests more realistic.
[jira] [Updated] (SPARK-7515) Update documentation for PySpark on YARN with cluster mode
[ https://issues.apache.org/jira/browse/SPARK-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Punya Biswal updated SPARK-7515:
Fix Version/s: 1.4.1

> Update documentation for PySpark on YARN with cluster mode
> --
>
> Key: SPARK-7515
> URL: https://issues.apache.org/jira/browse/SPARK-7515
> Project: Spark
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 1.4.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Minor
> Fix For: 1.4.1, 1.5.0
>
> Now that PySpark on YARN with cluster mode is supported, let's update the documentation.
[jira] [Updated] (SPARK-7899) PySpark sql/tests breaks pylint validation
[ https://issues.apache.org/jira/browse/SPARK-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Punya Biswal updated SPARK-7899:
Description:

The pyspark.sql.types module is dynamically renamed to {{types}} from {{_types}}, which breaks pylint validation.

From [~justin.uang] below:

In commit 04e44b37, the migration to Python 3, {{pyspark/sql/types.py}} was renamed to {{pyspark/sql/\_types.py}}, and then some magic in {{pyspark/sql/\_\_init\_\_.py}} dynamically renamed the module back to {{types}}. I imagine that this is some naming conflict with Python 3, but what was the error that showed up?

The reason I'm asking is that this breaks pylint, since pylint can no longer statically find the module. I also tried importing the package in an init-hook so that {{\_\_init\_\_}} would be run, but that isn't what the discovery mechanism uses; I imagine it's probably just crawling the directory structure.

One way to work around this would be something akin to this (http://stackoverflow.com/questions/9602811/how-to-tell-pylint-to-ignore-certain-imports), where I would have to create a fake module, but I would probably be missing a ton of pylint features for users of that module, and it's pretty hacky.

> PySpark sql/tests breaks pylint validation
> --
>
> Key: SPARK-7899
> URL: https://issues.apache.org/jira/browse/SPARK-7899
> Project: Spark
> Issue Type: Bug
> Components: PySpark, Tests
> Affects Versions: 1.4.0
> Reporter: Michael Nazario
>
> The pyspark.sql.types module is dynamically renamed to {{types}} from
> {{_types}}, which breaks pylint validation.
> From [~justin.uang] below:
> In commit 04e44b37, the migration to Python 3, {{pyspark/sql/types.py}} was
> renamed to {{pyspark/sql/\_types.py}}, and then some magic in
> {{pyspark/sql/\_\_init\_\_.py}} dynamically renamed the module back to
> {{types}}. I imagine that this is some naming conflict with Python 3, but
> what was the error that showed up?
> The reason I'm asking is that this breaks pylint, since pylint can no longer
> statically find the module. I also tried importing the package in an
> init-hook so that {{\_\_init\_\_}} would be run, but that isn't what the
> discovery mechanism uses; I imagine it's probably just crawling the
> directory structure.
> One way to work around this would be something akin to this
> (http://stackoverflow.com/questions/9602811/how-to-tell-pylint-to-ignore-certain-imports),
> where I would have to create a fake module, but I would probably be missing
> a ton of pylint features for users of that module, and it's pretty hacky.
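The dynamic-rename pattern described above can be sketched in plain Python. The module names below are illustrative, not PySpark's actual layout: a module object registered in {{sys.modules}} under a public name is importable at runtime, but a static tool that crawls the filesystem never sees a corresponding `.py` file, which is exactly why pylint reports an import error here.

```python
# Minimal sketch of the dynamic-rename magic described above: code that
# lives under a private module name is re-exposed under a public name at
# import time by registering a module object in sys.modules directly.
import sys
import types as _stdlib_types


def install_renamed_module(private_name, public_name, attrs):
    """Create a module object, populate it with `attrs`, and register it
    in sys.modules under both the private and the public name."""
    mod = _stdlib_types.ModuleType(public_name)
    for key, value in attrs.items():
        setattr(mod, key, value)
    sys.modules[private_name] = mod
    sys.modules[public_name] = mod
    return mod
```

After calling, say, `install_renamed_module("demo._types", "demo.types", {...})`, a runtime lookup of `sys.modules["demo.types"]` succeeds even though no `demo/types.py` exists on disk, so any analyzer that resolves imports from the directory structure alone will miss it.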
[jira] [Commented] (SPARK-6907) Create an isolated classloader for the Hive Client.
[ https://issues.apache.org/jira/browse/SPARK-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522573#comment-14522573 ]

Punya Biswal commented on SPARK-6907:
-

Makes sense, thanks for clarifying. I guess a weaker version of my question is: can we write this in Java (rather than Scala) to set it up for future separation?

> Create an isolated classloader for the Hive Client.
> ---
>
> Key: SPARK-6907
> URL: https://issues.apache.org/jira/browse/SPARK-6907
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Michael Armbrust
> Assignee: Michael Armbrust
[jira] [Commented] (SPARK-6907) Create an isolated classloader for the Hive Client.
[ https://issues.apache.org/jira/browse/SPARK-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14515066#comment-14515066 ]

Punya Biswal commented on SPARK-6907:
-

Would it make sense to do this as a separate project (repository)? It seems like a generic problem that's applicable more broadly than just Spark.

> Create an isolated classloader for the Hive Client.
> ---
>
> Key: SPARK-6907
> URL: https://issues.apache.org/jira/browse/SPARK-6907
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Michael Armbrust
[jira] [Commented] (SPARK-7175) Upgrade Hive to 1.1.0
[ https://issues.apache.org/jira/browse/SPARK-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514861#comment-14514861 ]

Punya Biswal commented on SPARK-7175:
-

[~pwendell] and [~vanzin] explained to me that this is quite hard to do at present, and pointed me to SPARK-6906. I'm leaving this ticket open for now, to revisit once the necessary architectural improvements have been made.

> Upgrade Hive to 1.1.0
> -
>
> Key: SPARK-7175
> URL: https://issues.apache.org/jira/browse/SPARK-7175
> Project: Spark
> Issue Type: Dependency upgrade
> Components: SQL
> Affects Versions: 1.3.1
> Reporter: Punya Biswal
>
> Spark SQL currently supports Hive 0.13 (June 2014), but the latest version of
> Hive is 1.1.0 (March 2015). Among other improvements, it includes new UDFs
> for date manipulation that I'd like to avoid rebuilding.
[jira] [Created] (SPARK-7175) Upgrade Hive to 1.1.0
Punya Biswal created SPARK-7175:
---

Summary: Upgrade Hive to 1.1.0
Key: SPARK-7175
URL: https://issues.apache.org/jira/browse/SPARK-7175
Project: Spark
Issue Type: Dependency upgrade
Components: SQL
Affects Versions: 1.3.1
Reporter: Punya Biswal

Spark SQL currently supports Hive 0.13 (June 2014), but the latest version of Hive is 1.1.0 (March 2015). Among other improvements, it includes new UDFs for date manipulation that I'd like to avoid rebuilding.
[jira] [Updated] (SPARK-6996) DataFrame should support map types when creating DFs from JavaBeans.
[ https://issues.apache.org/jira/browse/SPARK-6996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Punya Biswal updated SPARK-6996:
Description: If we have a JavaBean class with fields of map types, SQL throws an exception in {{createDataFrame}} because those types are not matched in {{SQLContext#inferDataType}}. Similar to SPARK-6475.
was: If we have a JavaBean class with fields of collection or map types, SQL throws an exception in {{createDataFrame}} because those types are not matched in {{SQLContext#inferDataType}}. Similar to SPARK-6475.
Summary: DataFrame should support map types when creating DFs from JavaBeans. (was: DataFrame should support collection types when creating DFs from JavaBeans.)

> DataFrame should support map types when creating DFs from JavaBeans.
> -
>
> Key: SPARK-6996
> URL: https://issues.apache.org/jira/browse/SPARK-6996
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Punya Biswal
>
> If we have a JavaBean class with fields of map types, SQL throws an exception
> in {{createDataFrame}} because those types are not matched in
> {{SQLContext#inferDataType}}.
> Similar to SPARK-6475.
[jira] [Created] (SPARK-6996) DataFrame should support collection types when creating DFs from JavaBeans.
Punya Biswal created SPARK-6996:
---

Summary: DataFrame should support collection types when creating DFs from JavaBeans.
Key: SPARK-6996
URL: https://issues.apache.org/jira/browse/SPARK-6996
Project: Spark
Issue Type: Improvement
Components: SQL
Reporter: Punya Biswal

If we have a JavaBean class with fields of collection or map types, SQL throws an exception in {{createDataFrame}} because those types are not matched in {{SQLContext#inferDataType}}. Similar to SPARK-6475.
[jira] [Commented] (SPARK-6475) DataFrame should support array types when creating DFs from JavaBeans.
[ https://issues.apache.org/jira/browse/SPARK-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501609#comment-14501609 ]

Punya Biswal commented on SPARK-6475:
-

Would it be reasonable to recognize Java iterables and maps as well? I'd be happy to work on a PR if that seems like a good idea.

> DataFrame should support array types when creating DFs from JavaBeans.
> --
>
> Key: SPARK-6475
> URL: https://issues.apache.org/jira/browse/SPARK-6475
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
> Fix For: 1.4.0
>
> If we have a JavaBean class with array fields, SQL throws an exception in
> `createDataFrame` because arrays are not matched in `getSchema` from a
> JavaBean class.
[jira] [Commented] (SPARK-6952) spark-daemon.sh PID reuse check fails on long classpath
[ https://issues.apache.org/jira/browse/SPARK-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499695#comment-14499695 ]

Punya Biswal commented on SPARK-6952:
-

Would it be reasonable to backport this to branch-1.3, or is it too late for that?

> spark-daemon.sh PID reuse check fails on long classpath
> ---
>
> Key: SPARK-6952
> URL: https://issues.apache.org/jira/browse/SPARK-6952
> Project: Spark
> Issue Type: Bug
> Components: Deploy
> Affects Versions: 1.3.0
> Reporter: Punya Biswal
> Assignee: Punya Biswal
> Priority: Minor
> Fix For: 1.4.0
>
> {{sbin/spark-daemon.sh}} uses {{ps -p "$TARGET_PID" -o args=}} to figure out
> whether the process running with the expected PID is actually a Spark daemon.
> When running with a large classpath, the output of {{ps}} gets truncated and
> the check fails spuriously.
> I think we should weaken the check to see if it's a java command (which is
> something we do in other parts of the script) rather than looking for the
> specific main class name. This means that SPARK-4832 might happen under a
> slightly broader range of circumstances (a *java* program happened to reuse
> the same PID), but it seems worthwhile compared to failing consistently with
> a large classpath.
[jira] [Commented] (SPARK-6940) PySpark ML.Tuning Wrappers are missing
[ https://issues.apache.org/jira/browse/SPARK-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498930#comment-14498930 ]

Punya Biswal commented on SPARK-6940:
-

Sorry about the duplicate bug - [~omede] and I were talking about the issue offline and we managed to step on each other's toes.

> PySpark ML.Tuning Wrappers are missing
> --
>
> Key: SPARK-6940
> URL: https://issues.apache.org/jira/browse/SPARK-6940
> Project: Spark
> Issue Type: Improvement
> Components: ML, PySpark
> Affects Versions: 1.3.0
> Reporter: Omede Firouz
>
> PySpark doesn't currently have wrappers for any of the ML.Tuning classes:
> CrossValidator, CrossValidatorModel, ParamGridBuilder
[jira] [Updated] (SPARK-6952) spark-daemon.sh PID reuse check fails on long classpath
[ https://issues.apache.org/jira/browse/SPARK-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Punya Biswal updated SPARK-6952:
Summary: spark-daemon.sh PID reuse check fails on long classpath (was: spark-daemon.sh fails on long classpath)

> spark-daemon.sh PID reuse check fails on long classpath
> ---
>
> Key: SPARK-6952
> URL: https://issues.apache.org/jira/browse/SPARK-6952
> Project: Spark
> Issue Type: Bug
> Components: Deploy
> Affects Versions: 1.3.0
> Reporter: Punya Biswal
>
> {{sbin/spark-daemon.sh}} uses {{ps -p "$TARGET_PID" -o args=}} to figure out
> whether the process running with the expected PID is actually a Spark daemon.
> When running with a large classpath, the output of {{ps}} gets truncated and
> the check fails spuriously.
> I think we should weaken the check to see if it's a java command (which is
> something we do in other parts of the script) rather than looking for the
> specific main class name. This means that SPARK-4832 might happen under a
> slightly broader range of circumstances (a *java* program happened to reuse
> the same PID), but it seems worthwhile compared to failing consistently with
> a large classpath.
[jira] [Created] (SPARK-6952) spark-daemon.sh fails on long classpath
Punya Biswal created SPARK-6952:
---

Summary: spark-daemon.sh fails on long classpath
Key: SPARK-6952
URL: https://issues.apache.org/jira/browse/SPARK-6952
Project: Spark
Issue Type: Bug
Components: Deploy
Affects Versions: 1.3.0
Reporter: Punya Biswal

{{sbin/spark-daemon.sh}} uses {{ps -p "$TARGET_PID" -o args=}} to figure out whether the process running with the expected PID is actually a Spark daemon. When running with a large classpath, the output of {{ps}} gets truncated and the check fails spuriously.

I think we should weaken the check to see if it's a java command (which is something we do in other parts of the script) rather than looking for the specific main class name. This means that SPARK-4832 might happen under a slightly broader range of circumstances (a *java* program happened to reuse the same PID), but it seems worthwhile compared to failing consistently with a large classpath.
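The weakened check described above can be sketched as follows, assuming a POSIX {{ps}}: instead of searching the (possibly truncated) argument list for the Spark main class, only verify that the process under the recorded PID is a java command, by asking {{ps}} for the executable name ({{comm}}), which is unaffected by classpath length. The helper name is an illustrative assumption, not what spark-daemon.sh itself does:

```python
# Sketch of the weakened PID liveness check: compare the executable name
# (ps -o comm=) rather than the full argument list (ps -o args=), since
# comm is never truncated by a long classpath.
import subprocess


def pid_runs_java(pid):
    """Return True if a process with this PID exists and its executable
    name is java; False if the PID is free or runs something else."""
    try:
        out = subprocess.check_output(
            ["ps", "-p", str(pid), "-o", "comm="], text=True
        )
    except subprocess.CalledProcessError:
        return False  # ps exits non-zero when no such process exists
    return out.strip().endswith("java")
```

As the description notes, this trades a small amount of precision (any java process that reuses the PID now passes the check) for a check that no longer fails spuriously on long classpaths.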
[jira] [Created] (SPARK-6947) Make ml.tuning accessible from Python API
Punya Biswal created SPARK-6947:
---

Summary: Make ml.tuning accessible from Python API
Key: SPARK-6947
URL: https://issues.apache.org/jira/browse/SPARK-6947
Project: Spark
Issue Type: Improvement
Components: ML, PySpark
Affects Versions: 1.3.0
Reporter: Punya Biswal

{{CrossValidator}} and {{ParamGridBuilder}} should be available for use in PySpark-based ML pipelines.
[jira] [Created] (SPARK-6731) Upgrade Apache commons-math3 to 3.4.1
Punya Biswal created SPARK-6731:
---

Summary: Upgrade Apache commons-math3 to 3.4.1
Key: SPARK-6731
URL: https://issues.apache.org/jira/browse/SPARK-6731
Project: Spark
Issue Type: Dependency upgrade
Components: Spark Core
Affects Versions: 1.3.0
Reporter: Punya Biswal

Spark depends on Apache commons-math3 version 3.1.1, which is 2 years old. The current version (3.4.1) includes approximate percentile statistics (among other things).