[jira] [Created] (SPARK-1420) The maven build error for Spark Catalyst
witgo created SPARK-1420:
----------------------------

             Summary: The maven build error for Spark Catalyst
                 Key: SPARK-1420
                 URL: https://issues.apache.org/jira/browse/SPARK-1420
             Project: Spark
          Issue Type: Bug
          Components: Build
            Reporter: witgo

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (SPARK-1420) The maven build error for Spark Catalyst
[ https://issues.apache.org/jira/browse/SPARK-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961126#comment-13961126 ]

witgo commented on SPARK-1420:
------------------------------

{code}
mvn -Pyarn -Dhadoop.version=2.3.0 -Dyarn.version=2.3.0 -DskipTests install
{code}
=>
{code}
[ERROR] /Users/witgo/work/code/java/spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala:31: object runtime is not a member of package reflect
[ERROR] import scala.reflect.runtime.universe._
{code}
[jira] [Resolved] (SPARK-1366) The sql function should be consistent between different types of SQLContext
[ https://issues.apache.org/jira/browse/SPARK-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Armbrust resolved SPARK-1366.
-------------------------------------
    Resolution: Fixed

> The sql function should be consistent between different types of SQLContext
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-1366
>                 URL: https://issues.apache.org/jira/browse/SPARK-1366
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Michael Armbrust
>            Assignee: Michael Armbrust
>            Priority: Blocker
>             Fix For: 1.0.0
>
> Right now calling `context.sql` will cause things to be parsed with different
> parsers, which is kinda confusing. Instead HiveContext should have a
> specialized `hiveql` method that uses the HiveQL parser.
> Also need to update the documentation.
[jira] [Created] (SPARK-1421) Make MLlib work on Python 2.6 and NumPy < 1.7
Matei Zaharia created SPARK-1421:
---------------------------------

             Summary: Make MLlib work on Python 2.6 and NumPy < 1.7
                 Key: SPARK-1421
                 URL: https://issues.apache.org/jira/browse/SPARK-1421
             Project: Spark
          Issue Type: Bug
            Reporter: Matei Zaharia

Currently it requires Python 2.7 and newer NumPy because it uses some new APIs, but they should not be essential for running our code.
[jira] [Updated] (SPARK-1421) Make MLlib work on Python 2.6 and NumPy < 1.7
[ https://issues.apache.org/jira/browse/SPARK-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matei Zaharia updated SPARK-1421:
---------------------------------
    Affects Version/s: 0.9.1
                       0.9.0
[jira] [Updated] (SPARK-1421) Make MLlib work on Python 2.6 and NumPy < 1.7
[ https://issues.apache.org/jira/browse/SPARK-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matei Zaharia updated SPARK-1421:
---------------------------------
    Component/s: PySpark
                 MLlib
[jira] [Created] (SPARK-1423) Add scripts for launching Spark on Windows Azure
Matei Zaharia created SPARK-1423:
---------------------------------

             Summary: Add scripts for launching Spark on Windows Azure
                 Key: SPARK-1423
                 URL: https://issues.apache.org/jira/browse/SPARK-1423
             Project: Spark
          Issue Type: Improvement
            Reporter: Matei Zaharia
[jira] [Created] (SPARK-1422) Add scripts for launching Spark on Google Compute Engine
Matei Zaharia created SPARK-1422:
---------------------------------

             Summary: Add scripts for launching Spark on Google Compute Engine
                 Key: SPARK-1422
                 URL: https://issues.apache.org/jira/browse/SPARK-1422
             Project: Spark
          Issue Type: Improvement
          Components: EC2
            Reporter: Matei Zaharia
[jira] [Assigned] (SPARK-1309) sbt assemble-deps no longer works
[ https://issues.apache.org/jira/browse/SPARK-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Davidson reassigned SPARK-1309:
-------------------------------------
    Assignee: Aaron Davidson

> sbt assemble-deps no longer works
> ---------------------------------
>
>                 Key: SPARK-1309
>                 URL: https://issues.apache.org/jira/browse/SPARK-1309
>             Project: Spark
>          Issue Type: New Feature
>          Components: Build
>    Affects Versions: 1.0.0
>            Reporter: Shivaram Venkataraman
>            Assignee: Aaron Davidson
>            Priority: Blocker
>             Fix For: 1.0.0
>
> After the Catalyst merge the sbt assemble-deps workflow no longer works. Here
> are the steps to reproduce:
>
> sbt/sbt clean
> sbt/sbt assemble-deps
> ./bin/spark-shell
>
> Error: Could not find or load main class org.apache.spark.repl.Main
>
> The error comes from the fact that compute-classpath.sh does not include the
> class files if the hive assembly jar is found.
> One fix would be to not build the hive assembly jar when assemble-deps is
> called.
[jira] [Commented] (SPARK-1393) fix computePreferredLocations signature to not depend on underlying implementation
[ https://issues.apache.org/jira/browse/SPARK-1393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961278#comment-13961278 ]

Mridul Muralidharan commented on SPARK-1393:
--------------------------------------------

Merged https://github.com/apache/spark/pull/302

> fix computePreferredLocations signature to not depend on underlying
> implementation
> -------------------------------------------------------------------
>
>                 Key: SPARK-1393
>                 URL: https://issues.apache.org/jira/browse/SPARK-1393
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>         Environment: All
>            Reporter: Mridul Muralidharan
>             Fix For: 1.0.0
>
> computePreferredLocations in
> core/src/main/scala/org/apache/spark/scheduler/InputFormatInfo.scala: change
> from using mutable HashMap/HashSet to Map/Set
[jira] [Resolved] (SPARK-1393) fix computePreferredLocations signature to not depend on underlying implementation
[ https://issues.apache.org/jira/browse/SPARK-1393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mridul Muralidharan resolved SPARK-1393.
----------------------------------------
    Resolution: Fixed
[jira] [Created] (SPARK-1424) InsertInto should work on JavaSchemaRDD as well.
Michael Armbrust created SPARK-1424:
-----------------------------------

             Summary: InsertInto should work on JavaSchemaRDD as well.
                 Key: SPARK-1424
                 URL: https://issues.apache.org/jira/browse/SPARK-1424
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 1.0.0
            Reporter: Michael Armbrust
            Assignee: Michael Armbrust
            Priority: Blocker
[jira] [Commented] (SPARK-1424) InsertInto should work on JavaSchemaRDD as well.
[ https://issues.apache.org/jira/browse/SPARK-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961283#comment-13961283 ]

Matei Zaharia commented on SPARK-1424:
--------------------------------------

More generally we should have flags to support the following:
* Inserting data into an existing table
* Creating a new table, only if it does not exist
* Overwriting an existing table
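The three behaviours Matei lists are the classic write modes for a table sink. As a minimal illustration (names are hypothetical; this is not Spark's actual API), a save-mode flag could dispatch like this:

```python
from enum import Enum

class SaveMode(Enum):
    APPEND = "append"        # insert data into an existing table
    IGNORE = "ignore"        # create a new table, only if it does not exist
    OVERWRITE = "overwrite"  # overwrite an existing table

def write_table(catalog, name, rows, mode):
    """Apply `rows` to `catalog[name]` according to the save mode.

    `catalog` is a plain dict standing in for a metastore."""
    if mode is SaveMode.APPEND:
        catalog.setdefault(name, []).extend(rows)
    elif mode is SaveMode.IGNORE:
        if name not in catalog:
            catalog[name] = list(rows)
    elif mode is SaveMode.OVERWRITE:
        catalog[name] = list(rows)
    return catalog
```

The point of making the mode an explicit parameter is that the caller, not the sink, decides what happens when the target table already exists.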
[jira] [Created] (SPARK-1425) PySpark can crash Executors if worker.py fails while serializing data
Matei Zaharia created SPARK-1425:
---------------------------------

             Summary: PySpark can crash Executors if worker.py fails while serializing data
                 Key: SPARK-1425
                 URL: https://issues.apache.org/jira/browse/SPARK-1425
             Project: Spark
          Issue Type: Bug
    Affects Versions: 0.9.0
            Reporter: Matei Zaharia

The PythonRDD code that talks to the worker will keep calling stream.readInt() and allocating an array of that size. Unfortunately, if the worker gives it corrupted data, it will attempt to allocate a huge array and get an OutOfMemoryError. It would be better to use a different stream to give feedback, *or* only write an object out to the stream once it's been properly pickled to bytes or to a string.
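The failure mode described here is a length-prefixed protocol whose reader trusts the length field blindly. A hedged sketch of the safer pattern (cap the frame size and fail with a clear error instead of allocating whatever the corrupted header says; the names and the 64 MB cap are illustrative, not PythonRDD's actual code):

```python
import io
import struct

MAX_FRAME_BYTES = 64 * 1024 * 1024  # reject absurd lengths instead of OOM-ing

def read_frame(stream):
    """Read one length-prefixed frame, validating the length first."""
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("stream closed mid-header")
    (length,) = struct.unpack(">i", header)  # big-endian, like Java's writeInt
    if length < 0 or length > MAX_FRAME_BYTES:
        raise IOError("corrupt frame length: %d" % length)
    payload = stream.read(length)
    if len(payload) < length:
        raise EOFError("stream closed mid-frame")
    return payload
```

A corrupted header now surfaces as a descriptive IOError on the reading side rather than an OutOfMemoryError in the Executor.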
[jira] [Updated] (SPARK-1421) Make MLlib work on Python 2.6
[ https://issues.apache.org/jira/browse/SPARK-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matei Zaharia updated SPARK-1421:
---------------------------------
    Description: Currently it requires Python 2.7 because it uses some new APIs, but they should not be essential for running our code.
    (was: Currently it requires Python 2.7 and newer NumPy because it uses some new APIs, but they should not be essential for running our code.)
[jira] [Updated] (SPARK-1421) Make MLlib work on Python 2.6
[ https://issues.apache.org/jira/browse/SPARK-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matei Zaharia updated SPARK-1421:
---------------------------------
    Summary: Make MLlib work on Python 2.6  (was: Make MLlib work on Python 2.6 and NumPy < 1.7)
[jira] [Created] (SPARK-1426) Make MLlib work with NumPy versions older than 1.7
Matei Zaharia created SPARK-1426:
---------------------------------

             Summary: Make MLlib work with NumPy versions older than 1.7
                 Key: SPARK-1426
                 URL: https://issues.apache.org/jira/browse/SPARK-1426
             Project: Spark
          Issue Type: Improvement
          Components: MLlib, PySpark
            Reporter: Matei Zaharia

Currently it requires NumPy 1.7 due to using the copyto method (http://docs.scipy.org/doc/numpy/reference/generated/numpy.copyto.html) for extracting data out of an array, but we could add a fallback for older versions.
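A fallback of the kind suggested here is small: `numpy.copyto` was added in NumPy 1.7, so its absence can be feature-detected and replaced with slice assignment, which works on older versions. A minimal sketch (the helper name is made up for illustration):

```python
import numpy as np

def copy_into(dst, src):
    """Copy src into dst, using np.copyto on NumPy >= 1.7 and
    falling back to slice assignment on older versions."""
    if hasattr(np, "copyto"):
        np.copyto(dst, src)
    else:
        dst[...] = src  # works on pre-1.7 NumPy
    return dst
```

Feature detection via hasattr is preferable to parsing the version string, since it tests for exactly the capability that is needed.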
[jira] [Assigned] (SPARK-1421) Make MLlib work on Python 2.6
[ https://issues.apache.org/jira/browse/SPARK-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matei Zaharia reassigned SPARK-1421:
------------------------------------
    Assignee: Matei Zaharia
[jira] [Resolved] (SPARK-1421) Make MLlib work on Python 2.6
[ https://issues.apache.org/jira/browse/SPARK-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matei Zaharia resolved SPARK-1421.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 0.9.2
                   1.0.0
[jira] [Created] (SPARK-1427) HQL Examples Don't Work
Patrick Wendell created SPARK-1427:
-----------------------------------

             Summary: HQL Examples Don't Work
                 Key: SPARK-1427
                 URL: https://issues.apache.org/jira/browse/SPARK-1427
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.0.0
            Reporter: Patrick Wendell
            Assignee: Michael Armbrust
             Fix For: 1.0.0

{code}
scala> hql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
14/04/05 22:40:29 INFO ParseDriver: Parsing command: CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
14/04/05 22:40:30 INFO ParseDriver: Parse Completed
14/04/05 22:40:30 INFO Driver:
14/04/05 22:40:30 INFO Driver:
14/04/05 22:40:30 INFO Driver:
14/04/05 22:40:30 INFO Driver:
14/04/05 22:40:30 INFO ParseDriver: Parsing command: CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
14/04/05 22:40:30 INFO ParseDriver: Parse Completed
14/04/05 22:40:30 INFO Driver:
14/04/05 22:40:30 INFO Driver:
14/04/05 22:40:30 INFO SemanticAnalyzer: Starting Semantic Analysis
14/04/05 22:40:30 INFO SemanticAnalyzer: Creating table src position=27
14/04/05 22:40:30 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
14/04/05 22:40:30 INFO ObjectStore: ObjectStore, initialize called
14/04/05 22:40:30 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
14/04/05 22:40:30 WARN BoneCPConfig: Max Connections < 1. Setting to 20
14/04/05 22:40:32 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
14/04/05 22:40:32 INFO ObjectStore: Initialized ObjectStore
14/04/05 22:40:33 WARN BoneCPConfig: Max Connections < 1. Setting to 20
14/04/05 22:40:33 INFO HiveMetaStore: 0: get_table : db=default tbl=src
14/04/05 22:40:33 INFO audit: ugi=patrick ip=unknown-ip-addr cmd=get_table : db=default tbl=src
14/04/05 22:40:33 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
14/04/05 22:40:33 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
14/04/05 22:40:34 INFO Driver: Semantic Analysis Completed
14/04/05 22:40:34 INFO Driver:
14/04/05 22:40:34 INFO Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
14/04/05 22:40:34 INFO Driver:
14/04/05 22:40:34 INFO Driver:
14/04/05 22:40:34 INFO Driver: Starting command: CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
14/04/05 22:40:34 INFO Driver:
14/04/05 22:40:34 INFO Driver:
14/04/05 22:40:34 INFO Driver:
14/04/05 22:40:34 INFO Driver:
14/04/05 22:40:34 INFO Driver: OK
14/04/05 22:40:34 INFO Driver:
14/04/05 22:40:34 INFO Driver:
14/04/05 22:40:34 INFO Driver:
14/04/05 22:40:34 INFO Driver:
14/04/05 22:40:34 INFO Driver:
java.lang.AssertionError: assertion failed: No plan for NativeCommand
CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
    at scala.Predef$.assert(Predef.scala:179)
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.apply(QueryPlanner.scala:59)
    at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:218)
    at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:218)
    at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:219)
    at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:219)
    at org.apache.spark.sql.SchemaRDDLike$class.toString(SchemaRDDLike.scala:44)
    at org.apache.spark.sql.SchemaRDD.toString(SchemaRDD.scala:93)
    at java.lang.String.valueOf(String.java:2854)
    at scala.runtime.ScalaRunTime$.stringOf(ScalaRunTime.scala:331)
    at scala.runtime.ScalaRunTime$.replStringOf(ScalaRunTime.scala:337)
    at .(:10)
    at .()
    at $print()
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
{code}

{code}
scala> hql("select count(*) from src")
14/04/05 22:47:13 INFO ParseDriver: Parsing command: select count(*) from src
14/04/05 22:47:13 INFO ParseDriver: Parse Completed
14/04/05 22:47:13 INFO HiveMetaStore: 0: get_table : db=default tbl=src
14/04/05 22:47:13 INFO audit: ugi=patrick ip=unknown-ip-addr cmd=get_table : db=default tbl=src
14/04/05 22:47:13 INFO MemoryStore: ensureFreeSpace(147107) called with curMem=0, maxMem=308713881
14/04/05 22:47:13 INFO MemoryStore: Block broadcast_0 stored as values to
{code}
[jira] [Created] (SPARK-1428) MLlib should convert non-float64 NumPy arrays to float64 instead of complaining
Matei Zaharia created SPARK-1428:
---------------------------------

             Summary: MLlib should convert non-float64 NumPy arrays to float64 instead of complaining
                 Key: SPARK-1428
                 URL: https://issues.apache.org/jira/browse/SPARK-1428
             Project: Spark
          Issue Type: Improvement
          Components: MLlib, PySpark
            Reporter: Matei Zaharia
            Priority: Minor

Pretty easy to fix, it would avoid spewing some scary task-failed errors. The place to fix this is _serialize_double_vector in _common.py.
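The coercion being asked for is a one-liner at the top of the serializer: accept whatever numeric dtype the user passes and convert it to float64 rather than raising. A minimal sketch of the idea (the function name is illustrative, not the actual body of _serialize_double_vector):

```python
import numpy as np

def as_float64_vector(v):
    """Accept a NumPy array (or any sequence) of numbers and return
    a float64 array, converting instead of complaining."""
    arr = np.asarray(v)
    if arr.dtype != np.float64:
        arr = arr.astype(np.float64)  # copies; float64 input passes through
    return arr
```

Using np.asarray first means plain Python lists are accepted too, and float64 inputs avoid an unnecessary copy.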
[jira] [Updated] (SPARK-1351) Documentation Improvements for Spark 1.0
[ https://issues.apache.org/jira/browse/SPARK-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell updated SPARK-1351:
-----------------------------------
    Description:
Umbrella to track necessary doc improvements. We can break these out into other JIRA's over time.
- Use grouping in the RDD and SparkContext scaladocs. See Schema RDD: http://people.apache.org/~pwendell/catalyst-docs/api/sql/core/index.html#org.apache.spark.sql.SchemaRDD
- Use spark-submit script wherever possible in docs.
- Have package-level documentation in Scaladoc.

    was:
Umbrella to track necessary doc improvements. We can break these out into other JIRA's over time.
- Use spark-submit script wherever possible in docs.
- Have package-level documentation in Scaladoc.