[jira] [Created] (SPARK-5262) coalesce should allow NullType and 1 another type in parameters

2015-01-15 Thread Adrian Wang (JIRA)
Adrian Wang created SPARK-5262: -- Summary: coalesce should allow NullType and 1 another type in parameters Key: SPARK-5262 URL: https://issues.apache.org/jira/browse/SPARK-5262 Project: Spark Is

[jira] [Commented] (SPARK-5262) coalesce should allow NullType and 1 another type in parameters

2015-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278416#comment-14278416 ] Apache Spark commented on SPARK-5262: - User 'adrian-wang' has created a pull request f

[jira] [Updated] (SPARK-1084) Fix most build warnings

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1084: -- Reporter: Sean Owen (was: Sean Owen) > Fix most build warnings > --- > >

[jira] [Updated] (SPARK-1181) 'mvn test' fails out of the box since sbt assembly does not necessarily exist

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1181: -- Reporter: Sean Owen (was: Sean Owen) > 'mvn test' fails out of the box since sbt assembly does

[jira] [Updated] (SPARK-1071) Tidy logging strategy and use of log4j

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1071: -- Reporter: Sean Owen (was: Sean Owen) > Tidy logging strategy and use of log4j > ---

[jira] [Updated] (SPARK-1254) Consolidate, order, and harmonize repository declarations in Maven/SBT builds

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1254: -- Reporter: Sean Owen (was: Sean Owen) > Consolidate, order, and harmonize repository declaration

[jira] [Updated] (SPARK-1335) Also increase perm gen / code cache for scalatest when invoked via Maven build

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1335: -- Reporter: Sean Owen (was: Sean Owen) > Also increase perm gen / code cache for scalatest when i

[jira] [Updated] (SPARK-1316) Remove use of Commons IO

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1316: -- Reporter: Sean Owen (was: Sean Owen) > Remove use of Commons IO > > >

[jira] [Updated] (SPARK-2341) loadLibSVMFile doesn't handle regression datasets

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-2341: -- Assignee: Sean Owen (was: Sean Owen) > loadLibSVMFile doesn't handle regression datasets >

[jira] [Updated] (SPARK-1071) Tidy logging strategy and use of log4j

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1071: -- Assignee: Sean Owen (was: Sean Owen) > Tidy logging strategy and use of log4j > ---

[jira] [Updated] (SPARK-2798) Correct several small errors in Flume module pom.xml files

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-2798: -- Assignee: Sean Owen (was: Sean Owen) > Correct several small errors in Flume module pom.xml fil

[jira] [Updated] (SPARK-1315) spark on yarn-alpha with mvn on master branch won't build

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1315: -- Assignee: Sean Owen (was: Sean Owen) > spark on yarn-alpha with mvn on master branch won't buil

[jira] [Updated] (SPARK-2879) Use HTTPS to access Maven Central and other repos

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-2879: -- Assignee: Sean Owen (was: Sean Owen) > Use HTTPS to access Maven Central and other repos >

[jira] [Updated] (SPARK-3803) ArrayIndexOutOfBoundsException found in executing computePrincipalComponents

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-3803: -- Assignee: Sean Owen (was: Sean Owen) > ArrayIndexOutOfBoundsException found in executing comput

[jira] [Updated] (SPARK-2749) Spark SQL Java tests aren't compiling in Jenkins' Maven builds; missing junit:junit dep

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-2749: -- Assignee: Sean Owen (was: Sean Owen) > Spark SQL Java tests aren't compiling in Jenkins' Maven

[jira] [Updated] (SPARK-1556) jets3t dep doesn't update properly with newer Hadoop versions

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1556: -- Assignee: Sean Owen (was: Sean Owen) > jets3t dep doesn't update properly with newer Hadoop ver

[jira] [Updated] (SPARK-1335) Also increase perm gen / code cache for scalatest when invoked via Maven build

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1335: -- Assignee: Sean Owen (was: Sean Owen) > Also increase perm gen / code cache for scalatest when i

[jira] [Updated] (SPARK-2768) Add product, user recommend method to MatrixFactorizationModel

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-2768: -- Assignee: Sean Owen (was: Sean Owen) > Add product, user recommend method to MatrixFactorizatio

[jira] [Updated] (SPARK-2748) Loss of precision for small arguments to Math.exp, Math.log

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-2748: -- Assignee: Sean Owen (was: Sean Owen) > Loss of precision for small arguments to Math.exp, Math.

[jira] [Updated] (SPARK-1973) Add randomSplit to JavaRDD (with tests, and tidy Java tests)

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1973: -- Assignee: Sean Owen (was: Sean Owen) > Add randomSplit to JavaRDD (with tests, and tidy Java te

[jira] [Updated] (SPARK-2745) Add Java friendly methods to Duration class

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-2745: -- Assignee: Sean Owen (was: Sean Owen) > Add Java friendly methods to Duration class > --

[jira] [Updated] (SPARK-1209) SparkHadoop{MapRed,MapReduce}Util should not use package org.apache.hadoop

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1209: -- Assignee: Sean Owen (was: Sean Owen) > SparkHadoop{MapRed,MapReduce}Util should not use package

[jira] [Updated] (SPARK-1316) Remove use of Commons IO

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1316: -- Assignee: Sean Owen (was: Sean Owen) > Remove use of Commons IO > > >

[jira] [Updated] (SPARK-4170) Closure problems when running Scala app that "extends App"

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-4170: -- Assignee: Sean Owen (was: Sean Owen) > Closure problems when running Scala app that "extends Ap

[jira] [Updated] (SPARK-1727) Correct small compile errors, typos, and markdown issues in (primarily) MLlib docs

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1727: -- Assignee: Sean Owen (was: Sean Owen) > Correct small compile errors, typos, and markdown issues

[jira] [Updated] (SPARK-1789) Multiple versions of Netty dependencies cause FlumeStreamSuite failure

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1789: -- Assignee: Sean Owen (was: Sean Owen) > Multiple versions of Netty dependencies cause FlumeStrea

[jira] [Updated] (SPARK-1802) Audit dependency graph when Spark is built with -Phive

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1802: -- Assignee: Sean Owen (was: Sean Owen) > Audit dependency graph when Spark is built with -Phive >

[jira] [Updated] (SPARK-1248) Spark build error with Apache Hadoop(Cloudera CDH4)

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1248: -- Assignee: Sean Owen (was: Sean Owen) > Spark build error with Apache Hadoop(Cloudera CDH4) > --

[jira] [Updated] (SPARK-1120) Send all dependency logging through slf4j

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1120: -- Assignee: Sean Owen (was: Sean Owen) > Send all dependency logging through slf4j >

[jira] [Updated] (SPARK-2363) Clean MLlib's sample data files

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-2363: -- Assignee: Sean Owen (was: Sean Owen) > Clean MLlib's sample data files > --

[jira] [Updated] (SPARK-1254) Consolidate, order, and harmonize repository declarations in Maven/SBT builds

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1254: -- Assignee: Sean Owen (was: Sean Owen) > Consolidate, order, and harmonize repository declaration

[jira] [Updated] (SPARK-1996) Remove use of special Maven repo for Akka

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1996: -- Assignee: Sean Owen (was: Sean Owen) > Remove use of special Maven repo for Akka >

[jira] [Updated] (SPARK-1827) LICENSE and NOTICE files need a refresh to contain transitive dependency info

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1827: -- Assignee: Sean Owen (was: Sean Owen) > LICENSE and NOTICE files need a refresh to contain trans

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2015-01-15 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278453#comment-14278453 ] Guoqiang Li commented on SPARK-1405: We can use the demo scripts in word2vec to get th

[jira] [Updated] (SPARK-1798) Tests should clean up temp files

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1798: -- Assignee: Sean Owen (was: Sean Owen) > Tests should clean up temp files > -

[jira] [Updated] (SPARK-3356) Document when RDD elements' ordering within partitions is nondeterministic

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-3356: -- Assignee: Sean Owen (was: Sean Owen) > Document when RDD elements' ordering within partitions i

[jira] [Updated] (SPARK-2955) Test code fails to compile with "mvn compile" without "install"

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-2955: -- Assignee: Sean Owen (was: Sean Owen) > Test code fails to compile with "mvn compile" without "i

[jira] [Updated] (SPARK-2034) KafkaInputDStream doesn't close resources and may prevent JVM shutdown

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-2034: -- Assignee: Sean Owen (was: Sean Owen) > KafkaInputDStream doesn't close resources and may preven

[jira] [Updated] (SPARK-1084) Fix most build warnings

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1084: -- Assignee: Sean Owen (was: Sean Owen) > Fix most build warnings > --- > >

[jira] [Updated] (SPARK-1663) Spark Streaming docs code has several small errors

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-1663: -- Assignee: Sean Owen (was: Sean Owen) > Spark Streaming docs code has several small errors > ---

[jira] [Updated] (SPARK-2602) sbt/sbt test steals window focus on OS X

2015-01-15 Thread Tony Stevenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Stevenson updated SPARK-2602: -- Assignee: Sean Owen (was: Sean Owen) > sbt/sbt test steals window focus on OS X > -

[jira] [Created] (SPARK-5263) `create table` DDL need to check if table exists first

2015-01-15 Thread shengli (JIRA)
shengli created SPARK-5263: -- Summary: `create table` DDL need to check if table exists first Key: SPARK-5263 URL: https://issues.apache.org/jira/browse/SPARK-5263 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-5263) `create table` DDL need to check if table exists first

2015-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278476#comment-14278476 ] Apache Spark commented on SPARK-5263: - User 'OopsOutOfMemory' has created a pull reque

[jira] [Created] (SPARK-5264) support `drop table` DDL command

2015-01-15 Thread shengli (JIRA)
shengli created SPARK-5264: -- Summary: support `drop table` DDL command Key: SPARK-5264 URL: https://issues.apache.org/jira/browse/SPARK-5264 Project: Spark Issue Type: Bug Components: SQL

[jira] [Commented] (SPARK-5243) Spark will hang if (driver memory + executor memory) exceeds limit on a 1-worker cluster

2015-01-15 Thread Takumi Yoshida (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278502#comment-14278502 ] Takumi Yoshida commented on SPARK-5243: --- Hi! I found that Spark hangs with the following si

[jira] [Updated] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve

2015-01-15 Thread Vladimir Grigor (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Grigor updated SPARK-5246: --- Description: ##How to reproduce: 1) http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VP

[jira] [Updated] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve

2015-01-15 Thread Vladimir Grigor (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Grigor updated SPARK-5246: --- Description: How to reproduce: 1) http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC

[jira] [Commented] (SPARK-5012) Python API for Gaussian Mixture Model

2015-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278614#comment-14278614 ] Apache Spark commented on SPARK-5012: - User 'FlytxtRnD' has created a pull request for

[jira] [Commented] (SPARK-5264) support `drop table` DDL command

2015-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278639#comment-14278639 ] Apache Spark commented on SPARK-5264: - User 'OopsOutOfMemory' has created a pull reque

[jira] [Created] (SPARK-5265) Submitting applications on Standalone cluster controlled by Zookeeper forces to know active master

2015-01-15 Thread Roque Vassal'lo (JIRA)
Roque Vassal'lo created SPARK-5265: -- Summary: Submitting applications on Standalone cluster controlled by Zookeeper forces to know active master Key: SPARK-5265 URL: https://issues.apache.org/jira/browse/SPARK-52

[jira] [Created] (SPARK-5266) numExecutorsFailed should exclude number of killExecutor in yarn mode

2015-01-15 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-5266: --- Summary: numExecutorsFailed should exclude number of killExecutor in yarn mode Key: SPARK-5266 URL: https://issues.apache.org/jira/browse/SPARK-5266 Project: Spark

[jira] [Commented] (SPARK-5266) numExecutorsFailed should exclude number of killExecutor in yarn mode

2015-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278684#comment-14278684 ] Apache Spark commented on SPARK-5266: - User 'lianhuiwang' has created a pull request f

[jira] [Closed] (SPARK-5266) numExecutorsFailed should exclude number of killExecutor in yarn mode

2015-01-15 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang closed SPARK-5266. --- Resolution: Fixed > numExecutorsFailed should exclude number of killExecutor in yarn mode > --

[jira] [Commented] (SPARK-4943) Parsing error for query with table name having dot

2015-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278710#comment-14278710 ] Apache Spark commented on SPARK-4943: - User 'scwf' has created a pull request for this

[jira] [Created] (SPARK-5267) Add a streaming module to ingest Apache Camel Messages from a configured endpoints

2015-01-15 Thread Steve Brewin (JIRA)
Steve Brewin created SPARK-5267: --- Summary: Add a streaming module to ingest Apache Camel Messages from a configured endpoints Key: SPARK-5267 URL: https://issues.apache.org/jira/browse/SPARK-5267 Projec

[jira] [Commented] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve

2015-01-15 Thread Vladimir Grigor (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278780#comment-14278780 ] Vladimir Grigor commented on SPARK-5246: https://github.com/mesos/spark-ec2/pull/9

[jira] [Created] (SPARK-5268) ExecutorBackend exits for irrelevant DisassociatedEvent

2015-01-15 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-5268: -- Summary: ExecutorBackend exits for irrelevant DisassociatedEvent Key: SPARK-5268 URL: https://issues.apache.org/jira/browse/SPARK-5268 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-5268) CoarseGrainedExecutorBackend exits for irrelevant DisassociatedEvent

2015-01-15 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-5268: --- Summary: CoarseGrainedExecutorBackend exits for irrelevant DisassociatedEvent (was: ExecutorBackend exits for

[jira] [Commented] (SPARK-5268) ExecutorBackend exits for irrelevant DisassociatedEvent

2015-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278786#comment-14278786 ] Apache Spark commented on SPARK-5268: - User 'CodingCat' has created a pull request for

[jira] [Created] (SPARK-5269) BlockManager.dataDeserialize always creates a new serializer instance

2015-01-15 Thread Ivan Vergiliev (JIRA)
Ivan Vergiliev created SPARK-5269: - Summary: BlockManager.dataDeserialize always creates a new serializer instance Key: SPARK-5269 URL: https://issues.apache.org/jira/browse/SPARK-5269 Project: Spark

[jira] [Commented] (SPARK-5097) Adding data frame APIs to SchemaRDD

2015-01-15 Thread Hamel Ajay Kothari (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278819#comment-14278819 ] Hamel Ajay Kothari commented on SPARK-5097: --- Am I correct in interpreting that t

[jira] [Updated] (SPARK-5267) Add a streaming module to ingest Apache Camel Messages from a configured endpoints

2015-01-15 Thread Steve Brewin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Brewin updated SPARK-5267: Description: The number of input stream protocols supported by Spark Streaming is quite limited, wh

[jira] [Commented] (SPARK-5012) Python API for Gaussian Mixture Model

2015-01-15 Thread Travis Galoppo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278947#comment-14278947 ] Travis Galoppo commented on SPARK-5012: --- This will probably be affected by SPARK-501

[jira] [Created] (SPARK-5270) Elegantly check if RDD is empty

2015-01-15 Thread Al M (JIRA)
Al M created SPARK-5270: --- Summary: Elegantly check if RDD is empty Key: SPARK-5270 URL: https://issues.apache.org/jira/browse/SPARK-5270 Project: Spark Issue Type: Improvement Affects Versions: 1.2

[jira] [Updated] (SPARK-5270) Elegantly check if RDD is empty

2015-01-15 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Al M updated SPARK-5270: Description: Right now there is no clean way to check if an RDD is empty. As discussed here: http://apache-spark-

[jira] [Commented] (SPARK-5270) Elegantly check if RDD is empty

2015-01-15 Thread Al M (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278983#comment-14278983 ] Al M commented on SPARK-5270: - I just noticed that rdd.partitions.size is set to 0 for empty R

[jira] [Commented] (SPARK-5270) Elegantly check if RDD is empty

2015-01-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278993#comment-14278993 ] Sean Owen commented on SPARK-5270: -- I think it's conceivable to have an RDD with no eleme

[jira] [Commented] (SPARK-5185) pyspark --jars does not add classes to driver class path

2015-01-15 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279040#comment-14279040 ] Marcelo Vanzin commented on SPARK-5185: --- BTW I talked to Uri offline about this. The

[jira] [Commented] (SPARK-5097) Adding data frame APIs to SchemaRDD

2015-01-15 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279078#comment-14279078 ] Reynold Xin commented on SPARK-5097: [~hkothari] that is correct. It will be trivially

[jira] [Updated] (SPARK-5271) PySpark History Web UI issues

2015-01-15 Thread Andrey Zimovnov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Zimovnov updated SPARK-5271: --- Component/s: Web UI > PySpark History Web UI issues > - > >

[jira] [Created] (SPARK-5271) PySpark History Web UI issues

2015-01-15 Thread Andrey Zimovnov (JIRA)
Andrey Zimovnov created SPARK-5271: -- Summary: PySpark History Web UI issues Key: SPARK-5271 URL: https://issues.apache.org/jira/browse/SPARK-5271 Project: Spark Issue Type: Bug Affects V

[jira] [Updated] (SPARK-5268) CoarseGrainedExecutorBackend exits for irrelevant DisassociatedEvent

2015-01-15 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-5268: --- Priority: Blocker (was: Major) > CoarseGrainedExecutorBackend exits for irrelevant DisassociatedEvent > -

[jira] [Commented] (SPARK-5226) Add DBSCAN Clustering Algorithm to MLlib

2015-01-15 Thread Muhammad-Ali A'rabi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279170#comment-14279170 ] Muhammad-Ali A'rabi commented on SPARK-5226: This is DBSCAN algorithm: {nofor

[jira] [Comment Edited] (SPARK-5226) Add DBSCAN Clustering Algorithm to MLlib

2015-01-15 Thread Muhammad-Ali A'rabi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279170#comment-14279170 ] Muhammad-Ali A'rabi edited comment on SPARK-5226 at 1/15/15 7:33 PM: ---

[jira] [Resolved] (SPARK-5224) parallelize list/ndarray is really slow

2015-01-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-5224. --- Resolution: Fixed Fix Version/s: 1.2.1 1.3.0 Issue resolved by pull request

[jira] [Updated] (SPARK-5224) parallelize list/ndarray is really slow

2015-01-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5224: -- Assignee: Davies Liu > parallelize list/ndarray is really slow > ---

[jira] [Commented] (SPARK-5111) HiveContext and Thriftserver cannot work in secure cluster beyond hadoop2.5

2015-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279216#comment-14279216 ] Apache Spark commented on SPARK-5111: - User 'zhzhan' has created a pull request for th

[jira] [Created] (SPARK-5272) Refactor NaiveBayes to support discrete and continuous labels,features

2015-01-15 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5272: Summary: Refactor NaiveBayes to support discrete and continuous labels,features Key: SPARK-5272 URL: https://issues.apache.org/jira/browse/SPARK-5272 Project:

[jira] [Created] (SPARK-5273) Improve documentation examples for LinearRegression

2015-01-15 Thread Dev Lakhani (JIRA)
Dev Lakhani created SPARK-5273: -- Summary: Improve documentation examples for LinearRegression Key: SPARK-5273 URL: https://issues.apache.org/jira/browse/SPARK-5273 Project: Spark Issue Type: Im

[jira] [Commented] (SPARK-5272) Refactor NaiveBayes to support discrete and continuous labels,features

2015-01-15 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279235#comment-14279235 ] Joseph K. Bradley commented on SPARK-5272: -- My initial thoughts: (1) Are continu

[jira] [Comment Edited] (SPARK-5272) Refactor NaiveBayes to support discrete and continuous labels,features

2015-01-15 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279235#comment-14279235 ] Joseph K. Bradley edited comment on SPARK-5272 at 1/15/15 8:13 PM: -

[jira] [Commented] (SPARK-4894) Add Bernoulli-variant of Naive Bayes

2015-01-15 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279241#comment-14279241 ] Joseph K. Bradley commented on SPARK-4894: -- [~rnowling] I too don't want to hold

[jira] [Commented] (SPARK-4894) Add Bernoulli-variant of Naive Bayes

2015-01-15 Thread RJ Nowling (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279250#comment-14279250 ] RJ Nowling commented on SPARK-4894: --- Thanks, [~josephkb]! I'd be happy to help with the

[jira] [Commented] (SPARK-5012) Python API for Gaussian Mixture Model

2015-01-15 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279251#comment-14279251 ] Joseph K. Bradley commented on SPARK-5012: -- [~MeethuMathew], [~tgaloppo] makes a

[jira] [Commented] (SPARK-5272) Refactor NaiveBayes to support discrete and continuous labels,features

2015-01-15 Thread RJ Nowling (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279258#comment-14279258 ] RJ Nowling commented on SPARK-5272: --- Hi [~josephkb], I can see benefits to your sugges

[jira] [Commented] (SPARK-5272) Refactor NaiveBayes to support discrete and continuous labels,features

2015-01-15 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279269#comment-14279269 ] Joseph K. Bradley commented on SPARK-5272: -- I like the idea of supporting multipl

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2015-01-15 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279274#comment-14279274 ] Joseph K. Bradley commented on SPARK-1405: -- I'll try out the statmt dataset if th

[jira] [Comment Edited] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2015-01-15 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279274#comment-14279274 ] Joseph K. Bradley edited comment on SPARK-1405 at 1/15/15 9:29 PM: -

[jira] [Commented] (SPARK-5274) Stabilize UDFRegistration API

2015-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279352#comment-14279352 ] Apache Spark commented on SPARK-5274: - User 'rxin' has created a pull request for this

[jira] [Created] (SPARK-5274) Stabilize UDFRegistration API

2015-01-15 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-5274: -- Summary: Stabilize UDFRegistration API Key: SPARK-5274 URL: https://issues.apache.org/jira/browse/SPARK-5274 Project: Spark Issue Type: Sub-task Compon

[jira] [Commented] (SPARK-4879) Missing output partitions after job completes with speculative execution

2015-01-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279406#comment-14279406 ] Josh Rosen commented on SPARK-4879: --- I'm not sure that SparkHadoopWriter's use of FileOu

[jira] [Commented] (SPARK-5144) spark-yarn module should be published

2015-01-15 Thread Matthew Sanders (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279457#comment-14279457 ] Matthew Sanders commented on SPARK-5144: +1 -- I am in a similar situation and wou

[jira] [Commented] (SPARK-4746) integration tests should be separated from faster unit tests

2015-01-15 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279524#comment-14279524 ] Imran Rashid commented on SPARK-4746: - This doesn't work as well as I thought -- all o

[jira] [Created] (SPARK-5275) pyspark.streaming is not included in assembly jar

2015-01-15 Thread Davies Liu (JIRA)
Davies Liu created SPARK-5275: - Summary: pyspark.streaming is not included in assembly jar Key: SPARK-5275 URL: https://issues.apache.org/jira/browse/SPARK-5275 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-5276) pyspark.streaming is not included in assembly jar

2015-01-15 Thread Davies Liu (JIRA)
Davies Liu created SPARK-5276: - Summary: pyspark.streaming is not included in assembly jar Key: SPARK-5276 URL: https://issues.apache.org/jira/browse/SPARK-5276 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-5274) Stabilize UDFRegistration API

2015-01-15 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-5274. Resolution: Fixed Fix Version/s: 1.3.0 > Stabilize UDFRegistration API >

[jira] [Commented] (SPARK-3622) Provide a custom transformation that can output multiple RDDs

2015-01-15 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279602#comment-14279602 ] Imran Rashid commented on SPARK-3622: - In some ways this kinda reminds of the problem

[jira] [Created] (SPARK-5277) SparkSqlSerializer does not register user specified KryoRegistrators

2015-01-15 Thread Max Seiden (JIRA)
Max Seiden created SPARK-5277: - Summary: SparkSqlSerializer does not register user specified KryoRegistrators Key: SPARK-5277 URL: https://issues.apache.org/jira/browse/SPARK-5277 Project: Spark

[jira] [Updated] (SPARK-5277) SparkSqlSerializer does not register user specified KryoRegistrators

2015-01-15 Thread Max Seiden (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Seiden updated SPARK-5277: -- Remaining Estimate: (was: 24h) Original Estimate: (was: 24h) > SparkSqlSerializer does not

[jira] [Commented] (SPARK-5193) Make Spark SQL API usable in Java and remove the Java-specific API

2015-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279648#comment-14279648 ] Apache Spark commented on SPARK-5193: - User 'rxin' has created a pull request for this
