Re: Exception using the new createDirectStream util method

2015-03-20 Thread Alberto Rodriguez
You were absolutely right, Cody! I have just put a message in the Kafka topic before creating the DirectStream and now it is working fine! Do you think I should open an issue to warn that the Kafka topic must contain at least one message before the DirectStream is created? Thank you very much!
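For context, a minimal sketch of the direct-stream setup being discussed, assuming the Spark 1.3 KafkaUtils API; the broker address and topic name are placeholders:

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    object DirectStreamSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("direct-stream-sketch")
        val ssc = new StreamingContext(conf, Seconds(5))

        // Placeholder broker list and topic; per this thread, the topic
        // should contain at least one message before the stream is created.
        val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
        val topics = Set("test-topic")

        val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
          ssc, kafkaParams, topics)

        stream.map(_._2).print()
        ssc.start()
        ssc.awaitTermination()
      }
    }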

Contributing to Spark

2015-03-20 Thread vikas.v.i...@gmail.com
Hi, I have read and gone through most of the Spark tutorials and materials out there. I have also downloaded and built the Spark code base. Can someone point me to an existing JIRA issue where I can start contributing? Eventually I want to make a good contribution to MLlib. Thanks, Vikas

Re: Re: Contributing to Spark

2015-03-20 Thread vikas.v.i...@gmail.com
Jessica, thanks for the links. I am aware of these, but I am looking for some ML-related JIRA issues that I can contribute to as a starting point. Thanks, Vikas On Mar 20, 2015 2:12 PM, Tanyinyan tanyin...@huawei.com wrote: Hello Vikas, These two links may be what you want. jira:

Re: Which linear algebra interface to use within Spark MLlib?

2015-03-20 Thread Burak Yavuz
Hi, We plan to add a more comprehensive local linear algebra package for MLlib 1.4. This local linear algebra package can then easily be extended to BlockMatrix to support the same operations in a distributed fashion. You may find the JIRA to track this here: SPARK-6442
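For readers following SPARK-6442, a small sketch of the distributed BlockMatrix API that the proposed local package would mirror, assuming the MLlib 1.3 linalg API; the block sizes and values are made up for illustration:

    import org.apache.spark.SparkContext
    import org.apache.spark.mllib.linalg.Matrices
    import org.apache.spark.mllib.linalg.distributed.BlockMatrix

    // Build a block-partitioned distributed matrix and multiply it with itself.
    def blockMatrixExample(sc: SparkContext): Unit = {
      val blocks = sc.parallelize(Seq(
        ((0, 0), Matrices.dense(2, 2, Array(1.0, 0.0, 0.0, 1.0))),
        ((1, 1), Matrices.dense(2, 2, Array(2.0, 0.0, 0.0, 2.0)))))
      val mat = new BlockMatrix(blocks, rowsPerBlock = 2, colsPerBlock = 2)
      val product = mat.multiply(mat)   // distributed block-wise multiplication
      println(product.toLocalMatrix())  // collect to a local matrix for inspection
    }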

Re: Error: 'SparkContext' object has no attribute 'getActiveStageIds'

2015-03-20 Thread Ted Yu
Please take a look at core/src/main/scala/org/apache/spark/SparkStatusTracker.scala, around line 58: def getActiveStageIds(): Array[Int] = { Cheers On Fri, Mar 20, 2015 at 3:59 PM, xing ehomec...@gmail.com wrote: getStageInfo in self._jtracker.getStageInfo below seems not
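For reference, a hedged sketch of how that Scala-side status tracker is reached from a SparkContext; the application name is a placeholder:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("status-tracker-sketch"))

    // The tracker lives on the SparkContext; getActiveStageIds is defined in
    // core/src/main/scala/org/apache/spark/SparkStatusTracker.scala.
    val tracker = sc.statusTracker
    val activeStages: Array[Int] = tracker.getActiveStageIds()
    activeStages.foreach { id =>
      // On the Scala side getStageInfo returns an Option rather than None/null.
      tracker.getStageInfo(id).foreach(info => println(s"stage $id: ${info.numTasks()} tasks"))
    }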

Re: Error: 'SparkContext' object has no attribute 'getActiveStageIds'

2015-03-20 Thread xing
getStageInfo in self._jtracker.getStageInfo below seems not to be implemented/included in the current Python library. def getStageInfo(self, stageId): Returns a :class:`SparkStageInfo` object, or None if the stage info could not be found or was garbage collected.

Re: Exception using the new createDirectStream util method

2015-03-20 Thread Cody Koeninger
I went ahead and created https://issues.apache.org/jira/browse/SPARK-6434 to track this On Fri, Mar 20, 2015 at 2:44 AM, Alberto Rodriguez ardl...@gmail.com wrote: You were absolutely right Cody!! I have just put a message in the kafka topic before creating the DirectStream and now is

Filesystem closed Exception

2015-03-20 Thread Sea
Hi, all: When I exit the console of spark-sql, the following exception is thrown: Exception in thread Thread-3 java.io.IOException: Filesystem closed at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:629) at
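One common cause of this message (an assumption here, not confirmed in the thread) is that Hadoop caches FileSystem instances, so one component closing the shared instance on shutdown breaks every other user of it. A minimal sketch of that behavior:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // FileSystem.get returns a cached, shared instance per (scheme, authority, user).
    val conf = new Configuration()
    val fs1 = FileSystem.get(conf)
    val fs2 = FileSystem.get(conf)   // same cached object as fs1

    fs1.close()
    // Any further use of fs2 now fails with "java.io.IOException: Filesystem closed",
    // matching the exception seen when the spark-sql console shuts down.
    // fs2.exists(new Path("/tmp"))  // would throw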

Re: Spark SQL ExternalSorter not stopped

2015-03-20 Thread Yin Huai
Hi Michael, Thanks for reporting it. Yes, it is a bug. I have created https://issues.apache.org/jira/browse/SPARK-6437 to track it. Thanks, Yin On Thu, Mar 19, 2015 at 10:51 AM, Michael Allman mich...@videoamp.com wrote: I've examined the experimental support for ExternalSorter in Spark SQL,

Storage of RDDs created via sc.parallelize

2015-03-20 Thread Karlson
Hi all, where is the data stored that is passed to sc.parallelize? Or, put differently, where is the data for the base RDD fetched from when the DAG is executed, if the base RDD is constructed via sc.parallelize? I am reading a CSV file via the Python csv module and am feeding the parsed
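To make the question concrete, a small sketch (in Scala rather than the Python used in the mail): the parallelized collection lives in driver memory, and its partitions are serialized from the driver into the tasks when an action runs, unless the RDD is explicitly cached on the executors.

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("parallelize-sketch"))

    // The Seq lives in driver memory; parallelize does not move it anywhere yet.
    val data = Seq(1, 2, 3, 4, 5)
    val rdd = sc.parallelize(data, numSlices = 2)

    // Only when an action runs are the partitions serialized from the driver
    // and shipped to the executors as part of the tasks.
    println(rdd.map(_ * 2).sum())

    // To keep the partitions on the executors for reuse, cache explicitly.
    rdd.cache()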

Minor Edit in the programming guide

2015-03-20 Thread Muttineni, Vinay
Hey guys, In the Spark 1.3.0 documentation provided here, http://spark.apache.org/docs/latest/sql-programming-guide.html, under the Programmatically Specifying the Schema section, it is mentioned that the SQL data types are in the package org.apache.spark.sql, but I guess that has changed
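For context, the types did move in 1.3: a short snippet showing the newer import location (org.apache.spark.sql.types) when specifying a schema programmatically; the field names are just examples.

    // In Spark 1.3 the SQL data types live in org.apache.spark.sql.types,
    // not directly in org.apache.spark.sql as the older guide text suggests.
    import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}

    val schema = StructType(Seq(
      StructField("name", StringType, nullable = true),
      StructField("age", IntegerType, nullable = true)))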

Re: Contributing to Spark

2015-03-20 Thread Tanyinyan
Hello Vikas, These two links may be what you want. jira: https://issues.apache.org/jira/browse/SPARK/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel pull request: https://github.com/apache/spark/pulls Regards, Jessica -----Original Message----- From: vikas.v.i...@gmail.com

Connecting a worker to the master after a spark context is made

2015-03-20 Thread Niranda Perera
Hi, Please consider the following scenario. I've started the Spark master by invoking the org.apache.spark.deploy.master.Master.startSystemAndActor method in Java code and connected a worker to it using the org.apache.spark.deploy.worker.Worker.startSystemAndActor method. And then I have
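Not an answer to the embedded Master/Worker question itself, but a hedged sketch of the usual way an application attaches to a standalone master; the host and port are placeholders, and workers that register with that master later are offered to the already-running application under the default scheduling settings.

    import org.apache.spark.{SparkConf, SparkContext}

    // Placeholder master URL; must match the host/port the standalone Master was started with.
    val conf = new SparkConf()
      .setAppName("late-worker-sketch")
      .setMaster("spark://master-host:7077")

    val sc = new SparkContext(conf)

    // Workers started afterwards register with the Master, which then offers
    // their cores to this already-running application.
    sc.parallelize(1 to 100).count()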

Directly broadcasting (sort of) RDDs

2015-03-20 Thread Guillaume Pitel
Hi, I have an idea that I would like to discuss with the Spark devs. The idea comes from a very real problem that I have struggled with for almost a year. My problem is very simple: it is a dense matrix * sparse matrix operation. I have a dense matrix RDD[(Int,FloatMatrix)] which is
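To ground the discussion, a hedged sketch of the workaround such an idea would improve on: collecting the second factor to the driver and broadcasting it, then multiplying each dense block locally. FloatMatrix here stands in for the jblas class mentioned in the mail, the second factor is treated as a plain dense matrix for simplicity, and the collect-then-broadcast step is exactly the cost that broadcasting an RDD directly would avoid.

    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD
    import org.jblas.FloatMatrix

    def multiplyByBroadcast(sc: SparkContext,
                            denseBlocks: RDD[(Int, FloatMatrix)],
                            otherFactor: FloatMatrix): RDD[(Int, FloatMatrix)] = {
      // The whole second factor is collected on the driver and shipped once
      // to every executor; each dense block is then multiplied locally.
      val bc = sc.broadcast(otherFactor)
      denseBlocks.mapValues(block => block.mmul(bc.value))
    }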