Re: StreamingContext.textFileStream issue

2015-04-25 Thread Yang Lei
On Apr 25, 2015 at 3:02 AM, Yang Lei genia...@gmail.com wrote: I hit the same issue, with the stream behaving as if the directory had no files at all, when running the sample examples/src/main/python/streaming/hdfs_wordcount.py with a local directory and adding a file to that directory. I would appreciate comments on how to resolve

Re: StreamingContext.textFileStream issue

2015-04-24 Thread Yang Lei
I hit the same issue, with the stream behaving as if the directory had no files at all, when running the sample examples/src/main/python/streaming/hdfs_wordcount.py with a local directory and adding a file to that directory. I would appreciate comments on how to resolve this.
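One thing worth ruling out, offered as an assumption and not as a diagnosis of this thread: the Spark Streaming programming guide asks that files be created in the monitored directory by atomically moving or renaming them into it, so a file written in place may not be picked up. The sketch below stages a file elsewhere and then renames it into the watched directory. The helper name `publish_atomically` and the `staging_dir` / `watched_dir` parameters are hypothetical, and the two directories are assumed to be on the same filesystem so that the rename is atomic.

```python
import os
import tempfile


def publish_atomically(text, staging_dir, watched_dir, name):
    """Write `text` to a staging file, then rename it into `watched_dir`.

    Hypothetical helper: both directories are assumed to live on the same
    filesystem, otherwise the rename is not guaranteed to be atomic.
    """
    os.makedirs(staging_dir, exist_ok=True)
    os.makedirs(watched_dir, exist_ok=True)

    # Write the full content outside the watched directory first.
    fd, tmp_path = tempfile.mkstemp(dir=staging_dir)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)

    # Move the finished file into place in a single rename step.
    final_path = os.path.join(watched_dir, name)
    os.replace(tmp_path, final_path)
    return final_path
```

With this approach the streaming job would be started first, and `publish_atomically("hello world\n", "/tmp/staging", "/tmp/watched", "input1.txt")` called afterwards, so that the file appears in the watched directory while the stream is running. The paths here are placeholders.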

Re: Spark on Mesos

2015-04-24 Thread Yang Lei
I run Spark on Mesos either by running spark-submit in a Docker container using Marathon or from one of the nodes in the Mesos cluster. I am on Mesos 0.21. I have tried both Spark 1.3.1 and 1.2.1, rebuilt for Hadoop 2.4 and above. Some details on the configuration: I made sure that spark is

Issue of running partitioned loading (RDD) in Spark External Datasource on Mesos

2015-04-20 Thread Yang Lei
I implemented two kinds of DataSource: one loads data during buildScan, and the other returns my RDD class with partition information for future loading. My RDD's compute gets the actorSystem from SparkEnv.get.actorSystem, then uses Spray to interact with an HTTP endpoint, which is the same flow as

Cloudant as Spark SQL External Datastore on Spark 1.3.0

2015-03-19 Thread Yang Lei
Check this out: https://github.com/cloudant/spark-cloudant. It supports both the DataFrame and SQL approaches for reading data from Cloudant and saving it. Looking forward to your feedback on the project. Yang

Re: Question on Spark 1.3 SQL External Datasource

2015-03-17 Thread Yang Lei
Thanks, Cheng, for the clarification. Looking forward to the new API mentioned below. Yang On Mar 17, 2015, at 8:05 PM, Cheng Lian lian.cs@gmail.com wrote: Hey Yang, My comments are in-lined below. Cheng On 3/18/15 6:53 AM, Yang Lei wrote: Hello, I am

Question on Spark 1.3 SQL External Datasource

2015-03-17 Thread Yang Lei
Hello, I am migrating my Spark SQL external datasource integration from Spark 1.2.x to Spark 1.3. I noticed there are a couple of new filters now, e.g. org.apache.spark.sql.sources.And. However, for a SQL query with the condition A AND B, I noticed PrunedFilteredScan.buildScan still gets an