Re: Spark 2.3.0 and Custom Sink

2018-06-21 Thread Yogesh Mahajan
Since ForeachWriter works at the record level, you cannot do bulk ingest
into KairosDB, which supports bulk inserts, so it will be slow.
Instead, you can provide your own Sink implementation, which works at the
batch (DataFrame) level (see the sketch below).
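
A minimal sketch of such a batch-level Sink, assuming the Spark 2.3 internal
Sink/StreamSinkProvider APIs; the KairosDB bulk-insert call is a hypothetical
placeholder:

import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.execution.streaming.Sink
import org.apache.spark.sql.sources.StreamSinkProvider
import org.apache.spark.sql.streaming.OutputMode

// Batch-level sink: one bulk insert per micro-batch.
class KairosSink extends Sink {
  override def addBatch(batchId: Long, data: DataFrame): Unit = {
    // Data is small (~10 MB per trigger), so collecting on the driver is fine here.
    val rows = data.collect()
    // kairosClient.bulkInsert(rows)   // hypothetical bulk-ingest call
  }
}

class KairosSinkProvider extends StreamSinkProvider {
  override def createSink(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      partitionColumns: Seq[String],
      outputMode: OutputMode): Sink = new KairosSink
}

// Usage (sketch): df.writeStream.format(classOf[KairosSinkProvider].getName).start()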

Thanks,
http://www.snappydata.io/blog 

On Thu, Jun 21, 2018 at 10:54 AM, subramgr 
wrote:

> Hi Spark Mailing list,
>
> We are looking to push the output of a structured streaming query
> to KairosDB (a time series database).
>
> What would be the recommended way of doing this? Do we implement the *Sink*
> trait or do we use the *ForEachWriter*?
>
> At each trigger point, if I do a *dataset.collect()*, the size of the data is
> not huge; it should be in the lower 10 MBs.
>
> Any suggestions?
>
> Thanks
> Girish
>
>
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
>
>


Inefficient state management in stream to stream join in 2.3

2018-02-13 Thread Yogesh Mahajan
In 2.3, stream-to-stream joins (both inner and outer) are implemented using
the symmetric hash join (SHJ) algorithm, which is a good choice, and I am
sure you compared it with other families of algorithms such as XJoin and
non-blocking sort-based algorithms like progressive merge join (PMJ).

*From a functional point of view -*
1. It covers most stream-to-stream join use cases, and the considerations
around event time and watermarks as join keys are well thought through.
2. It also adopts an effective approach to join state management: exploit
'hard' constraints in the input streams to reduce state rather than
exploiting statistical properties as 'soft' constraints (see the example below).
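
For illustration, a minimal sketch of how those 'hard' constraints (watermarks
plus an event-time range condition) bound the join state in the 2.3 API; the
stream names and columns are made up for the example:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

val spark = SparkSession.builder().appName("watermark-join").getOrCreate()

// Impressions and clicks as streaming DataFrames (rate source used as a stand-in).
val impressions = spark.readStream.format("rate").load()
  .selectExpr("value AS impressionAdId", "timestamp AS impressionTime")
val clicks = spark.readStream.format("rate").load()
  .selectExpr("value AS clickAdId", "timestamp AS clickTime")

// Watermarks + a time-range join condition let Spark drop old join state.
val joined = impressions.withWatermark("impressionTime", "2 hours")
  .join(
    clicks.withWatermark("clickTime", "3 hours"),
    expr("""
      clickAdId = impressionAdId AND
      clickTime >= impressionTime AND
      clickTime <= impressionTime + interval 1 hour
    """))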

*From a performance point of view -*
SHJ assumes that the entire join state can be kept in main memory, but the
StateStore in Spark is backed by an HDFS-compatible file system.
Also, looking at the code of StreamingSymmetricHashJoinExec, two StateStores
(KeyToNumValuesStore, KeyWithIndexToValueStore) are used, and the multiple
lookups to them in each StreamExecution (MicroBatch/ContinuousExecution),
per partition per operator, will carry a huge performance penalty even for a
moderate state size in queries like groupBy "SYMBOL".

To overcome this perf hit, even if we implement our own efficient
in-memory StateStore, there is no way to avoid these multiple lookups
unless you have your own StreamingSymmetricHashJoinExec implementation.

We should consider using the efficient main-memory data structures described
in this paper, which are suited for storing sliding windows, with efficient
support for removing tuples that have fallen out of the state.

Another way to reduce unnecessary state is to use punctuations
(in contrast to the existing approach, where constraints have to be known a
priori). A punctuation is a tuple of patterns specifying a predicate that
must evaluate to false for all future data tuples in the stream, and these
can be inserted dynamically.

For example, consider a join of two streams, auctionStream and bidStream.
When a particular auction closes, the system inserts a punctuation into the
bidStream to signal that there will be no more bids for that auction, and
purges those tuples that cannot possibly join with future arrivals. PJoin
is one example of a stream join algorithm that exploits punctuations.

Thanks,
http://www.snappydata.io/blog 


Re: [Structured Streaming] Avoiding multiple streaming queries

2018-02-13 Thread Yogesh Mahajan
I had a similar issue, and I think this is where the structured streaming
design falls short.
Seems like Question#2 in your email is a viable workaround for you.

In my case, I have a custom Sink backed by an efficient in-memory column
store suited for fast ingestion.

I have a Kafka stream coming from one topic, and I need to classify the
stream based on schema.
For example, a Kafka topic can have three different types of schema
messages and I would like to ingest into the three different column
tables(having different schema) using my custom Sink implementation.

Right now the only(?) option I have is to create three streaming queries
reading the same topic and ingesting into their respective column tables
using their Sink implementations (see the sketch below).
Underneath, these three streaming queries create three IncrementalExecutions
and three KafkaSources - three queries reading the same data from the same
Kafka topic.
Even with CachedKafkaConsumers at the partition level, this is not an
efficient way to handle a simple streaming use case.
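
A minimal sketch of that three-query workaround, assuming the messages carry a
discriminating "type" field and that the custom Sink is registered under a
hypothetical format name "columnStoreSink"; brokers, topic, and paths are
placeholders:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.get_json_object

val spark = SparkSession.builder().appName("split-by-schema").getOrCreate()
import spark.implicits._

val raw = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "events")
  .load()
  .selectExpr("CAST(value AS STRING) AS json")

// One query per schema type; each one re-reads the same topic.
Seq("typeA", "typeB", "typeC").foreach { t =>
  raw.filter(get_json_object($"json", "$.type") === t)
    .writeStream
    .format("columnStoreSink")                         // hypothetical custom Sink provider
    .option("table", s"column_table_$t")
    .option("checkpointLocation", s"/tmp/checkpoints/$t")
    .start()
}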

One workaround to overcome this limitation is to have the same schema for all
the messages in a Kafka partition; unfortunately this is not in our control,
and customers cannot change it due to their dependencies on other
subsystems.

Thanks,
http://www.snappydata.io/blog 

On Mon, Feb 12, 2018 at 5:54 PM, Priyank Shrivastava  wrote:

> I have a structured streaming query which sinks to Kafka.  This query has
> a complex aggregation logic.
>
>
> I would like to sink the output DF of this query to multiple Kafka topics
> each partitioned on a different ‘key’ column.  I don’t want to have
> multiple Kafka sinks for each of the different Kafka topics because that
> would mean running multiple streaming queries - one for each Kafka topic,
> especially since my aggregation logic is complex.
>
>
> Questions:
>
> 1.  Is there a way to output the results of a structured streaming query
> to multiple Kafka topics each with a different key column but without
> having to execute multiple streaming queries?
>
>
> 2.  If not,  would it be efficient to cascade the multiple queries such
> that the first query does the complex aggregation and writes output
> to Kafka and then the other queries just read the output of the first query
> and write their topics to Kafka thus avoiding doing the complex aggregation
> again?
>
>
> Thanks in advance for any help.
>
>
> Priyank
>
>
>


Re: Max number of streams supported ?

2018-01-31 Thread Yogesh Mahajan
Thanks Michael, TD for quick reply. It was helpful. I will let you know the
numbers(limit) based on my experiments.

On Wed, Jan 31, 2018 at 3:10 PM, Tathagata Das 
wrote:

> Just to clarify a subtle difference between DStreams and Structured
> Streaming. Multiple input streams in a DStreamGraph is likely to mean they
> are all being processed/computed in the same way as there can be only one
> streaming query / context active in the StreamingContext. However, in the
> case of Structured Streaming, there can be any number of independent
> streaming queries (i.e. different computations), and each streaming query
> can have any number of separate input sources. So Michael's comment that "each
> stream will have a thread on the driver" is correct when there are many
> independent queries with different computations running simultaneously.
> However, if all your streams need to be processed in the same way, then it's
> one streaming query with many inputs, and it will require one thread.
>
> Hope this helps.
>
> TD
>
> On Wed, Jan 31, 2018 at 12:39 PM, Michael Armbrust  > wrote:
>
>> -dev +user
>>
>>
>>> Similarly for structured streaming, Would there be any limit on number
>>> of of streaming sources I can have ?
>>>
>>
>> There is no fundamental limit, but each stream will have a thread on the
>> driver that is doing coordination of execution.  We comfortably run 20+
>> streams on a single cluster in production, but I have not pushed the
>> limits.  You'd want to test with your specific application.
>>
>
>
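
For reference, a minimal sketch of the "one streaming query with many inputs"
case TD describes above: two sources unioned and driven by a single query (and
hence a single driver thread). The rate source and its settings are
placeholders for illustration.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("union-inputs").getOrCreate()

val s1 = spark.readStream.format("rate").option("rowsPerSecond", "5").load()
val s2 = spark.readStream.format("rate").option("rowsPerSecond", "10").load()

// Both inputs feed one query; the same computation is applied to the union.
val query = s1.union(s2)
  .groupBy().count()
  .writeStream
  .outputMode("complete")
  .format("console")
  .start()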


Re: Do I need to install Cassandra node on Spark Master node to work with Cassandra?

2016-05-04 Thread Yogesh Mahajan
You can have a Spark master node where Cassandra is not running locally; I
have tried this before.
The Spark cluster and the Cassandra cluster can be on different hosts, but to
colocate them, you can run both the executor and the Cassandra node on the
same host.


Thanks,
http://www.snappydata.io/blog 

On Thu, May 5, 2016 at 6:06 AM, Vinayak Agrawal 
wrote:

> Hi All,
> I am working with a Cassandra cluster and moving towards installing Spark.
> However, I came across this Stackoverflow question which has confused me.
>
> http://stackoverflow.com/questions/33897586/apache-spark-driver-instead-of-just-the-executors-tries-to-connect-to-cassand
>
> Question:
> Do I need to install cassandra node on my Spark Master node so that Spark
> can connect with cassandra
> or
> Cassandra only needs to be on Spark worker nodes? It seems logical
> considering data locality.
>
> Thanks
>
> --
> Vinayak Agrawal
>
>
> "To Strive, To Seek, To Find and Not to Yield!"
> ~Lord Alfred Tennyson
>


Re: [Streaming] textFileStream has no events shown in web UI

2016-04-11 Thread Yogesh Mahajan
Yes, I have observed this in my case also. The Input Rate is 0 even in the
case of rawSocketStream.
Is there a way we can enable the Input Rate for these types of streams?

Thanks,
http://www.snappydata.io/blog 

On Wed, Mar 16, 2016 at 4:21 PM, Hao Ren  wrote:

> Just a quick question,
>
> When using textFileStream, I did not see any events via the web UI.
> Actually, I am uploading files to S3 every 5 seconds,
> and the mini-batch duration is 30 seconds.
> On the web UI:
>
>  *Input Rate*
> Avg: 0.00 events/sec
>
> But the schedule time and processing time are correct, and the output of
> the stream is also correct. Not sure why the web UI has not detected any events.
>
> Thank you.
>
> --
> Hao Ren
>
> Data Engineer @ leboncoin
>
> Paris, France
>


Re: Scala types to StructType

2016-02-11 Thread Yogesh Mahajan
CatatlystTypeConverters.scala has all types of utility methods to convert
from Scala to Row and vice versa.
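
As a related public-API sketch (not from the thread): a StructType can also be
derived from Scala types via Encoders, assuming the fields can be modeled as a
case class; the class below is illustrative only.

import org.apache.spark.sql.Encoders
import org.apache.spark.sql.types.StructType

// Illustrative case class standing in for the Map's keys and value types.
case class Measurement(Amean: Double, Asize: Int, Bmean: Double)

// The schema is derived from the Scala types; no manual field mapping needed.
val schema: StructType = Encoders.product[Measurement].schema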

On Fri, Feb 12, 2016 at 12:21 AM, Rishabh Wadhawan 
wrote:

> I had the same issue. I resolved it in Java, but I am pretty sure it would
> work with Scala too. It's kind of a gross hack. Say I had a table in MySQL
> with 1000 columns: what I did is run a JDBC query to extract the schema of
> the table. I stored that schema and wrote a map function to create
> StructFields using StructType and RowFactory. Then I loaded that table as a
> DataFrame, even though it had a schema, and converted that DataFrame into
> an RDD, which is when it lost the schema. I then performed something using
> that RDD and converted the RDD back with the StructFields.
> If your source is a structured type, it would be better to load it directly
> as a DataFrame; that way you can preserve the schema. However, in your case
> you should do something like this:
>
> List<StructField> fields = new ArrayList<>();
> for (String key : map.keySet()) {
>     fields.add(DataTypes.createStructField(key, DataTypes.StringType, true));
> }
>
> StructType schemaOfDataFrame = DataTypes.createStructType(fields);
>
> sqlContext.createDataFrame(rdd, schemaOfDataFrame);
>
> This is how I would do it in Java; not sure about the Scala syntax.
> Please tell me if that helped.
>
> On Feb 11, 2016, at 7:20 AM, Fabian Böhnlein 
> wrote:
>
> Hi all,
>
> is there a way to create a Spark SQL Row schema based on Scala data types
> without creating a manual mapping?
>
> That's the only example I can find which doesn't require
> spark.sql.types.DataType already as input, but it requires defining the
> types as strings:
>
> val struct = (new StructType)
>   .add("a", "int")
>   .add("b", "long")
>   .add("c", "string")
>
>
>
> Specifically I have an RDD where each element is a Map of 100s of
> variables with different data types which I want to transform to a DataFrame
> where the keys should end up as the column names:
>
> Map ("Amean" -> 20.3, "Asize" -> 12, "Bmean" -> )
>
>
> Is there a different possibility than building a mapping from the values'
> .getClass to the Spark SQL DataTypes?
>
>
> Thanks,
> Fabian
>
>
>
>
>


Re: Scala types to StructType

2016-02-11 Thread Yogesh Mahajan
Right, Thanks Ted.

On Fri, Feb 12, 2016 at 10:21 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Minor correction: the class is CatalystTypeConverters.scala
>
> On Thu, Feb 11, 2016 at 8:46 PM, Yogesh Mahajan <ymaha...@snappydata.io>
> wrote:
>
>> CatatlystTypeConverters.scala has all types of utility methods to convert
>> from Scala to Row and vice versa.
>>
>>
>> On Fri, Feb 12, 2016 at 12:21 AM, Rishabh Wadhawan <rishabh...@gmail.com>
>> wrote:
>>
>>> I had the same issue. I resolved it in Java, but I am pretty sure it
>>> would work with Scala too. It's kind of a gross hack. Say I had a table in
>>> MySQL with 1000 columns: what I did is run a JDBC query to extract the
>>> schema of the table. I stored that schema and wrote a map function to
>>> create StructFields using StructType and RowFactory. Then I loaded that
>>> table as a DataFrame, even though it had a schema, and converted that
>>> DataFrame into an RDD, which is when it lost the schema. I then performed
>>> something using that RDD and converted the RDD back with the StructFields.
>>> If your source is a structured type, it would be better to load it
>>> directly as a DataFrame; that way you can preserve the schema. However, in
>>> your case you should do something like this:
>>>
>>> List<StructField> fields = new ArrayList<>();
>>> for (String key : map.keySet()) {
>>>     fields.add(DataTypes.createStructField(key, DataTypes.StringType, true));
>>> }
>>>
>>> StructType schemaOfDataFrame = DataTypes.createStructType(fields);
>>>
>>> sqlContext.createDataFrame(rdd, schemaOfDataFrame);
>>>
>>> This is how I would do it in Java; not sure about the Scala syntax.
>>> Please tell me if that helped.
>>>
>>> On Feb 11, 2016, at 7:20 AM, Fabian Böhnlein <fabian.boehnl...@gmail.com>
>>> wrote:
>>>
>>> Hi all,
>>>
>>> is there a way to create a Spark SQL Row schema based on Scala data
>>> types without creating a manual mapping?
>>>
>>> That's the only example I can find which doesn't require
>>> spark.sql.types.DataType already as input, but it requires defining the
>>> types as strings:
>>>
>>> val struct = (new StructType)
>>>   .add("a", "int")
>>>   .add("b", "long")
>>>   .add("c", "string")
>>>
>>>
>>>
>>> Specifically I have an RDD where each element is a Map of 100s of
>>> variables with different data types which I want to transform to a DataFrame
>>> where the keys should end up as the column names:
>>>
>>> Map ("Amean" -> 20.3, "Asize" -> 12, "Bmean" -> )
>>>
>>>
>>> Is there a different possibility than building a mapping from the
>>> values' .getClass to the Spark SQL DataTypes?
>>>
>>>
>>> Thanks,
>>> Fabian
>>>
>>>
>>>
>>>
>>>
>>
>


Re: Spark Streaming : Limiting number of receivers per executor

2016-02-10 Thread Yogesh Mahajan
Hi Ajay,

Have you overridden the Receiver#preferredLocation method in your custom
Receiver? You can specify a hostname for your Receiver. Check
ReceiverSchedulingPolicy#scheduleReceivers; it should honor your
preferredLocation value for Receiver scheduling (see the sketch below).
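
A minimal sketch of overriding preferredLocation in a custom receiver,
assuming a host name passed in by the caller; the actual receive loop is left
out:

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

class PinnedNetworkReceiver(host: String)
    extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  // ReceiverSchedulingPolicy#scheduleReceivers tries to place this receiver
  // on an executor running on the given host.
  override def preferredLocation: Option[String] = Some(host)

  override def onStart(): Unit = {
    // start a background thread that reads network data and calls store(...)
  }

  override def onStop(): Unit = {
    // stop the background thread / close sockets
  }
}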


On Wed, Feb 10, 2016 at 4:04 PM, ajay garg  wrote:

> Hi All,
>  I am running 3 executors in my spark streaming application with 3
> cores per executor. I have written my own custom receiver for receiving
> network data.
>
> In my current configuration I am launching 3 receivers, one receiver per
> executor.
>
> At runtime, if 2 of my executors die, I am left with only one executor and
> all 3 receivers are scheduled on that executor. Since this executor has only
> 3 cores and all cores are busy running the 3 receivers, the action on the
> accumulated window data (DStream) is not scheduled and my application hangs.
>
> Is there a way to restrict the number of receivers per executor so that I am
> always left with a core to run the action on the DStream?
>
> Thanks
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Limiting-number-of-receivers-per-executor-tp26192.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
>
>


Re: Explanation for info shown in UI

2016-02-01 Thread Yogesh Mahajan
The number of jobs depends on the number of output operations (print,
foreachRDD, saveAs*Files) and the number of RDD actions inside those output
operations.

For example:
dstream1.foreachRDD { rdd => rdd.count }  // ONE Spark job per batch
dstream1.foreachRDD { rdd => { rdd.count; rdd.count } }  // TWO Spark jobs per batch
dstream1.foreachRDD { rdd => rdd.count }; dstream2.foreachRDD { rdd => rdd.count }  // TWO Spark jobs per batch

Regards,
Yogesh Mahajan
SnappyData Inc (snappydata.io)

On Thu, Jan 28, 2016 at 4:30 PM, Sachin Aggarwal <different.sac...@gmail.com
> wrote:

> Hi
>
> I am executing a streaming wordcount with kafka
> with one test topic with 2 partition
> my cluster have three spark executors
>
> Each batch is of 10 sec
>
> For every batch (e.g. *batch time 02:51:00* below) I see 3 entries in the
> Spark UI, as shown below.
>
> My questions:
> 1) As the label says Job Id for the first column, does Spark submit 3 jobs
> for each batch?
> 2) When I tried decreasing executors/nodes, the job count also changed;
> what is the relation with the number of executors?
> 3) Only one job actually executes the stage; the other two show it as
> skipped. Why were the other jobs created?
>
> Job Id | Description | Submitted | Duration | Stages: Succeeded/Total | Tasks (for all stages): Succeeded/Total
> 221 | Streaming job from [output operation 0, batch time 02:51:00] print at StreamingWordCount.scala:54 | 2016/01/28 02:51:00 | 46 ms | 1/1 (1 skipped) | 1/1 (3 skipped)
> 220 | Streaming job from [output operation 0, batch time 02:51:00] print at StreamingWordCount.scala:54 | 2016/01/28 02:51:00 | 47 ms | 1/1 (1 skipped) | 4/4 (3 skipped)
> 219 | Streaming job from [output operation 0, batch time 02:51:00] print at StreamingWordCount.scala:54 | 2016/01/28 02:51:00 | 48 ms | 2/2 | 4/4
>
> --
>
> Thanks & Regards
>
> Sachin Aggarwal
> 7760502772
>


Re: Re: spark streaming context trigger invoke stop why?

2016-01-13 Thread Yogesh Mahajan
Hi Triones,

Check org.apache.spark.util.ShutdownHookManager: it adds this shutdown hook
when you start a StreamingContext.

Here is the code in StreamingContext.start()

shutdownHookRef = ShutdownHookManager.addShutdownHook(
  StreamingContext.SHUTDOWN_HOOK_PRIORITY)(stopOnShutdown)

Also look at the following def in StreamingContext, which actually stops
the context from the shutdown hook:

private def stopOnShutdown(): Unit = {
  val stopGracefully =
    conf.getBoolean("spark.streaming.stopGracefullyOnShutdown", false)
  logInfo(s"Invoking stop(stopGracefully=$stopGracefully) from shutdown hook")
  // Do not stop SparkContext, let its own shutdown hook stop it
  stop(stopSparkContext = false, stopGracefully = stopGracefully)
}
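
As a small illustrative sketch (not from the original thread), this is how an
application would opt into the graceful path of that shutdown hook via the
config the code above reads; the socket source is a placeholder:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("graceful-shutdown-demo")
  // stopOnShutdown() above reads this flag and passes it to stop(...)
  .set("spark.streaming.stopGracefullyOnShutdown", "true")

val ssc = new StreamingContext(conf, Seconds(10))
val lines = ssc.socketTextStream("localhost", 9999)  // placeholder source
lines.count().print()
ssc.start()
ssc.awaitTermination()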

Regards,
Yogesh Mahajan,
SnappyData Inc, snappydata.io

On Thu, Jan 14, 2016 at 8:55 AM, Triones,Deng(vip.com) <
triones.d...@vipshop.com> wrote:

> More info
>
>
>
> I am using spark version 1.5.2
>
>
>
>
>
> *From:* Triones,Deng(vip.com) [mailto:triones.d...@vipshop.com]
> *Sent:* January 14, 2016 11:24
> *To:* user
> *Subject:* spark streaming context trigger invoke stop why?
>
>
>
> Hi all
>
>  As I saw in the driver log, a task failed 4 times in a stage; the
> stage is dropped when the input block is deleted before it can be used.
> After that, the StreamingContext invokes stop. Does anyone know what kind of
> akka message triggers the stop, or which code triggers the shutdown hook?
>
>
>
>
>
> Thanks
>
>
>
>
>
>
>
>
>
> Driver log:
>
>
>
>  Job aborted due to stage failure: Task 410 in stage 215.0 failed 4 times
>
> [org.apache.spark.streaming.StreamingContext---Thread-0]: Invoking
> stop(stopGracefully=false) from shutdown hook
>
>
>


Re: Manipulate Twitter Stream Filter on runtime

2016-01-13 Thread Yogesh Mahajan
Hi Alem,

I haven't tried it, but can you give TwitterStream.cleanUp a try and then add
your modified filter, and see if that works? I am using twitter4j 4.0.4 with
Spark Streaming.
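
A rough sketch of that idea inside the custom receiver; the twitter4j 4.0.x
method names (cleanUp, filter, track) are assumed from memory, so please
verify them against your twitter4j version:

import twitter4j.{FilterQuery, TwitterStream}

// Stop consuming the current filtered stream, then re-open it with new keywords.
def updateKeywords(stream: TwitterStream, keywords: Seq[String]): Unit = {
  stream.cleanUp()
  stream.filter(new FilterQuery().track(keywords: _*))
}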

Regards,
Yogesh Mahajan
SnappyData Inc, snappydata.io

On Mon, Jan 11, 2016 at 6:43 PM, Filli Alem <alem.fi...@ti8m.ch> wrote:

> Hi,
>
>
>
> I am trying to implement Twitter stream processing where I want to
> change the filtered keywords at run time. I implemented the Twitter
> stream with a custom receiver, which works fine. I'm stuck on the runtime
> alteration now.
>
>
>
> Any ideas?
>
>
>
> Thanks
>
> Alem
>
>
>
>
> <https://www.ti8m.ch/de/offering/products.html>
>


Re: Re: spark streaming context trigger invoke stop why?

2016-01-13 Thread Yogesh Mahajan
All the action happens in ApplicationMaster, especially in its run method.
Check ApplicationMaster#startUserApplication: the userThread (driver) created
there invokes the ApplicationMaster#finish method. You can also try
System.exit in your program.

Regards,
Yogesh Mahajan,
SnappyData Inc, snappydata.io

On Thu, Jan 14, 2016 at 9:56 AM, Yogesh Mahajan <ymaha...@snappydata.io>
wrote:

> Hi Triones,
>
> Check org.apache.spark.util.ShutdownHookManager: it adds this shutdown hook
> when you start a StreamingContext.
>
> Here is the code in StreamingContext.start()
>
> shutdownHookRef = ShutdownHookManager.addShutdownHook(
>   StreamingContext.SHUTDOWN_HOOK_PRIORITY)(stopOnShutdown)
>
> Also look at the following def in StreamingContext, which actually stops
> the context from the shutdown hook:
>
> private def stopOnShutdown(): Unit = {
>   val stopGracefully =
>     conf.getBoolean("spark.streaming.stopGracefullyOnShutdown", false)
>   logInfo(s"Invoking stop(stopGracefully=$stopGracefully) from shutdown hook")
>   // Do not stop SparkContext, let its own shutdown hook stop it
>   stop(stopSparkContext = false, stopGracefully = stopGracefully)
> }
>
> Regards,
> Yogesh Mahajan,
> SnappyData Inc, snappydata.io
>
> On Thu, Jan 14, 2016 at 8:55 AM, Triones,Deng(vip.com) <
> triones.d...@vipshop.com> wrote:
>
>> More info
>>
>>
>>
>> I am using spark version 1.5.2
>>
>>
>>
>>
>>
>> *From:* Triones,Deng(vip.com) [mailto:triones.d...@vipshop.com]
>> *Sent:* January 14, 2016 11:24
>> *To:* user
>> *Subject:* spark streaming context trigger invoke stop why?
>>
>>
>>
>> Hi all
>>
>>  As I saw in the driver log, a task failed 4 times in a stage; the
>> stage is dropped when the input block is deleted before it can be used.
>> After that, the StreamingContext invokes stop. Does anyone know what kind of
>> akka message triggers the stop, or which code triggers the shutdown hook?
>>
>>
>>
>>
>>
>> Thanks
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Driver log:
>>
>>
>>
>>  Job aborted due to stage failure: Task 410 in stage 215.0 failed 4 times
>>
>> [org.apache.spark.streaming.StreamingContext---Thread-0]: Invoking
>> stop(stopGracefully=false) from shutdown hook
>>
>>
>>
>
>


New spark meetup

2015-09-30 Thread Yogesh Mahajan
Hi,

Can you please get this new spark meetup listed on the spark community page -
http://spark.apache.org/community.html#events

Here is the link for the meetup in Pune, India:
http://www.meetup.com/Pune-Apache-Spark-Meetup/

Thanks,
Yogesh

Sent from my iPhone