Hi,
Thank you for all those answers.

Below is the code I am trying out:

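import scala.concurrent.duration._  // provides the `100.seconds` syntax

// Read CSV files from the input directory as a stream
val records = sparkSession.read.format("csv").stream("/tmp/input")

// Write the stream out as Parquet, asking for a 100-second processing-time trigger
val re = records.write
  .format("parquet")
  .trigger(ProcessingTime(100.seconds))
  .option("checkpointLocation", "/tmp/checkpoint")
  .startStream("/tmp/output")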


re.awaitTermination()


In the above code, I assumed the batch interval would be 100 seconds, but it
doesn't seem to work that way.
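
To make that assumption concrete, here is a small plain-Scala sketch of the
schedule I expected from a 100-second trigger. It does not use Spark and is not
Spark's implementation; TriggerSketch and batchStartTimes are hypothetical
names made up for illustration, and the processing times are invented sample
values. Spark's actual scheduling may differ from this.

```scala
// Plain-Scala simulation only; no Spark involved. All names are hypothetical.
object TriggerSketch {
  // Returns the start time (in seconds) of each batch, assuming:
  //  - the first batch starts at time 0,
  //  - a new batch starts one interval after the previous start,
  //  - unless the previous batch overran the interval, in which case
  //    the next batch starts as soon as the previous one finishes.
  def batchStartTimes(intervalSec: Long, processingSecs: Seq[Long]): Seq[Long] =
    processingSecs
      .scanLeft(0L) { (start, cost) =>
        math.max(start + intervalSec, start + cost)
      }
      .init // drop the start time of the batch after the last one

  def main(args: Array[String]): Unit = {
    // Third batch takes 250s, longer than the 100s interval.
    val starts = batchStartTimes(100L, Seq(10L, 30L, 250L, 5L))
    println(starts.mkString(", ")) // prints "0, 100, 200, 450"
  }
}
```

With this model, batches start every 100 seconds as long as each one finishes
in time, and only a slow batch pushes the next start later.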


On Fri, May 6, 2016 at 3:14 PM, Sachin Aggarwal <different.sac...@gmail.com>
wrote:

> Hi Madhukara,
>
> What I understood from the code is that whenever runBatch returns, they
> trigger constructBatch, so if you don't specify a trigger, the processing
> time of a batch effectively becomes your batch interval.
>
> One flaw I see in this: if your processing time keeps increasing with the
> amount of data, then the batch interval keeps increasing too. They should
> add some bound or blocking logic to prevent that case.
>
> Here is a related pull request that I found:
> https://github.com/apache/spark/pull/12725
>
>
> On Fri, May 6, 2016 at 2:50 PM, Deepak Sharma <deepakmc...@gmail.com>
> wrote:
>
>> With Structured Streaming, Spark will provide APIs over the Spark SQL
>> engine. Once you have a structured stream and a DataFrame created from it,
>> you can run ad-hoc queries on the DataFrame, which means you are actually
>> querying the stream without having to store or transform it.
>> I have not used it yet, but it seems it will start streaming data from the
>> source as soon as you define it.
>>
>> Thanks
>> Deepak
>>
>>
>> On Fri, May 6, 2016 at 1:37 PM, madhu phatak <phatak....@gmail.com>
>> wrote:
>>
>>> Hi,
>>> As I was playing with the new Structured Streaming API, I noticed that
>>> Spark starts processing as and when the data appears. It no longer seems
>>> like micro-batch processing. Will Spark Structured Streaming be
>>> event-based processing?
>>>
>>> --
>>> Regards,
>>> Madhukara Phatak
>>> http://datamantra.io/
>>>
>>
>>
>>
>> --
>> Thanks
>> Deepak
>> www.bigdatabig.com
>> www.keosha.net
>>
>
>
>
> --
>
> Thanks & Regards
>
> Sachin Aggarwal
> 7760502772
>



-- 
Regards,
Madhukara Phatak
http://datamantra.io/
