Re: Re: spark streaming and kinesis integration

2023-04-12 Thread Mich Talebzadeh
evolving. So if anyone is interested, > please support the project. > > -- > Lingzhe Sun > Hirain Technologies > > > *From:* Mich Talebzadeh > *Date:* 2023-04-11 02:06 > *To:* Rajesh Katkar > *CC:* user > *Subject:* Re: spark streami

Re: Re: spark streaming and kinesis integration

2023-04-12 Thread 孙令哲
support the project. Lingzhe Sun Hirain Technologies From: Mich Talebzadeh Date: 2023-04-11 02:06 To: Rajesh Katkar CC: user Subject: Re: spark streaming and kinesis integration What I said was this: "In so far as I know k8s does not support spark structured streaming?" So it is an ope

Re: Re: spark streaming and kinesis integration

2023-04-12 Thread Yi Huang
>> >> >> *From:* Mich Talebzadeh >> *Date:* 2023-04-11 02:06 >> *To:* Rajesh Katkar >> *CC:* user >> *Subject:* Re: spark streaming and kinesis integration >> What I said was this >> "In so far as I know k8s does not support spark structured stre

Re: Re: spark streaming and kinesis integration

2023-04-12 Thread Rajesh Katkar
s interested, > please support the project. > > -- > Lingzhe Sun > Hirain Technologies > > > *From:* Mich Talebzadeh > *Date:* 2023-04-11 02:06 > *To:* Rajesh Katkar > *CC:* user > *Subject:* Re: spark streaming and kinesis integration

Re: Re: spark streaming and kinesis integration

2023-04-11 Thread Lingzhe Sun
nt to read/write to kinesis streams using k8s. Officially I could not find the connector or reader for kinesis from spark like it has for kafka. Checking here if anyone used kinesis and spark streaming combination? On Thu, 6 Apr, 2023, 7:23 pm Mich Talebzadeh, wrote: Hi Rajesh, What is th

Re: spark streaming and kinesis integration

2023-04-10 Thread Mich Talebzadeh
> disclaimed. The author will in no case be liable for any monetary damages >>> arising from such loss, damage or destruction. >>> >>> >>> >>> >>> On Thu, 6 Apr 2023 at 16:40, Rajesh Katkar >>> wrote: >>> >>>> Use case

Re: spark streaming and kinesis integration

2023-04-10 Thread Mich Talebzadeh
>> >> On Thu, 6 Apr 2023 at 16:40, Rajesh Katkar >> wrote: >> >>> Use case is , we want to read/write to kinesis streams using k8s >>> Officially I could not find the connector or reader for kinesis from >>> spark like it has for kafka. >>>

Re: spark streaming and kinesis integration

2023-04-10 Thread Rajesh Katkar
le for any monetary damages arising from > such loss, damage or destruction. > > > > > On Thu, 6 Apr 2023 at 16:40, Rajesh Katkar > wrote: > >> Use case is , we want to read/write to kinesis streams using k8s >> Officially I could not find the connector or reader

Re: spark streaming and kinesis integration

2023-04-06 Thread Rajesh Katkar
Use case is, we want to read/write to kinesis streams using k8s. Officially I could not find the connector or reader for kinesis from spark like it has for kafka. Checking here if anyone used kinesis and spark streaming combination? On Thu, 6 Apr, 2023, 7:23 pm Mich Talebzadeh, wrote: >

RE: spark streaming and kinesis integration

2023-04-06 Thread Jonske, Kurt
kar Cc: u...@spark.incubator.apache.org Subject: Re: spark streaming and kinesis integration Do you have a high level diagram of the proposed solution? In so far as I know k8s does not support spark structured streaming? Mich Talebzadeh, Lead Solutions

Re: spark streaming and kinesis integration

2023-04-06 Thread Mich Talebzadeh
kinesis from spark > like it has for kafka. > > Checking here if anyone used kinesis and spark streaming combination ? > > On Thu, 6 Apr, 2023, 7:23 pm Mich Talebzadeh, > wrote: > >> Hi Rajesh, >> >> What is the use case for Kinesis here? I have not used it pe

Re: spark streaming and kinesis integration

2023-04-06 Thread Mich Talebzadeh
elying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction. On Thu, 6 Apr 2023 at 13:08, Rajesh Katkar wrote: > Hi Spark Team, > > We need to read/write the kinesis streams using

spark streaming and kinesis integration

2023-04-06 Thread Rajesh Katkar
Hi Spark Team, We need to read/write the kinesis streams using spark streaming. We checked the official documentation - https://spark.apache.org/docs/latest/streaming-kinesis-integration.html It does not mention kinesis connector. Alternative is - https://github.com/qubole/kinesis-sql which

Re: spark streaming with kinesis

2016-11-20 Thread Takeshi Yamamuro
16 at 1:59 PM, Shushant Arora <shushantaror...@gmail.com> wrote: > Hi > > Thanks. > Have a doubt on spark streaming kinesis consumer. Say I have a batch time > of 500 ms and kinesis stream is partitioned on userid (uniformly > distributed). But since IdleTimeBetweenReadsIn

Re: spark streaming with kinesis

2016-11-20 Thread Shushant Arora
Hi, thanks. Have a doubt on spark streaming kinesis consumer. Say I have a batch time of 500 ms and the kinesis stream is partitioned on userid (uniformly distributed). But since IdleTimeBetweenReadsInMillis is set to 1000 ms, Spark receiver nodes will fetch the data at an interval of 1 second and store
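
A rough way to picture the concern in this message is a toy timeline, sketched here under loud assumptions: the receiver fetches exactly once every 1000 ms, batches are cut every 500 ms, and block intervals, KCL internals and network delays are all ignored. The function name `batches_with_new_data` and its parameters are hypothetical and are not part of Spark or the KCL.

```python
def batches_with_new_data(batch_ms, fetch_ms, horizon_ms):
    """Return one flag per batch window: True if a fetch lands inside it.

    Toy model (an assumption, not Spark's actual scheduling): fetches happen
    at fetch_ms, 2 * fetch_ms, ... and each batch covers the half-open
    window [start, start + batch_ms).
    """
    fetch_times = range(fetch_ms, horizon_ms + 1, fetch_ms)
    flags = []
    for start in range(0, horizon_ms, batch_ms):
        end = start + batch_ms
        flags.append(any(start <= t < end for t in fetch_times))
    return flags


# 500 ms batches against a 1000 ms fetch interval, over 4 seconds.
flags = batches_with_new_data(batch_ms=500, fetch_ms=1000, horizon_ms=4000)
print(flags)
```

In this simplified model no window ever sees more than one fetch and at least every other window sees none, which is the mismatch the question is pointing at.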

Re: spark streaming with kinesis

2016-11-14 Thread Takeshi Yamamuro
Mon, Nov 14, 2016 at 5:43 PM, Takeshi Yamamuro <linguin@gmail.com >>> > wrote: >>> >>>> Hi, >>>> >>>> The time interval can be controlled by `IdleTimeBetweenReadsInMillis` >>>> in KinesisClientLibConfiguration though, >>

Re: spark streaming with kinesis

2016-11-14 Thread Shushant Arora
configurable in the current implementation. >>> >>> The detail can be found in; >>> https://github.com/apache/spark/blob/master/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala#L152 >>>

Re: spark streaming with kinesis

2016-11-14 Thread Takeshi Yamamuro
rable in the current implementation. >> >> The detail can be found in; >> https://github.com/apache/spark/blob/master/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala#L152 >> >

Re: spark streaming with kinesis

2016-11-14 Thread Shushant Arora
ntLibConfiguration > though, > it is not configurable in the current implementation. > > The detail can be found in; > https://github.com/apache/spark/blob/master/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala#L152 > > //

Re: spark streaming with kinesis

2016-11-14 Thread Takeshi Yamamuro
/streaming/kinesis/KinesisReceiver.scala#L152 // maropu On Sun, Nov 13, 2016 at 12:08 AM, Shushant Arora <shushantaror...@gmail.com> wrote: > Hi > > is spark.streaming.blockInterval for kinesis input stream is > hardcoded to 1 sec or is it configurable ? Time interval

spark streaming with kinesis

2016-11-12 Thread Shushant Arora
Hi, is spark.streaming.blockInterval for the kinesis input stream hardcoded to 1 sec, or is it configurable? That is the time interval at which the receiver fetches data from kinesis. It means the stream batch interval cannot be less than spark.streaming.blockInterval, and this should be configurable. Also is

Re: spark streaming with kinesis

2016-11-07 Thread Takeshi Yamamuro
afka in kinesis spark streaming? > > Is there any limitation on interval checkpoint - minimum of 1 second in > spark streaming with kinesis. But as such there is no limit on checkpoint > interval in KCL side ? > > Thanks > > On Tue, Oct 25, 2016 at 8:36 AM, Takeshi Yamam

Re: spark streaming with kinesis

2016-11-06 Thread Shushant Arora
Hi, by receiver I meant the spark streaming receiver architecture, meaning worker nodes are different from receiver nodes. Is there no direct consumer/low level consumer like that of Kafka in kinesis spark streaming? Is there any limitation on interval checkpoint - minimum of 1 second in spark streaming

Spark Streaming and Kinesis

2016-10-27 Thread Benjamin Kim
Has anyone worked with AWS Kinesis and retrieved data from it using Spark Streaming? I am having issues where it’s returning no data. I can connect to the Kinesis stream and describe using Spark. Is there something I’m missing? Are there specific IAM security settings needed? I just simply

Re: spark streaming with kinesis

2016-10-24 Thread Takeshi Yamamuro
heckpoint the sequence no using some api. > > > > On Tue, Oct 25, 2016 at 7:07 AM, Takeshi Yamamuro <linguin@gmail.com> > wrote: > >> Hi, >> >> The only thing you can do for Kinesis checkpoints is tune the interval of >> them. >> https://github.com/apach

Re: spark streaming with kinesis

2016-10-24 Thread Shushant Arora
> replicated across executors. > However, all the executors that have the replicated data crash, > IIUC the dataloss occurs. > > // maropu > > On Mon, Oct 24, 2016 at 4:43 PM, Shushant Arora <shushantaror...@gmail.com > > wrote: > >> Does spark streaming c

Re: spark streaming with kinesis

2016-10-24 Thread Takeshi Yamamuro
Hi, The only thing you can do for Kinesis checkpoints is tune the interval of them. https://github.com/apache/spark/blob/master/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala#L68 Whether the dataloss occurs or not depends on the storage level you set

spark streaming with kinesis

2016-10-24 Thread Shushant Arora
Does the spark streaming consumer for kinesis use the Kinesis Client Library, and does it mandate checkpointing the sequence number of shards in DynamoDB? Will it lead to data loss if consumed data records are not yet processed and kinesis checkpointed the consumed sequence numbers in DynamoDB and spark

Re: Spark streaming with Kinesis broken?

2015-12-11 Thread Nick Pentreath
0, canceled 0, ignored 0, pending 0 All tests passed. So this is a regression in Spark Streaming Kinesis 1.5.2 - @Brian can you file a JIRA for this? @dev-list, since KCL brings in AWS SDK dependencies itself, is it necessary to declare an explicit dependency on aws-java-sdk in the Kinesis POM

Re: Spark streaming with Kinesis broken?

2015-12-11 Thread Brian London
>>> - shutdown should checkpoint if the reason is TERMINATE >>> - shutdown should not checkpoint if the reason is something other than >>> TERMINATE >>> - retry success on first attempt >>> - retry success on second attempt after a Kinesis throttling exception >

Re: Spark streaming with Kinesis broken?

2015-12-11 Thread Brian London
a Kinesis throttling exception > - retry success on second attempt after a Kinesis dependency exception > - retry failed after a shutdown exception > - retry failed after an invalid state exception > - retry failed after unexpected exception > - retry failed after exhausing all retries > Ru

Re: Spark streaming with Kinesis broken?

2015-12-11 Thread Nick Pentreath
reason is TERMINATE >> - shutdown should not checkpoint if the reason is something other than >> TERMINATE >> - retry success on first attempt >> - retry success on second attempt after a Kinesis throttling exception >> - retry success on second attempt after a Kinesis

Re: Spark streaming with Kinesis broken?

2015-12-10 Thread Brian London
Nick's symptoms sound identical to mine. I should mention that I just pulled the latest version from github and it seems to be working there. To reproduce: 1. Download spark 1.5.2 from http://spark.apache.org/downloads.html 2. build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0

Re: Spark streaming with Kinesis broken?

2015-12-10 Thread Jean-Baptiste Onofré
Hi Nick, Just to be sure: don't you see some ClassCastException in the log ? Thanks, Regards JB On 12/10/2015 07:56 PM, Nick Pentreath wrote: Could you provide an example / test case and more detail on what issue you're facing? I've just tested a simple program reading from a dev Kinesis

Re: Spark streaming with Kinesis broken?

2015-12-10 Thread Brian London
Yes, it worked in the 1.6 branch as of commit db5165246f2888537dd0f3d4c5a515875c7358ed. That makes it much less serious of an issue, although it would be nice to know what the root cause is to avoid a regression. On Thu, Dec 10, 2015 at 4:03 PM Burak Yavuz wrote: > I've

Re: Spark streaming with Kinesis broken?

2015-12-10 Thread Burak Yavuz
I've noticed this happening when there were some dependency conflicts, and it is super hard to debug. It seems that the KinesisClientLibrary version in Spark 1.5.2 is 1.3.0, but it is 1.2.1 in Spark 1.5.1. I feel like that seems to be the problem... Brian, did you verify that it works with the

Re: Spark streaming with Kinesis broken?

2015-12-10 Thread Nick Pentreath
Yup also works for me on master branch as I've been testing DynamoDB Streams integration. In fact works with latest KCL 1.6.1 also which I was using. So the KCL version does seem like it could be the issue - somewhere along the line an exception must be getting swallowed. Though the tests

Re: Spark streaming with Kinesis broken?

2015-12-10 Thread Burak Yavuz
I don't think the Kinesis tests specifically ran when that was merged into 1.5.2 :( https://github.com/apache/spark/pull/8957 https://github.com/apache/spark/commit/883bd8fccf83aae7a2a847c9a6ca129fac86e6a3 AFAIK pom changes don't trigger the Kinesis tests. Burak On Thu, Dec 10, 2015 at 8:09 PM,

Spark streaming with Kinesis broken?

2015-12-10 Thread Brian London
Has anyone managed to run the Kinesis demo in Spark 1.5.2? The Kinesis ASL that ships with 1.5.2 appears to not work for me although 1.5.1 is fine. I spent some time with Amazon earlier in the week and the only thing we could do to make it work is to change the version to 1.5.1. Can someone

Re: Spark streaming with Kinesis broken?

2015-12-10 Thread Nick Pentreath
Yeah also the integration tests need to be specifically run - I would have thought the contributor would have run those tests and also tested the change themselves using live Kinesis :( On Fri, Dec 11, 2015 at 6:18 AM, Burak Yavuz wrote: > I don't

Re: Having problem with Spark streaming with Kinesis

2014-12-19 Thread Ashrafuzzaman
-kinesis-integration.html Here in the figure[spark streaming kinesis architecture], it seems like one node should be able to take on more than one shards. A.K.M. Ashrafuzzaman Lead Software Engineer NewsCred http://www.newscred.com/ (M) 880-175-5592433 Twitter https://twitter.com

Re: Having problem with Spark streaming with Kinesis

2014-12-14 Thread Aniket Bhatnagar
...@gmail.com wrote: Thanks Aniket, The trick is to have the #workers = #shards + 1. But I don’t know why is that. http://spark.apache.org/docs/latest/streaming-kinesis-integration.html Here in the figure[spark streaming kinesis architecture], it seems like one node should be able to take

Re: Having problem with Spark streaming with Kinesis

2014-12-13 Thread A.K.M. Ashrafuzzaman
Thanks Aniket, The trick is to have the #workers = #shards + 1. But I don’t know why is that. http://spark.apache.org/docs/latest/streaming-kinesis-integration.html Here in the figure[spark streaming kinesis architecture], it seems like one node should be able to take on more than one shards
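
One possible explanation for the "#workers = #shards + 1" observation, offered as an assumption rather than anything confirmed in this thread: a Spark Streaming receiver is a long-running task that occupies a core, so an application that creates one Kinesis receiver per shard needs at least one further core to process the received data. The helper name `min_cores` below is hypothetical and only illustrates the arithmetic.

```python
def min_cores(num_receivers, processing_cores=1):
    """Smallest core count that still leaves room for processing.

    Assumption (not stated in the thread): each receiver pins one core,
    and at least `processing_cores` extra cores are needed for the batches.
    """
    if num_receivers < 0 or processing_cores < 1:
        raise ValueError("need num_receivers >= 0 and processing_cores >= 1")
    return num_receivers + processing_cores


# One receiver per shard, as in the setup described above.
for shards in (1, 2, 4):
    print(f"{shards} shard(s) -> at least {min_cores(shards)} cores")
```

Under that assumption, a single receiver could still serve several shards, which would match the architecture figure mentioned in the message; the +1 rule only applies when receivers and shards are created one-to-one.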

Re: Having problem with Spark streaming with Kinesis

2014-12-03 Thread A.K.M. Ashrafuzzaman
...@gmail.com wrote: Hi guys, When we are using Kinesis with 1 shard then it works fine. But when we use more than 1 then it falls into an infinite loop and no data is processed by the spark streaming. In the kinesis dynamo DB, I can see that it keeps increasing the leaseCounter. But it do

Having problem with Spark streaming with Kinesis

2014-11-26 Thread A.K.M. Ashrafuzzaman
Hi guys, When we are using Kinesis with 1 shard then it works fine. But when we use more than 1 then it falls into an infinite loop and no data is processed by the spark streaming. In the kinesis dynamo DB, I can see that it keeps increasing the leaseCounter. But it do start processing. I am

Re: Having problem with Spark streaming with Kinesis

2014-11-26 Thread Akhil Das
. In the kinesis dynamo DB, I can see that it keeps increasing the leaseCounter. But it do start processing. I am using, scala: 2.10.4 java version: 1.8.0_25 Spark: 1.1.0 spark-streaming-kinesis-asl: 1.1.0 A.K.M. Ashrafuzzaman Lead Software Engineer NewsCred http://www.newscred.com/ (M

Re: Having problem with Spark streaming with Kinesis

2014-11-26 Thread Aniket Bhatnagar
loop and no data is processed by the spark streaming. In the kinesis dynamo DB, I can see that it keeps increasing the leaseCounter. But it do start processing. I am using, scala: 2.10.4 java version: 1.8.0_25 Spark: 1.1.0 spark-streaming-kinesis-asl: 1.1.0 A.K.M. Ashrafuzzaman Lead

Re: Having problem with Spark streaming with Kinesis

2014-11-26 Thread Aniket Bhatnagar
use more than 1 then it falls into an infinite loop and no data is processed by the spark streaming. In the kinesis dynamo DB, I can see that it keeps increasing the leaseCounter. But it do start processing. I am using, scala: 2.10.4 java version: 1.8.0_25 Spark: 1.1.0 spark-streaming

Spark Streaming with Kinesis

2014-10-29 Thread Harold Nguyen
Hi all, I followed the guide here: http://spark.apache.org/docs/latest/streaming-kinesis-integration.html But got this error: Exception in thread "main" java.lang.NoClassDefFoundError: com/amazonaws/auth/AWSCredentialsProvider Would you happen to know what dependency or jar is needed? Harold

Re: Spark Streaming with Kinesis

2014-10-29 Thread Harold Nguyen
Hi again, After getting through several dependencies, I finally got to this non-dependency type error: Exception in thread "main" java.lang.NoSuchMethodError:

Re: Spark Streaming with Kinesis

2014-10-29 Thread Matt Chu
I haven't tried this myself yet, but this sounds relevant: https://github.com/apache/spark/pull/2535 Will be giving this a try today or so, will report back. On Wednesday, October 29, 2014, Harold Nguyen har...@nexgate.com wrote: Hi again, After getting through several dependencies, I