…evolving. So if anyone is interested, please support the project.

--
Lingzhe Sun
Hirain Technologies

From: Mich Talebzadeh
Date: 2023-04-11 02:06
To: Rajesh Katkar
CC: user
Subject: Re: spark streaming and kinesis integration

What I said was this: "In so far as I know k8s does not support spark structured streaming?"

So it is an open…
Use case is, we want to read/write to kinesis streams using k8s.
Officially I could not find the connector or reader for kinesis from spark like it has for kafka.
Checking here if anyone used kinesis and spark streaming combination?

On Thu, 6 Apr, 2023, 7:23 pm Mich Talebzadeh wrote:

Hi Rajesh,

What is the…
To: Rajesh Katkar
Cc: u...@spark.incubator.apache.org
Subject: Re: spark streaming and kinesis integration

Do you have a high level diagram of the proposed solution?

In so far as I know k8s does not support spark structured streaming?

Mich Talebzadeh
Hi Rajesh,

What is the use case for Kinesis here? I have not used it personally…
On Thu, 6 Apr 2023 at 13:08, Rajesh Katkar wrote:
Hi Spark Team,

We need to read/write the kinesis streams using spark streaming.

We checked the official documentation:
https://spark.apache.org/docs/latest/streaming-kinesis-integration.html
It does not mention a kinesis connector. An alternative is
https://github.com/qubole/kinesis-sql, which…
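For context, the official module that does exist for the older DStream API is spark-streaming-kinesis-asl. A minimal sketch of reading a stream with it follows, using the Spark 2.x-era builder API; the stream name, app name, region, and endpoint below are placeholder values, not anything from this thread:

```scala
// Sketch: DStream-based Kinesis reader via the spark-streaming-kinesis-asl module.
// All names/regions are illustrative placeholders.
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kinesis.{KinesisInputDStream, KinesisInitialPositions}

object KinesisReadSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kinesis-sketch")
    val ssc  = new StreamingContext(conf, Seconds(1))

    val stream = KinesisInputDStream.builder
      .streamingContext(ssc)
      .streamName("my-stream")                                   // placeholder
      .endpointUrl("https://kinesis.us-east-1.amazonaws.com")
      .regionName("us-east-1")
      .initialPosition(new KinesisInitialPositions.Latest())
      .checkpointAppName("kinesis-sketch")                       // becomes the DynamoDB lease table name
      .checkpointInterval(Seconds(10))
      .storageLevel(StorageLevel.MEMORY_AND_DISK_2)
      .build()

    // Records arrive as raw bytes; decode as needed.
    stream.map(bytes => new String(bytes, "UTF-8")).print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note this is the receiver-based DStream path only; as the thread says, Structured Streaming has no built-in Kinesis source, which is why third-party connectors such as qubole/kinesis-sql come up.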
…16 at 1:59 PM, Shushant Arora <shushantaror...@gmail.com> wrote:
Hi,

Thanks.

Have a doubt on the spark streaming kinesis consumer. Say I have a batch time of 500 ms and the kinesis stream is partitioned on userid (uniformly distributed). But since IdleTimeBetweenReadsInMillis is set to 1000 ms, the Spark receiver nodes will fetch the data at an interval of 1 second and store…
On Mon, Nov 14, 2016 at 5:43 PM, Takeshi Yamamuro <linguin@gmail.com> wrote:

Hi,

The time interval can be controlled by `IdleTimeBetweenReadsInMillis` in KinesisClientLibConfiguration though, it is not configurable in the current implementation.

The detail can be found in:
https://github.com/apache/spark/blob/master/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala#L152

// maropu

On Sun, Nov 13, 2016 at 12:08 AM, Shushant Arora <shushantaror...@gmail.com> wrote:
Hi,

Is spark.streaming.blockInterval for a kinesis input stream hardcoded to 1 sec, or is it configurable? That is the time interval at which the receiver fetches data from kinesis.

It means the stream batch interval cannot be less than spark.streaming.blockInterval, and this should be configurable. Also is…
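As Takeshi notes above, Spark's built-in receiver does not expose this knob. Outside Spark, in a hand-rolled KCL 1.x consumer, the fetch interval is tunable. A sketch under that assumption (app/stream/worker names are illustrative only):

```scala
// Sketch: tuning the read-idle interval in a standalone KCL 1.x consumer,
// which Spark's bundled receiver does not allow. Names are placeholders.
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisClientLibConfiguration

val kclConf = new KinesisClientLibConfiguration(
    "my-app",        // application name -> also names the DynamoDB lease table
    "my-stream",     // Kinesis stream name
    new DefaultAWSCredentialsProviderChain(),
    "worker-1")      // worker id
  .withIdleTimeBetweenReadsInMillis(250L)  // default is 1000 ms, per the thread
```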
Hi,

By receiver I meant the spark streaming receiver architecture, meaning worker nodes are different from receiver nodes. There is no direct consumer/low-level consumer like Kafka's in kinesis spark streaming?

Is there any limitation on checkpoint interval, a minimum of 1 second, in spark streaming with kinesis? But as such there is no limit on checkpoint interval on the KCL side?

Thanks

On Tue, Oct 25, 2016 at 8:36 AM, Takeshi Yamamuro wrote:
Has anyone worked with AWS Kinesis and retrieved data from it using Spark Streaming? I am having issues where it's returning no data. I can connect to the Kinesis stream and describe it using Spark. Is there something I'm missing? Are there specific IAM security settings needed? I just simply…
…checkpoint the sequence no using some api.

On Tue, Oct 25, 2016 at 7:07 AM, Takeshi Yamamuro <linguin@gmail.com> wrote:
Hi,

The only thing you can do for Kinesis checkpoints is tune their interval:
https://github.com/apache/spark/blob/master/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala#L68

Whether dataloss occurs or not depends on the storage level you set; received data is replicated across executors. However, if all the executors that hold the replicated data crash, IIUC dataloss occurs.

// maropu

On Mon, Oct 24, 2016 at 4:43 PM, Shushant Arora <shushantaror...@gmail.com> wrote:
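The two knobs Takeshi describes — checkpoint interval and storage level — both appear as parameters in the older KinesisUtils API. A sketch of that signature as it stood in the kinesis-asl module of that era (app/stream names and values are placeholders):

```scala
// Sketch: the KCL checkpoint interval and the storage level are both
// arguments to KinesisUtils.createStream. Names/values are illustrative.
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kinesis.KinesisUtils
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream

def buildStream(ssc: StreamingContext) =
  KinesisUtils.createStream(
    ssc,
    "my-app",                                   // KCL app name (DynamoDB lease table)
    "my-stream",
    "https://kinesis.us-east-1.amazonaws.com",
    "us-east-1",
    InitialPositionInStream.LATEST,
    Seconds(5),                                 // KCL checkpoint interval -- the tunable part
    StorageLevel.MEMORY_AND_DISK_2)             // _2 replicates blocks, guarding against a single executor loss
```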
Does the spark streaming consumer for kinesis use the Kinesis Client Library and mandate checkpointing the sequence number of shards in DynamoDB?

Will it lead to dataloss if consumed data records are not yet processed but kinesis has checkpointed the consumed sequence numbers in DynamoDB and spark…
…0, canceled 0, ignored 0, pending 0
All tests passed.

So this is a regression in Spark Streaming Kinesis 1.5.2 - @Brian, can you file a JIRA for this?

@dev-list, since KCL brings in AWS SDK dependencies itself, is it necessary to declare an explicit dependency on aws-java-sdk in the Kinesis POM…
- shutdown should checkpoint if the reason is TERMINATE
- shutdown should not checkpoint if the reason is something other than TERMINATE
- retry success on first attempt
- retry success on second attempt after a Kinesis throttling exception
- retry success on second attempt after a Kinesis dependency exception
- retry failed after a shutdown exception
- retry failed after an invalid state exception
- retry failed after unexpected exception
- retry failed after exhausting all retries
Nick's symptoms sound identical to mine. I should mention that I just
pulled the latest version from github and it seems to be working there. To
reproduce:
1. Download spark 1.5.2 from http://spark.apache.org/downloads.html
2. build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0
Hi Nick,

Just to be sure: don't you see some ClassCastException in the log?

Thanks,
Regards
JB
On 12/10/2015 07:56 PM, Nick Pentreath wrote:
Could you provide an example / test case and more detail on what issue
you're facing?
I've just tested a simple program reading from a dev Kinesis
Yes, it worked in the 1.6 branch as of commit
db5165246f2888537dd0f3d4c5a515875c7358ed. That makes it much less serious
of an issue, although it would be nice to know what the root cause is to
avoid a regression.
On Thu, Dec 10, 2015 at 4:03 PM Burak Yavuz wrote:
I've noticed this happening when there were some dependency conflicts, and it is super hard to debug.

It seems that the KinesisClientLibrary version in Spark 1.5.2 is 1.3.0, but it is 1.2.1 in Spark 1.5.1. I feel like that seems to be the problem...

Brian, did you verify that it works with the…
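One way to test Burak's version-conflict theory, assuming an sbt build (the fragment below is hypothetical, with versions taken only from the thread): pin the KCL back to the 1.5.1-era release and see whether the behaviour changes.

```scala
// Hypothetical build.sbt fragment to probe the suspected KCL conflict.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming-kinesis-asl" % "1.5.2",
  // Force the KCL back to the version Spark 1.5.1 shipped with:
  "com.amazonaws" % "amazon-kinesis-client" % "1.2.1" force()
)
```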
Yup, also works for me on the master branch, as I've been testing DynamoDB Streams integration. In fact it works with the latest KCL 1.6.1 also, which I was using. So the KCL version does seem like it could be the issue - somewhere along the line an exception must be getting swallowed. Though the tests…

I don't think the Kinesis tests specifically ran when that was merged into 1.5.2 :(
https://github.com/apache/spark/pull/8957
https://github.com/apache/spark/commit/883bd8fccf83aae7a2a847c9a6ca129fac86e6a3
AFAIK pom changes don't trigger the Kinesis tests.

Burak

On Thu, Dec 10, 2015 at 8:09 PM,…
Has anyone managed to run the Kinesis demo in Spark 1.5.2? The Kinesis ASL
that ships with 1.5.2 appears to not work for me although 1.5.1 is fine. I
spent some time with Amazon earlier in the week and the only thing we could
do to make it work is to change the version to 1.5.1. Can someone
Yeah also the integration tests need to be specifically run - I would have
thought the contributor would have run those tests and also tested the change
themselves using live Kinesis :(
Thanks Aniket,

The trick is to have #workers = #shards + 1. But I don't know why that is.

http://spark.apache.org/docs/latest/streaming-kinesis-integration.html
Here in the figure [spark streaming kinesis architecture], it seems like one node should be able to take on more than one shard.

A.K.M. Ashrafuzzaman
Lead Software Engineer
NewsCred http://www.newscred.com/
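A plausible reading of the "#workers = #shards + 1" rule of thumb, sketched as plain arithmetic (this is an illustration of the receiver-based model, not a Spark API): each active receiver pins one core for its lifetime, so at least one core beyond the receivers must remain free for batch processing to make progress.

```scala
// Illustrative only: rough capacity check for the receiver-based model.
// Cores left for actual batch processing = totalCores - numReceivers.
case class ClusterPlan(totalCores: Int, numReceivers: Int) {
  def processingCores: Int = totalCores - numReceivers
  def viable: Boolean = processingCores >= 1
}

// A 4-shard stream read with 4 receivers on 4 cores stalls (0 cores left);
// with 5 cores it can make progress -- hence "#workers = #shards + 1".
assert(!ClusterPlan(totalCores = 4, numReceivers = 4).viable)
assert(ClusterPlan(totalCores = 5, numReceivers = 4).viable)
```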
Hi guys,

When we are using Kinesis with 1 shard then it works fine. But when we use more than 1 then it falls into an infinite loop and no data is processed by the spark streaming. In the kinesis dynamo DB, I can see that it keeps increasing the leaseCounter. But it does not start processing.

I am using,
scala: 2.10.4
java version: 1.8.0_25
Spark: 1.1.0
spark-streaming-kinesis-asl: 1.1.0

A.K.M. Ashrafuzzaman
Lead Software Engineer
NewsCred http://www.newscred.com/
Hi all,

I followed the guide here:
http://spark.apache.org/docs/latest/streaming-kinesis-integration.html

But got this error:
Exception in thread "main" java.lang.NoClassDefFoundError: com/amazonaws/auth/AWSCredentialsProvider

Would you happen to know what dependency or jar is needed?

Harold
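The missing class, AWSCredentialsProvider, lives in the AWS Java SDK, which normally arrives transitively through the kinesis-asl module. A hedged sketch of the sbt dependencies that would address this (the SDK version below is a guess and should match whatever your Spark release's kinesis-asl pom pulls in):

```scala
// Hypothetical build.sbt fragment; versions keyed to the Spark 1.1.0 in this thread.
libraryDependencies += "org.apache.spark" %% "spark-streaming-kinesis-asl" % "1.1.0"
// If the class is still missing at runtime, the SDK can be declared explicitly
// (version is an assumption -- align it with the kinesis-asl pom):
libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.8.3"
```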
Hi again,

After getting through several dependencies, I finally got to this non-dependency type error:
Exception in thread "main" java.lang.NoSuchMethodError:…
I haven't tried this myself yet, but this sounds relevant:
https://github.com/apache/spark/pull/2535
Will be giving this a try today or so, will report back.
On Wednesday, October 29, 2014, Harold Nguyen har...@nexgate.com wrote: