[ANNOUNCE] Call for Speakers is open for Current 2022: The Next Generation of Kafka Summit

2022-05-24 Thread Robin Moffatt
Hi everyone,

We’re very excited to announce our Call for Speakers for Current 2022: The
Next Generation of Kafka Summit!

With the permission of the ASF, Current will include Kafka Summit as part
of the event.

We’re looking for talks about all aspects of event-driven design, streaming
technology, and real-time systems. Think about Apache Kafka® and similar
technologies, and work outwards from there. Whether it’s a data engineering
talk with real-time data, software engineering with message brokers, or
event-driven architectures—if there’s data in motion, then it’s going to be
relevant.

The talk tracks are as follows:

- Developing Real-Time Applications
- Streaming Technologies
- Fun and Geeky
- Architectures You’ve Always Wondered About
- People & Culture
- Data Development Life Cycle (including SDLC for data, data mesh,
governance, schemas)
- Case Studies
- Operations and Observability
- Pipelines Done Right
- Real-Time Analytics
- Event Streaming in Academia and Beyond

You can find the call for speakers at https://sessionize.com/current-2022/,
and a blog detailing the process (and some tips especially for new
speakers) at
https://www.confluent.io/blog/how-to-be-a-speaker-at-current-2022-the-next-kafka-summit/.
If you have any questions about submitting I would be pleased to answer
them - you can contact me directly at ro...@confluent.io.

The call for speakers closes on June 26 at 23:59 CT.

Thanks,

Robin Moffatt
Program Committee Chair


Re: Kafka Question

2022-05-03 Thread Robin Moffatt
Kafka itself includes Kafka Streams (
https://kafka.apache.org/31/documentation/streams/), so you can do this
processing in Kafka. There's a Filter transformation that would be a good
place to start:
https://kafka.apache.org/31/documentation/streams/developer-guide/dsl-api.html#stateless-transformations
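
To make that concrete for the temperature example further down this thread, here's a rough
sketch of a Kafka Streams filter (topic names and thresholds are hypothetical, and the actual
email/SMS sending would live in whatever consumes the alerts topic):

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class TemperatureAlerts {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "temperature-alerts");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Assumes the message value is the temperature as a plain string;
        // a real application would use a serde matching its record format.
        KStream<String, String> readings = builder.stream("temperature-readings");
        readings.filter((sensorId, value) -> {
                    double temp = Double.parseDouble(value);
                    return temp > 80.0 || temp < 50.0; // outside the control limits
                })
                .to("temperature-alerts");

        new KafkaStreams(builder.build(), props).start();
    }
}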


-- 

Robin Moffatt | Principal Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 3 May 2022 at 03:09, Liam Clarke-Hutchinson 
wrote:

> Hi Emily,
>
> Nope, Kafka doesn't have that capability built in, it's just a distributed
> log that's great for streaming events. However, you can easily write a
> program that consumes those events from Kafka and then does what you want
> :)
>
> Cheers,
>
> Liam
>
> On Tue, 3 May 2022 at 06:30, Emily Schepisi
>  wrote:
>
> > Hello,
> >
> > I have a question about Kafka. If I put an upper and lower control limit
> on
> > the data, and the log records an event where the upper or lower control
> > limit is breached, will Kafka be able to send a notification via email or
> > text message to the user?
> >
> > Example: I'm tracking the daily temperature and set the upper control
> limit
> > at 80 degrees and the lower control limit at 50 degrees. The event log on
> > Kafka recorded the temperature on Monday at 90 degrees, so it's higher
> than
> > the upper control limit. Does Kafka have the capability to send a text
> > message or email to let me know that the temperature is outside of the
> > control limit?
> >
> > Thank you,
> >
> > Emily
> >
>


Re: Kafka Connect - offset.storage.topic reuse across clusters

2022-03-30 Thread Robin Moffatt
Hi Jordan,

Is there a good reason for wanting to do this? I can think of multiple
reasons why you shouldn't, even if technically it works in some cases.
Or is it just curiosity as to whether you can/should?

thanks, Robin.


-- 

Robin Moffatt | Principal Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 30 Mar 2022 at 13:36, Jordan Wyatt  wrote:

> Hi,
>
> I've recently been experimenting with setting the values of the `offset`,
> `config` and `status` storage topics within Kafka Connect.
>
> I'm aware from various sources (Robin Moffatt blogs, StackOverflow,
> Confluent Kafka Connect docs) that these topics should not be shared across
> different connect **clusters**.  e.g for each  unique set of workers with a
> given `group.id`, a unique set of internal storage topics should be used.
>
> These discussions and documentations usually talk about sharing all three
> topics at once, however, I am interested in reusing only the offset storage
> topic. I am struggling to find the risks of sharing this offset topic
> between different connect clusters.
>
> I'm aware of issues with sharing the config and status topics from blogs
> and my own testing (clusters can end up running connectors from other
> clusters, for example), but I cannot find a case for not sharing the offset
> topic despite guidance to avoid this.
>
> The use cases I am interested in are:
>
> 1. Sharing an offset topic between clusters, but never in parallel.
>
>
> *e.g cluster 1 running connector A uses the offset topic, cluster 1 and
> connector A are deleted, then cluster 2 running connector B is created uses
> the offset topic. *
>
> 2. As above, but using the offset topic in parallel.
>
> As the offset.storage topic is keyed by connector name (from the source
> connectors I've tried) I do not understand the risk of both of the above
> cases **unless** > 1  connector exists with the same name in separate
> clusters, as there would then be the risk of key collision as group.id is
> not referenced in the offset topic keys.
>
> Any insights into why sharing the offset topic between clusters for the
> cases described would be greatly appreciated, thank you.
>


Re: using cluster Kafka with a F5 Vip

2022-02-22 Thread Robin Moffatt
I *think* it should work - you just need to get your advertised.listeners
set up correctly. I wrote a blog about this that should help you understand
the config more:
www.confluent.io/blog/kafka-client-cannot-connect-to-broker-on-aws-on-docker-etc/
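
The crux is that each broker's advertised.listeners must return an address that the client
(Filebeat, in your case) can resolve and reach for that specific broker - advertising one
shared VIP address for every broker is what usually breaks this. A rough sketch of one
broker's server.properties, with hypothetical listener names, hosts, and ports:

listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9093
advertised.listeners=INTERNAL://broker1.internal:9092,EXTERNAL://broker1.via-f5.example.com:9093
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:SSL
inter.broker.listener.name=INTERNAL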


-- 

Robin Moffatt | Staff Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 22 Feb 2022 at 19:54, DESSEAUX Samuel (Gaz Réseau Distribution
France)  wrote:

> Hello
>
> My need may be tricky, but I have the following problem.
>
> I have a Kafka cluster behind an F5 VIP, and my Filebeat collector has to
> send data to topics.
>
> At the moment I can't write anything to the topics.
> To give more details, the main goal of the VIP is to secure connections
> between the servers and the VIP.
>
> So, is it really possible to do this architecture?
> If yes, how can I do it?
>
> Best regards
>
> Samuel Desseaux
>


Kafka Summit London 2022 - Call for Papers closes soon

2021-12-07 Thread Robin Moffatt
Kafka Summit[1] is the fandabidoziest[2] conference dedicated to
Apache Kafka® and event streaming. The Call for Papers[3] (CfP) closes
in less than two weeks - and we would love to hear from Kafka users,
architects, operators, and anyone with an interesting Kafka story to
tell.

To get an idea of the kind of talks that Kafka Summit audiences enjoy
check out the programmes from previous Summits[4]. Whether it’s tales
from the trenches, abstract architectural anecdotes, or scintillating
stories of streams, we want to hear from 👉️ YOU 👈️.

We’re keen for involvement from everyone in the Kafka community - you
don’t need to be a seasoned speaker to submit. In fact, we would love
to hear from more first-time speakers.

👉️ To support speakers both old and new we’ve got resources 📕[5]
🎥[6] on writing a good abstract, and would be delighted to offer help
reviewing it before you submit - we’re holding an “office hours” next
week [7]. If your talk is accepted and you’d like help preparing and
rehearsing your talk we’d be happy to help support that too.

Remember: the CfP closes in less than two weeks (2021-12-20 at 23:59
GMT) - so submit without delay! If you have any questions, please do
feel free to contact me directly - ro...@confluent.io.

thanks,
Robin Moffatt

P.S. Here’s a little-known fact about CfPs: at this stage, you don’t
need to have written the talk itself - just the abstract. If your talk
is accepted then you need to write it :)


[1] https://www.kafka-summit.org/
[2] https://www.collinsdictionary.com/dictionary/english/fandabidozi
[3] https://sessionize.com/kafka-summit-london-2022/
[4] https://www.kafka-summit.org/past-events
[5] 
https://rmoff.net/2020/01/16/how-to-win-or-at-least-not-suck-at-the-conference-abstract-submission-game/
[6] https://www.youtube.com/watch?v=N0g3QoCuqH4
[7] 
https://dev.to/confluentinc/kafka-summit-office-hours-for-abstract-writing-2kgc


Kafka Summit is this week

2021-09-13 Thread Robin Moffatt
Kafka Summit Americas 2021 starts tomorrow - register for free at
https://www.myeventi.events/kafka21/na/ and tune in for tons of
great Apache Kafka content!


-- 

Robin Moffatt | Staff Developer Advocate | ro...@confluent.io | @rmoff


Re: [ANNOUNCE] New Kafka PMC Member: Konstantine Karantasis

2021-06-22 Thread Robin Moffatt
Congratulations Konstantine!


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Mon, 21 Jun 2021 at 16:28, Mickael Maison  wrote:

> Hi,
>
> It's my pleasure to announce that Konstantine Karantasis is now a
> member of the Kafka PMC.
>
> Konstantine has been a Kafka committer since Feb 2020. He has remained
> active in the community since becoming a committer.
>
> Congratulations Konstantine!
>
> Mickael, on behalf of the Apache Kafka PMC
>


Re: Mirror Maker2 Event filter capability from topic before replication

2021-05-07 Thread Robin Moffatt
There's also this
https://forum.confluent.io/t/kafka-connect-jmespath-expressive-content-based-record-filtering/1104
which is worth checking out - unfortunately I only came across it after
recording that video, but it looks like a useful predicate implementation.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 7 May 2021 at 11:30, Tom Bentley  wrote:

> Just to be clear: It's only the Predicate implementation for filtering on a
> JSON key that you'd need to write. MM2 already supports SMTs and
> predicates.
>
> Alternatively
>
> https://rmoff.net/2020/12/22/twelve-days-of-smt-day-11-predicate-and-filter/#_filtering_based_on_the_contents_of_a_message
> might work for you, but it's not an Apache Kafka feature currently. I'm not
> aware that anyone is currently planning on contributing a JSONPath-based
> Predicate to Apache Kafka.
>
> On Fri, May 7, 2021 at 10:06 AM Anup Tiwari 
> wrote:
>
> > Hi Tom,
> >
> > Thanks for quick reply. As per your comments, it seems we will have to
> > build something of our own together to work with MM2.
> > So just wanted to confirm that it is not an inbuilt feature.. right? Like
> > confluent replicator provide this.
> > Also are we planning to add this in near future?
> >
> >
> > On Fri, 7 May 2021 13:33 Tom Bentley,  wrote:
> >
> > > Hi Anup,
> > >
> > > This should be possible using the Filter SMT in Kafka Connect together
> > with
> > > a custom Predicate which you'd have to write yourself and provide on
> the
> > > plugin path.
> > >
> > > See http://kafka.apache.org/documentation.html#connect_predicates and
> > >
> > >
> >
> https://rmoff.net/2020/12/22/twelve-days-of-smt-day-11-predicate-and-filter/
> > > for info about Filter and predicates, and
> > >
> > >
> >
> https://stackoverflow.com/questions/61742209/smt-does-not-work-when-added-to-connect-mirror-maker-properties-mirrormaker-2
> > > for a MM2-specific example of SMTs
> > >
> > > Kind regards,
> > >
> > > Tom
> > >
> > > On Fri, May 7, 2021 at 8:49 AM Anup Tiwari 
> > wrote:
> > >
> > > > Hi Team,
> > > >
> > > > I have a topic which contains JSON data and multiple user click
> > events. I
> > > > just wanted to know if we can filter out events based on keys of JSON
> > > > before replicating it to some other kafka.
> > > >
> > > > Regards,
> > > > Anup Tiwari
> > > >
> > >
> >
>


Re: [ANNOUNCE] New Kafka PMC Member: Randall Hauch

2021-04-18 Thread Robin Moffatt
Great news, congratulations Randall!


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Sat, 17 Apr 2021 at 00:45, Matthias J. Sax  wrote:

> Hi,
>
> It's my pleasure to announce that Randall Hauch in now a member of the
> Kafka PMC.
>
> Randall has been a Kafka committer since Feb 2019. He has remained
> active in the community since becoming a committer.
>
>
>
> Congratulations Randall!
>
>  -Matthias, on behalf of Apache Kafka PMC
>


Re: Confluence Control Center On Windows 10

2021-03-26 Thread Robin Moffatt
Satendra - there was a new blog published just today about
running Confluent Control Center on Windows:
https://www.confluent.io/blog/set-up-and-run-kafka-on-windows-and-wsl-2


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 26 Mar 2021 at 10:07, Robin Moffatt  wrote:

> You can use WSL2:
> https://www.confluent.io/blog/set-up-and-run-kafka-on-windows-linux-wsl-2
>
>
> --
>
> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>
>
> On Fri, 26 Mar 2021 at 05:09, Satendra Negi 
> wrote:
>
>> Hello Guys,
>>
>> Is there any way to run the confluent kafka control center on windows ?
>> the
>> binaries provided by confluent seems not to support the windows 10 by
>> default.
>>
>> Thanks in advance.
>>
>> --
>>
>>
>>
>> *Thanks & Regards*
>>
>


Re: Confluence Control Center On Windows 10

2021-03-26 Thread Robin Moffatt
You can use WSL2:
https://www.confluent.io/blog/set-up-and-run-kafka-on-windows-linux-wsl-2


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 26 Mar 2021 at 05:09, Satendra Negi 
wrote:

> Hello Guys,
>
> Is there any way to run the confluent kafka control center on windows ? the
> binaries provided by confluent seems not to support the windows 10 by
> default.
>
> Thanks in advance.
>
> --
>
>
>
> *Thanks & Regards*
>


Re: Kafka Connect Distributed Mode Doubt

2021-03-23 Thread Robin Moffatt
1. Kafka Connect standalone workers have their connectors configured based
on properties file(s) passed on the command line at startup. You cannot use
REST to add or remove them
2. Correct, Standalone workers are isolated instances that cannot share
load with other workers
3. Correct, Distributed workers in a cluster will distribute connector
tasks across the available workers and rebalance on the loss of a worker
4. Correct, in Distributed mode you have to use the REST interface (see
https://rmoff.dev/kafka-connect-rest-api), you cannot use properties file
to configure connectors. n.b. You still have a properties file to configure
the worker itself.
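
For point 4, a rough sketch of creating (or updating) a connector over REST in distributed
mode - the worker address and the connector config here are just hypothetical examples:

curl -X PUT http://localhost:8083/connectors/my-sink/config \
  -H "Content-Type: application/json" \
  -d '{
        "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
        "topics": "my_topic",
        "file": "/tmp/my_topic.txt",
        "tasks.max": "1"
      }'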


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 23 Mar 2021 at 02:02, Himanshu Shukla 
wrote:

> Hi All,
>
> I am having these below understanding regarding Kafka connect.
>
> 1. Kafka Connect Standalone has the provision of either running the job
> from the command line or we can use the REST interface also to
> add/update/delete a connector job.
>
> 2. Standalone mode won't be running like a clustered environment like it
> won't be sharing the load if more than one instance is running. Each
> Instance will be responsible for it's own configured connector job and load
> sharing won't happen.
>
> 3. Distributed mode will share the load among the instances and will
> provide fault tolerance, dynamic scaling, etc.
>
> 4. Distribute mode only provides the connector job configuration through
> the REST interface. There is no other option like reading the connector job
> config from the property file or reading it from JDBC etc.
>
> Please confirm, if the above understanding is correct.
>
> --
> Regards,
> Himanshu Shukla
>


Re: Docker images

2021-03-16 Thread Robin Moffatt
There is a similar discussion about Kafka Connect images here:
https://issues.apache.org/jira/browse/KAFKA-9774


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 16 Mar 2021 at 16:04, Francois Papon 
wrote:

> Hi,
>
> It would be nice to push an official version on
> https://hub.docker.com/u/apache <https://hub.docker.com/u/apache> :)
>
> regards,
>
> François
> fpa...@apache.org
>
> Le 16/03/2021 à 15:34, Joris Peeters a écrit :
> > There's an official Confluent version:
> > https://hub.docker.com/r/confluentinc/cp-kafka/
> >
> > On Tue, Mar 16, 2021 at 2:24 PM Otar Dvalishvili 
> > wrote:
> >
> >> Greetings,
> >> Why there are no official Kafka Docker images?
> >>
>


Re: [ANNOUNCE] New committer: Tom Bentley

2021-03-15 Thread Robin Moffatt
Congratulations Tom!


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Mon, 15 Mar 2021 at 18:00, Mickael Maison  wrote:

> Hi all,
>
> The PMC for Apache Kafka has invited Tom Bentley as a committer, and
> we are excited to announce that he accepted!
>
> Tom first contributed to Apache Kafka in June 2017 and has been
> actively contributing since February 2020.
> He has accumulated 52 commits and worked on a number of KIPs. Here are
> some of the most significant ones:
>KIP-183: Change PreferredReplicaLeaderElectionCommand to use AdminClient
>KIP-195: AdminClient.createPartitions
>KIP-585: Filter and Conditional SMTs
>KIP-621: Deprecate and replace DescribeLogDirsResult.all() and .values()
>KIP-707: The future of KafkaFuture (still in discussion)
>
> In addition, he is very active on the mailing list and has helped
> review many KIPs.
>
> Congratulations Tom and thanks for all the contributions!
>


Re: Kafka Connectors output to topic.

2021-03-11 Thread Robin Moffatt
For what it's worth, that sounds less like a Kafka Connector and more like
a Kafka Streams app.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 11 Mar 2021 at 18:27, Nick Siviglia  wrote:

> Hi Everyone,
>
> I'd like to create a Kafka connector that instead of acting like a source
> or sink will instead do some processing on the data and output to another
> kafka topic. Has anyone done this before? Does anyone see any potential
> drawbacks?
>
> Data is json format at around 150 string and number fields per object. And
> I'm planning on receiving about 2 million a day.
>
> Thanks for any help,
> Nick
>


Re: Filter Kafka Messages

2021-03-10 Thread Robin Moffatt
You can find details of the Filter function and its use here:
https://rmoff.net/2020/12/22/twelve-days-of-smt-day-11-predicate-and-filter/

Something I need to add to this page is a new JMESPath-based Predicate that
I found recently which looks really useful. The author wrote details and
example of it here:
https://forum.confluent.io/t/kafka-connect-jmespath-expressive-content-based-record-filtering/1104
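
To give an idea of the shape of it, a connector config using the built-in Filter SMT with one
of the bundled predicates looks roughly like this (the transform/predicate names and topic
pattern are hypothetical; filtering on message *content*, as you describe, needs a custom
predicate or the JMESPath one above):

"transforms": "dropDiscards",
"transforms.dropDiscards.type": "org.apache.kafka.connect.transforms.Filter",
"transforms.dropDiscards.predicate": "isDiscardTopic",
"predicates": "isDiscardTopic",
"predicates.isDiscardTopic.type": "org.apache.kafka.connect.transforms.predicates.TopicNameMatches",
"predicates.isDiscardTopic.pattern": "transactions\\.discard\\..*"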


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 10 Mar 2021 at 09:55, raman gugnani 
wrote:

> HI Tom,
>
> For which connector class connector do I need to submit ?
> Can you share some samples for the same?
>
> On Wed, 10 Mar 2021 at 14:32, Tom Bentley  wrote:
>
> > Kafka Connect has the Filter SMT which can be used with a predicate to
> > filter out certain messages, see
> > https://kafka.apache.org/documentation/#connect_predicates
> >
> > Kinds regards,
> >
> > Tom
> >
> > On Wed, Mar 10, 2021 at 8:46 AM raman gugnani <
> ramangugnani@gmail.com>
> > wrote:
> >
> > > HI Team,
> > >
> > > I want to filter out some of the kafka messages from source kafka
> cluster
> > > to destination kafka cluster via some logic.
> > >
> > > Logics to filter
> > >
> > >1. If possible to filter data on the basis of static data.
> > >2. If possible, filter data dynamically based on a custom provider
> > (some
> > >jar or some other way around).
> > >
> > >
> > > Looking to do the same via any of the below features
> > >
> > >1. Kafka mirror
> > >2. Kafka Connect
> > >3. Kafka Streams.
> > >
> > > --
> > > Raman Gugnani
> > >
> >
>
>
> --
> Raman Gugnani
>


Loading delimited data into Apache Kafka - quick & dirty (but effective)

2021-03-01 Thread Robin Moffatt
I wrote a blog about loading data into Apache Kafka using kafkacat. It's
kinda hacky, but it works and was an interesting journey through some other
bash tools too :D

https://rmoff.net/2021/02/26/loading-delimited-data-into-kafka-quick-dirty-but-effective/
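
The gist of it is piping the delimited data straight into kafkacat in producer mode. A rough
sketch, with a hypothetical file, topic, and broker (the post itself has the full treatment):

awk -F',' '{print $1"|"$0}' orders.csv | \
  kafkacat -b localhost:9092 -t orders -P -K '|'

This uses the first CSV column as the message key and the whole line as the value.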


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


Re: Kafka Security Questions

2021-02-16 Thread Robin Moffatt
You can read about the security available in Apache Kafka here:
https://kafka.apache.org/documentation/#security_ssl
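
To give a flavour of the first point (encryption in transit), a client connecting to a
broker's SSL listener would use properties along these lines - a sketch with hypothetical
hostnames and paths:

security.protocol=SSL
bootstrap.servers=broker1.example.com:9093
ssl.truststore.location=/etc/kafka/secrets/client.truststore.jks
ssl.truststore.password=changeit
# For mutual TLS (the broker authenticating the client), add a keystore:
# ssl.keystore.location=/etc/kafka/secrets/client.keystore.jks
# ssl.keystore.password=changeit

Endpoint authentication is handled by the broker's TLS certificate (and optionally client
certificates or SASL), as described on that page.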


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 16 Feb 2021 at 15:18, Jones, Isaac  wrote:

> Hello,
>
> I have a couple important questions regarding kafka and its security.
>
>
>
>1. Is data encrypted in transit when streaming in kafka?
>2. How does one endpoint get authenticated before data is sent to it?
>
>
>
> If someone can answer/explain this to me that’d be great.
>
>
>
>
>
>
>
> *Isaac Jones*
>
>   *Full Stack Engineer Asc*
>
>   “Concentrate all your thoughts upon the work in hand.
>
>The sun's rays do not burn until brought to a focus.*.*”
>
> -Alexander Graham Bell.
>
>
>
>
>


Re: print.partition=true not working

2021-02-10 Thread Robin Moffatt
FWIW kafkacat will do this no sweat (I realise that doesn't help w.r.t. the
tool you're trying to use, but mentioning it in case :) )

kafkacat -b kafka-broker:9092 \
  -t my_topic_name -C \
  -f '\nKey (%K bytes): %k
  Value (%S bytes): %s
  Timestamp: %T
  Partition: %p
  Offset: %o
  Headers: %h\n'


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 10 Feb 2021 at 17:59, Rovere Lorenzo  wrote:

> Hi
>
>
>
> We are on Kafka version: 2.2.1-kafka-4.1.0
>
> We have some issues when trying to dump some messages from a topic.
>
> Topic describe: Topic:test PartitionCount:3
> ReplicationFactor:2 Configs:message.timestamp.type=LogAppendTime
>
>
>
> Using the kafka-console-consumer I want to print the timestamp and the
> partition besides the content of the topic, so I specified --property
> print.partition=true --property print.timestamp=true in the
> kafka-console-consumer.sh but I only get the timestamp. The partition field
> is always empty. Any reason why this would happen? Thanks
>
>
>
> Lorenzo Rovere
>
>
>
>
> Lorenzo Rovere
>
> Technology Reply
> Via Avogadri, 2
> 31057 - Silea (TV) - ITALY
> phone: +39 0422 1836521
> l.rov...@reply.it
> www.reply.it
>
> [image: Technology Reply]
>


Re: Announcing 🎅🎄#TwelveDaysOfSMT🎄🎅

2020-12-15 Thread Robin Moffatt
Hey Ryanne,

Apologies; I thought as these Single Message Transforms are all part
of Apache Kafka this would be ok, but I am happy to not post any more :)

If anyone does want to see the others, they'll all be linked to and they're
published from https://rmoff.net/categories/twelvedaysofsmt/

thanks, Robin.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Mon, 14 Dec 2020 at 16:30, Ryanne Dolan  wrote:

> Robin, I don't think this is an appropriate use of the user group.
>
> Ryanne
>
> On Mon, Dec 14, 2020 at 10:20 AM Robin Moffatt  wrote:
>
>> 🎅🎄 #TwelveDaysOfSMT 🎄 🎅
>>
>> Day 5️⃣: MaskField
>> https://rmoff.net/2020/12/14/twelve-days-of-smt-day-5-maskfield/
>>
>>
>> --
>>
>> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>>
>>
>> On Fri, 11 Dec 2020 at 17:14, Robin Moffatt  wrote:
>>
>> > 🎅🎄 #TwelveDaysOfSMT 🎄 🎅
>> >
>> > Day 4️⃣: RegexRouter
>> > https://rmoff.net/2020/12/11/twelve-days-of-smt-day-4-regexrouter/
>> >
>> >
>> > --
>> >
>> > Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>> >
>> >
>> > On Thu, 10 Dec 2020 at 17:56, Robin Moffatt  wrote:
>> >
>> >> Here's the third instalment of 🎅🎄 #TwelveDaysOfSMT 🎄 🎅
>> >>
>> >> Flatten :
>> https://rmoff.net/2020/12/10/twelve-days-of-smt-day-3-flatten/
>> >>
>> >>
>> >> --
>> >>
>> >> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io |
>> @rmoff
>> >>
>> >>
>> >> On Thu, 10 Dec 2020 at 09:49, Robin Moffatt 
>> wrote:
>> >>
>> >>> Here's the second instalment of 🎅🎄 #TwelveDaysOfSMT 🎄 🎅
>> >>>
>> >>> Day 2: ValueToKey and ExtractField :
>> >>>
>> https://rmoff.net/2020/12/09/twelve-days-of-smt-day-2-valuetokey-and-extractfield/
>> >>>
>> >>>
>> >>> --
>> >>>
>> >>> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io |
>> @rmoff
>> >>>
>> >>>
>> >>> On Wed, 9 Dec 2020 at 09:07, Robin Moffatt 
>> wrote:
>> >>>
>> >>>> ✨ Do you use Single Message Transforms in Kafka Connect? Or maybe
>> >>>> you've wondered what they are?
>> >>>>
>> >>>> 🎥 ✍️I'm doing a series of videos and blogs about them during
>> December
>> >>>> (#TwelveDaysOfSMT), with the first one out today:
>> >>>>
>> https://rmoff.net/2020/12/08/twelve-days-of-smt-day-1-insertfield-timestamp/
>> >>>>
>> >>>> Reply here with any feedback, any particular SMT you'd like me to
>> >>>> explore, or scenarios to solve :)
>> >>>>
>> >>>>
>> >>>> --
>> >>>>
>> >>>> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io |
>> @rmoff
>> >>>>
>> >>>
>>
>


Re: Announcing 🎅🎄#TwelveDaysOfSMT🎄🎅

2020-12-14 Thread Robin Moffatt
🎅🎄 #TwelveDaysOfSMT 🎄 🎅

Day 5️⃣: MaskField
https://rmoff.net/2020/12/14/twelve-days-of-smt-day-5-maskfield/


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 11 Dec 2020 at 17:14, Robin Moffatt  wrote:

> 🎅🎄 #TwelveDaysOfSMT 🎄 🎅
>
> Day 4️⃣: RegexRouter
> https://rmoff.net/2020/12/11/twelve-days-of-smt-day-4-regexrouter/
>
>
> --
>
> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>
>
> On Thu, 10 Dec 2020 at 17:56, Robin Moffatt  wrote:
>
>> Here's the third instalment of 🎅🎄 #TwelveDaysOfSMT 🎄 🎅
>>
>> Flatten :https://rmoff.net/2020/12/10/twelve-days-of-smt-day-3-flatten/
>>
>>
>> --
>>
>> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>>
>>
>> On Thu, 10 Dec 2020 at 09:49, Robin Moffatt  wrote:
>>
>>> Here's the second instalment of 🎅🎄 #TwelveDaysOfSMT 🎄 🎅
>>>
>>> Day 2: ValueToKey and ExtractField :
>>> https://rmoff.net/2020/12/09/twelve-days-of-smt-day-2-valuetokey-and-extractfield/
>>>
>>>
>>> --
>>>
>>> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>>>
>>>
>>> On Wed, 9 Dec 2020 at 09:07, Robin Moffatt  wrote:
>>>
>>>> ✨ Do you use Single Message Transforms in Kafka Connect? Or maybe
>>>> you've wondered what they are?
>>>>
>>>> 🎥 ✍️I'm doing a series of videos and blogs about them during December
>>>> (#TwelveDaysOfSMT), with the first one out today:
>>>> https://rmoff.net/2020/12/08/twelve-days-of-smt-day-1-insertfield-timestamp/
>>>>
>>>> Reply here with any feedback, any particular SMT you'd like me to
>>>> explore, or scenarios to solve :)
>>>>
>>>>
>>>> --
>>>>
>>>> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>>>>
>>>


Re: Announcing 🎅🎄#TwelveDaysOfSMT🎄🎅

2020-12-11 Thread Robin Moffatt
🎅🎄 #TwelveDaysOfSMT 🎄 🎅

Day 4️⃣: RegexRouter
https://rmoff.net/2020/12/11/twelve-days-of-smt-day-4-regexrouter/


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 10 Dec 2020 at 17:56, Robin Moffatt  wrote:

> Here's the third instalment of 🎅🎄 #TwelveDaysOfSMT 🎄 🎅
>
> Flatten :https://rmoff.net/2020/12/10/twelve-days-of-smt-day-3-flatten/
>
>
> --
>
> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>
>
> On Thu, 10 Dec 2020 at 09:49, Robin Moffatt  wrote:
>
>> Here's the second instalment of 🎅🎄 #TwelveDaysOfSMT 🎄 🎅
>>
>> Day 2: ValueToKey and ExtractField :
>> https://rmoff.net/2020/12/09/twelve-days-of-smt-day-2-valuetokey-and-extractfield/
>>
>>
>> --
>>
>> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>>
>>
>> On Wed, 9 Dec 2020 at 09:07, Robin Moffatt  wrote:
>>
>>> ✨ Do you use Single Message Transforms in Kafka Connect? Or maybe you've
>>> wondered what they are?
>>>
>>> 🎥 ✍️I'm doing a series of videos and blogs about them during December
>>> (#TwelveDaysOfSMT), with the first one out today:
>>> https://rmoff.net/2020/12/08/twelve-days-of-smt-day-1-insertfield-timestamp/
>>>
>>> Reply here with any feedback, any particular SMT you'd like me to
>>> explore, or scenarios to solve :)
>>>
>>>
>>> --
>>>
>>> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>>>
>>


Re: Announcing 🎅🎄#TwelveDaysOfSMT🎄🎅

2020-12-10 Thread Robin Moffatt
Here's the third instalment of 🎅🎄 #TwelveDaysOfSMT 🎄 🎅

Flatten: https://rmoff.net/2020/12/10/twelve-days-of-smt-day-3-flatten/


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 10 Dec 2020 at 09:49, Robin Moffatt  wrote:

> Here's the second instalment of 🎅🎄 #TwelveDaysOfSMT 🎄 🎅
>
> Day 2: ValueToKey and ExtractField :
> https://rmoff.net/2020/12/09/twelve-days-of-smt-day-2-valuetokey-and-extractfield/
>
>
> --
>
> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>
>
> On Wed, 9 Dec 2020 at 09:07, Robin Moffatt  wrote:
>
>> ✨ Do you use Single Message Transforms in Kafka Connect? Or maybe you've
>> wondered what they are?
>>
>> 🎥 ✍️I'm doing a series of videos and blogs about them during December
>> (#TwelveDaysOfSMT), with the first one out today:
>> https://rmoff.net/2020/12/08/twelve-days-of-smt-day-1-insertfield-timestamp/
>>
>> Reply here with any feedback, any particular SMT you'd like me to
>> explore, or scenarios to solve :)
>>
>>
>> --
>>
>> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>>
>


Re: Announcing 🎅🎄#TwelveDaysOfSMT🎄🎅

2020-12-10 Thread Robin Moffatt
Here's the second instalment of 🎅🎄 #TwelveDaysOfSMT 🎄 🎅

Day 2: ValueToKey and ExtractField :
https://rmoff.net/2020/12/09/twelve-days-of-smt-day-2-valuetokey-and-extractfield/


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 9 Dec 2020 at 09:07, Robin Moffatt  wrote:

> ✨ Do you use Single Message Transforms in Kafka Connect? Or maybe you've
> wondered what they are?
>
> 🎥 ✍️I'm doing a series of videos and blogs about them during December
> (#TwelveDaysOfSMT), with the first one out today:
> https://rmoff.net/2020/12/08/twelve-days-of-smt-day-1-insertfield-timestamp/
>
> Reply here with any feedback, any particular SMT you'd like me to explore,
> or scenarios to solve :)
>
>
> --
>
> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>


Announcing 🎅🎄#TwelveDaysOfSMT🎄🎅

2020-12-09 Thread Robin Moffatt
✨ Do you use Single Message Transforms in Kafka Connect? Or maybe you've
wondered what they are?

🎥 ✍️I'm doing a series of videos and blogs about them during December
(#TwelveDaysOfSMT), with the first one out today:
https://rmoff.net/2020/12/08/twelve-days-of-smt-day-1-insertfield-timestamp/

Reply here with any feedback, any particular SMT you'd like me to explore,
or scenarios to solve :)


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


Re: KSQL Enrichment issue with converting decimal to string column

2020-09-08 Thread Robin Moffatt
Why is quantity a STRING? Should it be numeric?


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 8 Sep 2020 at 08:52, Rainer Schamm  wrote:

> Hi all
>
> we are struggling a bit with a fairly simple stream that is supposed to
> map columns from a topic populated by a Source Jdbc Connector into a format
> destined for a Sink Jdbc Connector.
>
> Our tables look something like this:
> (These are not the actual column names we are using though)
>
>
> SOURCE TABLE (DB2)
> -
> name CHAR
> street CHAR
> quantity NUMERIC   <<< decimal type in source avro
>
> DEST TABLE (Oracle)
> --
> nameDest VARCHAR2
> streetDest VARCHAR2
> quantityDest VARCHAR2   <<< string type in sink avro
>
> The problem we are facing is how to cleanly convert the numeric quantity
> value into a varchar2 value for the quantityDest.
>
> CREATE STREAM SOURCE_TABLE_MAPPED AS
> SELECT
> name as nameDest,
> TRIM(street) AS streetDest,
> quanity AS quanityDest,
> FROM SOURCE_TABLE
> EMIT CHANGES;
>
> I can't seem to find any scalar function that allows us to format the
> decimal value to a string, e.g. String.format(„%.2f“, quantiity)
>
> I have tried using a CAST like this: CAST(quantity as string); but the
> generated string looked very strange and was way too long.
>
> So in a nutshell, how can I fix this line:
>
> quanity AS quanityDest,
>
> To convert the quantity field to a quantityDest string field.
>
> As a side note, we are using Avro schemas on the topics of both the source
> and sink side.
>
> Thanks in advance to anyone who can give a suggestion in the right
> direction:)
>
> Regards
> Rainer
>
>
>
>
>
>


Re: Amazon MSK Feeback

2020-09-07 Thread Robin Moffatt
Hi Himanshu,

Have you looked at Confluent Cloud? This gives you fully managed Kafka on
AWS, with additional benefits. You can read more in this article:
https://www.confluent.io/blog/fully-managed-apache-kafka-service/

Disclaimer: I am totally biased, because I work for Confluent :-)

thanks, Robin.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Mon, 7 Sep 2020 at 10:00, Himanshu Shukla 
wrote:

> Hi,
> We are planning to go for amazon MSK instead of having our own self
> launched cluster on bare EC2 machines.
>
> It would be very helpful, if someone who has used it before or know about
> it, can share the feedback/observations regarding the same.
>
> My main concerns are,
> Is it worth using Amazon MSK?
> Will there be any complexity or problem in the future if we go ahead with
> this?
> --
> Regards,
> Himanshu Shukla
>


Re: [VOTE] KIP-657: Add Customized Kafka Streams Logo

2020-08-19 Thread Robin Moffatt
I echo what Michael says here.

Another consideration is that logos are often shrunk (when used on slides)
and need to work at lower resolution (think: printing swag, stitching
socks, etc) and so whatever logo we come up with needs to not be too fiddly
in the level of detail - something that I think both the current proposed
options will fall foul of IMHO.


On Wed, 19 Aug 2020 at 15:33, Michael Noll  wrote:

> Hi all!
>
> Great to see we are in the process of creating a cool logo for Kafka
> Streams.  First, I apologize for sharing feedback so late -- I just learned
> about it today. :-)
>
> Here's my *personal, subjective* opinion on the currently two logo
> candidates for Kafka Streams.
>
> TL;DR: Sorry, but I really don't like either of the proposed "otter" logos.
> Let me try to explain why.
>
>- The choice to use an animal, regardless of which specific animal,
>seems random and doesn't fit Kafka. (What's the purpose? To show that
>KStreams is 'cute'?) In comparison, the O’Reilly books always have an
>animal cover, that’s their style, and it is very recognizable.  Kafka
>however has its own, different style.  The Kafka logo has clear, simple
>lines to achieve an abstract and ‘techy’ look, which also alludes
> nicely to
>its architectural simplicity. Its logo is also a smart play on the
>Kafka-identifying letter “K” and alluding to it being a distributed
> system
>(the circles and links that make the K).
>- The proposed logos, however, make it appear as if KStreams is a
>third-party technology that was bolted onto Kafka. They certainly, for
> me,
>do not convey the message "Kafka Streams is an official part of Apache
>Kafka".
>- I, too, don't like the way the main Kafka logo is obscured (a concern
>already voiced in this thread). Also, the Kafka 'logo' embedded in the
>proposed KStreams logos is not the original one.
>- None of the proposed KStreams logos visually match the Kafka logo.
>They have a totally different style, font, line art, and color scheme.
>- Execution-wise, the main Kafka logo looks great at all sizes.  The
>style of the otter logos, in comparison, becomes undecipherable at
> smaller
>sizes.
>
> What I would suggest is to first agree on what the KStreams logo is
> supposed to convey to the reader.  Here's my personal take:
>
> Objective 1: First and foremost, the KStreams logo should make it clear and
> obvious that KStreams is an official and integral part of Apache Kafka.
> This applies to both what is depicted and how it is depicted (like font,
> line art, colors).
> Objective 2: The logo should allude to the role of KStreams in the Kafka
> project, which is the processing part.  That is, "doing something useful to
> the data in Kafka".
>
> The "circling arrow" aspect of the current otter logos does allude to
> "continuous processing", which is going in the direction of (2), but the
> logos do not meet (1) in my opinion.
>
> -Michael
>
>
>
>
> On Tue, Aug 18, 2020 at 10:34 PM Matthias J. Sax  wrote:
>
> > Adding the user mailing list -- I think we should accepts votes on both
> > lists for this special case, as it's not a technical decision.
> >
> > @Boyang: as mentioned by Bruno, can we maybe add black/white options for
> > both proposals, too?
> >
> > I also agree that Design B is not ideal with regard to the Kafka logo.
> > Would it be possible to change Design B accordingly?
> >
> > I am not a font expert, but the fonts in both design are different and I
> > am wondering if there is an official Apache Kafka font that we should
> > reuse to make sure that the logos align -- I would expect that both
> > logos (including "Apache Kafka" and "Kafka Streams" names) will be used
> > next to each other and it would look awkward if the font differs.
> >
> >
> > -Matthias
> >
> > On 8/18/20 11:28 AM, Navinder Brar wrote:
> > > Hi,
> > > Thanks for the KIP, really like the idea. I am +1(non-binding) on A
> > mainly because I felt like you have to tilt your head to realize the
> > otter's head in B.
> > > Regards,Navinder
> > >
> > > On Tuesday, 18 August, 2020, 11:44:20 pm IST, Guozhang Wang <
> > wangg...@gmail.com> wrote:
> > >
> > >  I'm leaning towards design B primarily because it reminds me of the
> > Firefox
> > > logo which I like a lot. But I also share Adam's concern that it should
> > > better not obscure the Kafka logo --- so if we can tweak a bit to fix
> it
> > my
> > > vote goes to B, otherwise A :)
> > >
> > >
> > > Guozhang
> > >
> > > On Tue, Aug 18, 2020 at 9:48 AM Bruno Cadonna 
> > wrote:
> > >
> > >> Thanks for the KIP!
> > >>
> > >> I am +1 (non-binding) for A.
> > >>
> > >> I would also like to hear opinions whether the logo should be
> colorized
> > >> or just black and white.
> > >>
> > >> Best,
> > >> Bruno
> > >>
> > >>
> > >> On 15.08.20 16:05, Adam Bellemare wrote:
> > >>> I prefer Design B, but given that I missed the discussion thread, I
> > think
> > >>> it would be bet

Re: MONGODB KAFKA CONNECT

2020-07-28 Thread Robin Moffatt
Hi Milka,

You've not mentioned in which direction you want to connect MongoDB with
Kafka, but either way - Kafka Connect is generally the best approach.

You can learn about Kafka Connect here:
https://docs.confluent.io/current/connect/ and see a talk about it here:
https://rmoff.dev/ljc-kafka-02

For MongoDB connectors see https://www.confluent.io/hub/#mongodb
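
As a rough sketch of what a sink from Kafka into MongoDB looks like with the connector from
that hub page (the URI, database, collection, and topic names here are hypothetical - check
the connector's own docs for the authoritative config):

{
  "name": "mongodb-sink",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "connection.uri": "mongodb://mongo1.example.com:27017",
    "database": "orders_db",
    "collection": "orders",
    "topics": "orders"
  }
}

A source connector going the other way (MongoDB into Kafka) is configured in the same way
with the corresponding source connector class.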


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Mon, 27 Jul 2020 at 18:19, Milka Wafula  wrote:

> Hello,
>
> Please how may we connect mongoDB with kafka
>
> Thanks
>
>
> --
>
> ---
>
> Thank you,
> Best Regards,
>
> Milka Naliaka Wafula   | +230-597-42875 / +254 793044726
> *Youtube website : *
> https://www.youtube.com/channel/UC4fk41ulxea8x9HjxcUyb5w
>


Re: Problem in reading From JDBC SOURCE

2020-07-02 Thread Robin Moffatt
Check out this article where it covers decimal handling:
https://www.confluent.io/blog/kafka-connect-deep-dive-jdbc-source-connector/#bytes-decimals-numerics


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 2 Jul 2020 at 13:54, vishnu murali 
wrote:

> Hi Guys,
>
> I am having a problem while reading from MySQL using the JDBC source connector, and
> I am receiving values like the below.
> Does anyone know the reason and how to solve this?
>
> "a": "Aote",
>
>   "b": "AmrU",
>
>   "c": "AceM",
>
>   "d": "Aote",
>
>
> Instead of
>
> "a": 0.002,
>
>   "b": 0.465,
>
>   "c": 0.545,
>
>   "d": 0.100
>
>
> It's my configuration
>
>
> {
>   "name": "sample",
>   "config": {
>     "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
>     "connection.url": "jdbc:mysql://localhost:3306/sample",
>     "connection.user": "",
>     "connection.password": "xxx",
>     "topic.prefix": "dample-",
>     "poll.interval.ms": 360,
>     "table.whitelist": "sample",
>     "schemas.enable": "false",
>     "mode": "bulk",
>     "value.converter.schemas.enable": "false",
>     "numeric.mapping": "best_fit",
>     "value.converter": "org.apache.kafka.connect.json.JsonConverter",
>     "transforms": "createKey,extractInt",
>     "transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
>     "transforms.createKey.fields": "ID",
>     "transforms.extractInt.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
>     "transforms.extractInt.field": "ID"
>   }
> }
>


Re: Kafka - entity-default Vs entity-name

2020-06-12 Thread Robin Moffatt
If you're going to cross-post, it's useful to include where else you've
asked this before, so that people can see what else has been suggested and
answered to avoid duplicating their effort :)

https://stackoverflow.com/questions/62343035/kafka-entity-default-vs-entity-name


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 12 Jun 2020 at 13:13, Nag Y  wrote:

> I applied kafka-config to get the default settings for the brokers,
>
> kafka-configs --bootstrap-server localhost:9092 --entity-type brokers
> --entity-default --describe
>
> The command responded with the following response, without any *complete*
>  output.
>
> Default configs for brokers in the cluster are:
>
>
>1. *So, how to get the all the default settings across all brokers*
>2. *I understood from the command line what it means, didnt get the
>context completely. What is the real difference between entity-default
>,entity-name*
>
> From documentation:
>
> --entity-default   Default entity name for
>clients/users/brokers (applies to
>corresponding entity type in command
>line)
>
> --entity-name  Name of entity (topic name/client
>id/user principal name/broker id)
>


Re: handle multiple requests

2020-06-10 Thread Robin Moffatt
You might also find this resource useful in general:
https://www.confluent.io/whitepaper/comparing-confluent-platform-with-traditional-messaging-middleware/
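
For what it's worth, the usual Kafka analogue of competing consumers on a queue is a consumer
group: run several consumers (or several threads, each owning its own consumer) with the same
group.id, and Kafka divides the topic's partitions between them. A minimal sketch, with
hypothetical topic and group names:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class RequestWorker {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // Every worker instance uses the same group.id, so Kafka shares the
        // topic's partitions between them (the competing-consumers pattern).
        props.put("group.id", "request-workers");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("requests"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    handle(record.value()); // your request-handling logic goes here
                }
            }
        }
    }

    private static void handle(String request) {
        // process one request
    }
}

Note that the maximum parallelism is the number of partitions on the topic, so create the
topic with at least as many partitions as you want concurrent workers.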


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 10 Jun 2020 at 16:00, נתי אלמגור 
wrote:

> Hello,
> I'm very new to Kafka; I have experience with RabbitMQ.
> I have a connection layer which publishes requests to a RabbitMQ queue, and a worker
> app with a number of threads (for example 8, equal to the number of CPUs) that
> subscribe to the queue, so that each request is handled by one thread.
>
> I cannot find this solution in Kafka. Please help me.
>


Re: How to manually start ingesting in kafka source connector ?

2020-05-28 Thread Robin Moffatt
You could look at
https://rmoff.net/2019/08/15/reset-kafka-connect-source-connector-offsets/
and experiment with creating the connector elsewhere to see if you can pre-empt
the key value that Kafka Connect will use when writing the offsets, and so
run your steps as 2 - 1 - 3 instead.
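
To see what you would be pre-empting, you can inspect the offsets topic directly. A sketch,
assuming the worker's offset.storage.topic is called connect-offsets:

kafkacat -b localhost:9092 -t connect-offsets -C \
  -f 'Partition(%p) %k => %s\n'

The keys are of the form ["connector-name",{...source partition...}] and the values hold the
source offsets - changing those values is what the blog above walks through.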


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 28 May 2020 at 10:12, Yu Watanabe  wrote:

> Robin
>
> Thank you for the reply.
>
> Any way to not automatically start after creating connector ?
>
> I am trying to find a way to change connector offset  as described in
> below link before starting connector ..
>
>
> https://www.confluent.jp/blog/kafka-connect-deep-dive-jdbc-source-connector/#starting-table-capture
>
> Steps I want to do will be
>
> 1. Create jdbc connector
> 2. Change connector offset
> 3. Start connector
>
> Thanks,
> Yu
>
> On Thu, May 28, 2020 at 6:01 PM Robin Moffatt  wrote:
> >
> > When you create the connector, it will start.
> >
> >
> > --
> >
> > Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
> >
> >
> > On Thu, 28 May 2020 at 04:12, Yu Watanabe  wrote:
> >
> > > Dear community .
> > >
> > > I would like to ask question related to source connector in kafka
> > > connect (2.4.0) .
> > >
> > > Is there a way to manually start source connector after registering to
> > > kafka connect ?
> > >
> > > Looking at the document , I found PAUSE API ,
> > >
> > >
> > >
> https://docs.confluent.io/current/connect/references/restapi.html#put--connectors-(string-name)-pause
> > >
> > > however, could not find set initial state for individual tasks in
> > > connector properties ..
> > >
> > > https://docs.confluent.io/current/connect/managing/configuring.html
> > >
> > > I appreciate if I could get some help.
> > >
> > > Best Regards,
> > > Yu Watanabe
> > >
> > > --
> > > Yu Watanabe
> > >
> > > linkedin: www.linkedin.com/in/yuwatanabe1/
> > > twitter:   twitter.com/yuwtennis
> > >
>
>
>
> --
> Yu Watanabe
>
> linkedin: www.linkedin.com/in/yuwatanabe1/
> twitter:   twitter.com/yuwtennis
>


Re: How to manually start ingesting in kafka source connector ?

2020-05-28 Thread Robin Moffatt
When you create the connector, it will start.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 28 May 2020 at 04:12, Yu Watanabe  wrote:

> Dear community .
>
> I would like to ask question related to source connector in kafka
> connect (2.4.0) .
>
> Is there a way to manually start source connector after registering to
> kafka connect ?
>
> Looking at the document , I found PAUSE API ,
>
>
> https://docs.confluent.io/current/connect/references/restapi.html#put--connectors-(string-name)-pause
>
> however, could not find set initial state for individual tasks in
> connector properties ..
>
> https://docs.confluent.io/current/connect/managing/configuring.html
>
> I appreciate if I could get some help.
>
> Best Regards,
> Yu Watanabe
>
> --
> Yu Watanabe
>
> linkedin: www.linkedin.com/in/yuwatanabe1/
> twitter:   twitter.com/yuwtennis
>


Re: Kafka Connect Connector Tasks Uneven Division

2020-05-26 Thread Robin Moffatt
The KIP for the current rebalancing protocol is probably a good reference:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-415:+Incremental+Cooperative+Rebalancing+in+Kafka+Connect


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 26 May 2020 at 14:25, Deepak Raghav 
wrote:

> Hi Robin
>
> Thanks for the clarification.
>
> As you suggested, that task allocation between the workers is
> nondeterministic. I have shared the same information within in my team but
> there are some other parties, with whom I need to share this information as
> explanation for the issue raised by them and I cannot show this mail as a
> reference.
>
> It would be very great if you please share any link/discussion reference
> regarding the same.
>
> Regards and Thanks
> Deepak Raghav
>
>
>
> On Thu, May 21, 2020 at 2:12 PM Robin Moffatt  wrote:
>
> > I don't think you're right to assert that this is "expected behaviour":
> >
> > >  the tasks are divided in below pattern when they are first time
> > registered
> >
> > Kafka Connect task allocation is non-deterministic.
> >
> > I'm still not clear if you're solving for a theoretical problem or an
> > actual one. If this is an actual problem that you're encountering and
> need
> > a solution to then since the task allocation is not deterministic it
> sounds
> > like you need to deploy separate worker clusters based on the workload
> > patterns that you are seeing and machine resources available.
> >
> >
> > --
> >
> > Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
> >
> >
> > On Wed, 20 May 2020 at 21:29, Deepak Raghav 
> > wrote:
> >
> > > Hi Robin
> > >
> > > I had gone though the link you provided, It is not helpful in my case.
> > > Apart from this, *I am not getting why the tasks are divided in *below
> > > pattern* when they are *first time registered*, which is expected
> > behavior.
> > > I*s there any parameter which we can pass in worker property file which
> > > handle the task assignment strategy like we have range assigner or
> round
> > > robin in consumer-group ?
> > >
> > > connector rest status api result after first registration :
> > >
> > > {
> > >   "name": "REGION_CODE_UPPER-Cdb_Dchchargeableevent",
> > >   "connector": {
> > > "state": "RUNNING",
> > > "worker_id": "10.0.0.5:*8080*"
> > >   },
> > >   "tasks": [
> > > {
> > >   "id": 0,
> > >   "state": "RUNNING",
> > >   "worker_id": "10.0.0.4:*8078*"
> > > },
> > > {
> > >   "id": 1,
> > >   "state": "RUNNING",
> > >   "worker_id": "10.0.0.5:*8080*"
> > > }
> > >   ],
> > >   "type": "sink"
> > > }
> > >
> > > and
> > >
> > > {
> > >   "name": "REGION_CODE_UPPER-Cdb_Neatransaction",
> > >   "connector": {
> > > "state": "RUNNING",
> > > "worker_id": "10.0.0.4:*8078*"
> > >   },
> > >   "tasks": [
> > > {
> > >   "id": 0,
> > >   "state": "RUNNING",
> > >   "worker_id": "10.0.0.4:*8078*"
> > > },
> > > {
> > >   "id": 1,
> > >   "state": "RUNNING",
> > >   "worker_id": "10.0.0.5:*8080*"
> > > }
> > >   ],
> > >   "type": "sink"
> > > }
> > >
> > >
> > > But when I stop the second worker process and wait for
> > > scheduled.rebalance.max.delay.ms time i.e 5 min to over, and start the
> > > process again. Result is different.
> > >
> > > {
> > >   "name": "REGION_CODE_UPPER-Cdb_Dchchargeableevent",
> > >   "connector": {
> > > "state": "RUNNING",
> > > "worker_id": "10.0.0.5:*8080*"
> > >   },
> > >   "tasks": [
> > > {
> > >   "id": 0,
> > >   "state": "RUNNING",
> &g

Re: User ID / Password based authentication for Kafka Connect Source connector REST Interface

2020-05-22 Thread Robin Moffatt
See here:
https://docs.confluent.io/current/security/basic-auth.html#basic-auth-kconnect
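
In outline it comes down to three pieces (a sketch with hypothetical file paths - the page
above is the authoritative reference):

1. Worker properties - enable the basic-auth REST extension that ships with Kafka Connect:

   rest.extension.classes=org.apache.kafka.connect.rest.basic.auth.extension.BasicAuthSecurityRestExtension

2. A JAAS file (e.g. /etc/kafka/connect_jaas.conf), passed to the worker JVM via
   -Djava.security.auth.login.config=/etc/kafka/connect_jaas.conf :

   KafkaConnect {
     org.apache.kafka.connect.rest.basic.auth.extension.PropertyFileLoginModule required
     file="/etc/kafka/connect.password";
   };

3. The password file itself, with one "user: password" entry per line.

Clients then pass the credentials with every request, e.g.
curl -u user:password https://host:port/connectors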


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 22 May 2020 at 15:42,  wrote:

> Hi Team,
>
> I have configured a Kafka Connect source connector on a distributed worker
> and I am using the REST interface for administering it, e.g. for
> starting/stopping/querying [curl https://host:port/connectors, curl
> https://host:port/connectors/<name>,
> etc.].
>
> I need to configure a user-id/password based authentication for this so
> that users must give the user-id & password to issue curl commands. Could
> you please help on this on how to achieve this.
>
> Regards
> Ashish Sood
> +44 (0) 7440258835
>
>


Re: Kafka Connect Connector Tasks Uneven Division

2020-05-21 Thread Robin Moffatt
I don't think you're right to assert that this is "expected behaviour":

>  the tasks are divided in below pattern when they are first time
registered

Kafka Connect task allocation is non-deterministic.

I'm still not clear if you're solving for a theoretical problem or an
actual one. If this is an actual problem that you're encountering and need
a solution to, then, since the task allocation is not deterministic, it sounds
like you need to deploy separate worker clusters based on the workload
patterns that you are seeing and the machine resources available.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 20 May 2020 at 21:29, Deepak Raghav 
wrote:

> Hi Robin
>
> I had gone though the link you provided, It is not helpful in my case.
> Apart from this, *I am not getting why the tasks are divided in *below
> pattern* when they are *first time registered*, which is expected behavior.
> I*s there any parameter which we can pass in worker property file which
> handle the task assignment strategy like we have range assigner or round
> robin in consumer-group ?
>
> connector rest status api result after first registration :
>
> {
>   "name": "REGION_CODE_UPPER-Cdb_Dchchargeableevent",
>   "connector": {
> "state": "RUNNING",
> "worker_id": "10.0.0.5:*8080*"
>   },
>   "tasks": [
> {
>   "id": 0,
>   "state": "RUNNING",
>   "worker_id": "10.0.0.4:*8078*"
> },
> {
>   "id": 1,
>   "state": "RUNNING",
>   "worker_id": "10.0.0.5:*8080*"
> }
>   ],
>   "type": "sink"
> }
>
> and
>
> {
>   "name": "REGION_CODE_UPPER-Cdb_Neatransaction",
>   "connector": {
> "state": "RUNNING",
> "worker_id": "10.0.0.4:*8078*"
>   },
>   "tasks": [
> {
>   "id": 0,
>   "state": "RUNNING",
>   "worker_id": "10.0.0.4:*8078*"
> },
> {
>   "id": 1,
>   "state": "RUNNING",
>   "worker_id": "10.0.0.5:*8080*"
> }
>   ],
>   "type": "sink"
> }
>
>
> But when I stop the second worker process and wait for
> scheduled.rebalance.max.delay.ms time i.e 5 min to over, and start the
> process again. Result is different.
>
> {
>   "name": "REGION_CODE_UPPER-Cdb_Dchchargeableevent",
>   "connector": {
> "state": "RUNNING",
> "worker_id": "10.0.0.5:*8080*"
>   },
>   "tasks": [
> {
>   "id": 0,
>   "state": "RUNNING",
>   "worker_id": "10.0.0.5:*8080*"
> },
> {
>   "id": 1,
>   "state": "RUNNING",
>   "worker_id": "10.0.0.5:*8080*"
> }
>   ],
>   "type": "sink"
> }
>
> and
>
> {
>   "name": "REGION_CODE_UPPER-Cdb_Neatransaction",
>   "connector": {
> "state": "RUNNING",
> "worker_id": "10.0.0.4:*8078*"
>   },
>   "tasks": [
> {
>   "id": 0,
>   "state": "RUNNING",
>   "worker_id": "10.0.0.4:*8078*"
> },
> {
>   "id": 1,
>   "state": "RUNNING",
>   "worker_id": "10.0.0.4:*8078*"
> }
>   ],
>   "type": "sink"
> }
>
> Regards and Thanks
> Deepak Raghav
>
>
>
> On Wed, May 20, 2020 at 9:29 PM Robin Moffatt  wrote:
>
> > Thanks for the clarification. If this is an actual problem that you're
> > encountering and need a solution to then since the task allocation is not
> > deterministic it sounds like you need to deploy separate worker clusters
> > based on the workload patterns that you are seeing and machine resources
> > available.
> >
> >
> > --
> >
> > Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
> >
> >
> > On Wed, 20 May 2020 at 16:39, Deepak Raghav 
> > wrote:
> >
> > > Hi Robin
> > >
> > > Replying to your query i.e
> > >
> > > One thing I'd ask at this point is though if it makes any difference
>

Re: Kafka Connect Connector Tasks Uneven Division

2020-05-20 Thread Robin Moffatt
Thanks for the clarification. If this is an actual problem that you're
encountering and need a solution to, then, since the task allocation is not
deterministic, it sounds like you need to deploy separate worker clusters
based on the workload patterns that you are seeing and the machine resources
available.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 20 May 2020 at 16:39, Deepak Raghav 
wrote:

> Hi Robin
>
> Replying to your query i.e
>
> One thing I'd ask at this point is though if it makes any difference where
> the tasks execute?
>
> It actually makes difference to us, we have 16 connectors and as I stated
> tasks division earlier, first 8 connector' task are assigned to first
> worker process and another connector's task to another worker process and
> just to mention that these 16 connectors are sink connectors. Each sink
> connector consumes message from different topic.There may be a case when
> messages are coming only for first 8 connector's topic and because all the
> tasks of these connectors are assigned to First Worker, load would be high
> on it and another set of connectors in another worker would be idle.
>
> Instead, if the task would have been divided evenly then this case would
> have been avoided. Because tasks of each connector would be present in both
> workers process like below :
>
> *W1*   *W2*
>  C1T1    C1T2
>  C2T1    C2T2
>
> I hope, I gave your answer,
>
>
> Regards and Thanks
> Deepak Raghav
>
>
>
> On Wed, May 20, 2020 at 4:42 PM Robin Moffatt  wrote:
>
> > OK, I understand better now.
> >
> > You can read more about the guts of the rebalancing protocol that Kafka
> > Connect uses as of Apache Kafka 2.3 and onwards here:
> >
> https://www.confluent.io/blog/incremental-cooperative-rebalancing-in-kafka/
> >
> > One thing I'd ask at this point is though if it makes any difference
> where
> > the tasks execute? The point of a cluster is that Kafka Connect manages
> the
> > workload allocation. If you need workload separation and
> > guaranteed execution locality I would suggest separate Kafka Connect
> > distributed clusters.
> >
> >
> > --
> >
> > Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
> >
> >
> > On Wed, 20 May 2020 at 10:24, Deepak Raghav 
> > wrote:
> >
> > > Hi Robin
> > >
> > > Thanks for your reply.
> > >
> > > We are having two worker on different IP. The example which I gave you
> it
> > > was just a example. We are using kafka version 2.3.1.
> > >
> > > Let me tell you again with a simple example.
> > >
> > > Suppose, we have two EC2 node, N1 and N2 having worker process W1 and
> W2
> > > running in distribute mode with groupId i.e in same cluster and two
> > > connectors with having two task each i.e
> > >
> > > Node N1: W1 is running
> > > Node N2 : W2 is running
> > >
> > > First Connector (C1) : Task1 with id : C1T1 and task 2 with id : C1T2
> > > Second Connector (C2) : Task1 with id : C2T1 and task 2 with id : C2T2
> > >
> > > Now Suppose If both W1 and W2 worker process are running  and I
> register
> > > Connector C1 and C2 one after another i.e sequentially, on any of the
> > > worker process, the tasks division between the worker
> > > node are happening like below, which is expected.
> > >
> > > *W1*   *W2*
> > > C1T1    C1T2
> > > C2T1    C2T2
> > >
> > > Now, suppose I stop one worker process e.g W2 and start after some
> time,
> > > the tasks division is changed like below i.e first connector's task
> move
> > to
> > > W1 and second connector's task move to W2
> > >
> > > *W1*   *W2*
> > > C1T1    C2T1
> > > C1T2    C2T2
> > >
> > >
> > > Please let me know, If it is understandable or not.
> > >
> > > Note : Actually, In production, we are gonna have 16 connectors having
> 10
> > > task each and two worker node. With above scenario, first 8
> connectors's
> > > task move to W1 and next 8 connector' task move to W2, Which is not
> > > expected.
> > >
> > >
> > > Regards and Thanks
> > > Deepak Raghav
> > >
> > >
> > >
> > > On Wed, May 20, 2020 at 1:

Re: Kafka Connect Connector Tasks Uneven Division

2020-05-20 Thread Robin Moffatt
OK, I understand better now.

You can read more about the guts of the rebalancing protocol that Kafka
Connect uses as of Apache Kafka 2.3 and onwards here:
https://www.confluent.io/blog/incremental-cooperative-rebalancing-in-kafka/

One thing I'd ask at this point is though if it makes any difference where
the tasks execute? The point of a cluster is that Kafka Connect manages the
workload allocation. If you need workload separation and
guaranteed execution locality I would suggest separate Kafka Connect
distributed clusters.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 20 May 2020 at 10:24, Deepak Raghav 
wrote:

> Hi Robin
>
> Thanks for your reply.
>
> We are having two worker on different IP. The example which I gave you it
> was just a example. We are using kafka version 2.3.1.
>
> Let me tell you again with a simple example.
>
> Suppose, we have two EC2 node, N1 and N2 having worker process W1 and W2
> running in distribute mode with groupId i.e in same cluster and two
> connectors with having two task each i.e
>
> Node N1: W1 is running
> Node N2 : W2 is running
>
> First Connector (C1) : Task1 with id : C1T1 and task 2 with id : C1T2
> Second Connector (C2) : Task1 with id : C2T1 and task 2 with id : C2T2
>
> Now Suppose If both W1 and W2 worker process are running  and I register
> Connector C1 and C2 one after another i.e sequentially, on any of the
> worker process, the tasks division between the worker
> node are happening like below, which is expected.
>
> *W1*   *W2*
> C1T1    C1T2
> C2T1    C2T2
>
> Now, suppose I stop one worker process e.g W2 and start after some time,
> the tasks division is changed like below i.e first connector's task move to
> W1 and second connector's task move to W2
>
> *W1*   *W2*
> C1T1    C2T1
> C1T2    C2T2
>
>
> Please let me know, If it is understandable or not.
>
> Note : Actually, In production, we are gonna have 16 connectors having 10
> task each and two worker node. With above scenario, first 8 connectors's
> task move to W1 and next 8 connector' task move to W2, Which is not
> expected.
>
>
> Regards and Thanks
> Deepak Raghav
>
>
>
> On Wed, May 20, 2020 at 1:41 PM Robin Moffatt  wrote:
>
> > So you're running two workers on the same machine (10.0.0.4), is
> > that correct? Normally you'd run one worker per machine unless there was
> a
> > particular reason otherwise.
> > What version of Apache Kafka are you using?
> > I'm not clear from your question if the distribution of tasks is
> > presenting a problem to you (if so please describe why), or if you're
> just
> > interested in the theory behind the rebalancing protocol?
> >
> >
> > --
> >
> > Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
> >
> >
> > On Wed, 20 May 2020 at 06:46, Deepak Raghav 
> > wrote:
> >
> > > Hi
> > >
> > > Please, can anybody help me with this?
> > >
> > > Regards and Thanks
> > > Deepak Raghav
> > >
> > >
> > >
> > > On Tue, May 19, 2020 at 1:37 PM Deepak Raghav <
> deepakragha...@gmail.com>
> > > wrote:
> > >
> > > > Hi Team
> > > >
> > > > We have two worker node in a cluster and 2 connector with having 10
> > tasks
> > > > each.
> > > >
> > > > Now, suppose if we have two kafka connect process W1(Port 8080) and
> > > > W2(Port 8078) started already in distribute mode and then register
> the
> > > > connectors, task of one connector i.e 10 tasks are divided equally
> > > between
> > > > two worker i.e first task of A connector to W1 worker node and sec
> task
> > > of
> > > > A connector to W2 worker node, similarly for first task of B
> connector,
> > > > will go to W1 node and sec task of B connector go to W2 node.
> > > >
> > > > e.g
> > > > *#First Connector : *
> > > > {
> > > >   "name": "REGION_CODE_UPPER-Cdb_Dchchargeableevent",
> > > >   "connector": {
> > > > "state": "RUNNING",
> > > > "worker_id": "10.0.0.4:*8080*"
> > > >   },
> > > >   "tasks": [
> > > > {
> > > >   "id": 0,
> > > >   "state": "RUNNING",
> >

Re: Kafka Connect Connector Tasks Uneven Division

2020-05-20 Thread Robin Moffatt
So you're running two workers on the same machine (10.0.0.4), is
that correct? Normally you'd run one worker per machine unless there was a
particular reason otherwise.
What version of Apache Kafka are you using?
I'm not clear from your question if the distribution of tasks is
presenting a problem to you (if so please describe why), or if you're just
interested in the theory behind the rebalancing protocol?


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 20 May 2020 at 06:46, Deepak Raghav 
wrote:

> Hi
>
> Please, can anybody help me with this?
>
> Regards and Thanks
> Deepak Raghav
>
>
>
> On Tue, May 19, 2020 at 1:37 PM Deepak Raghav 
> wrote:
>
> > Hi Team
> >
> > We have two worker node in a cluster and 2 connector with having 10 tasks
> > each.
> >
> > Now, suppose if we have two kafka connect process W1(Port 8080) and
> > W2(Port 8078) started already in distribute mode and then register the
> > connectors, task of one connector i.e 10 tasks are divided equally
> between
> > two worker i.e first task of A connector to W1 worker node and sec task
> of
> > A connector to W2 worker node, similarly for first task of B connector,
> > will go to W1 node and sec task of B connector go to W2 node.
> >
> > e.g
> > *#First Connector : *
> > {
> >   "name": "REGION_CODE_UPPER-Cdb_Dchchargeableevent",
> >   "connector": {
> > "state": "RUNNING",
> > "worker_id": "10.0.0.4:*8080*"
> >   },
> >   "tasks": [
> > {
> >   "id": 0,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:*8078*"
> > },
> > {
> >   "id": 1,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8080"
> > },
> > {
> >   "id": 2,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8078"
> > },
> > {
> >   "id": 3,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8080"
> > },
> > {
> >   "id": 4,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8078"
> > },
> > {
> >   "id": 5,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8080"
> > },
> > {
> >   "id": 6,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8078"
> > },
> > {
> >   "id": 7,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8080"
> > },
> > {
> >   "id": 8,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8078"
> > },
> > {
> >   "id": 9,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8080"
> > }
> >   ],
> >   "type": "sink"
> > }
> >
> >
> > *#Sec connector*
> >
> > {
> >   "name": "REGION_CODE_UPPER-Cdb_Neatransaction",
> >   "connector": {
> > "state": "RUNNING",
> > "worker_id": "10.0.0.4:8078"
> >   },
> >   "tasks": [
> > {
> >   "id": 0,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8078"
> > },
> > {
> >   "id": 1,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8080"
> > },
> > {
> >   "id": 2,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8078"
> > },
> > {
> >   "id": 3,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8080"
> > },
> > {
> >   "id": 4,
> >   "state": "RUNNING",
> >   "worker_id": "10.0.0.4:8078"
> 

Re: Exception in SFTP CSV SOURCE

2020-05-19 Thread Robin Moffatt
Hi Vishnu,

I think there is a problem with your email client; it's just sent a
duplicate of each of your emails from yesterday?

thanks, Robin.

On Tue, 19 May 2020 at 16:44, vishnu murali 
wrote:

> Hi Guys
>
> By Trying SFTP CSV SOURCE i am getting this exception by using this
> configuration.
>
>
> what is the issue and how to resolve it?
>
> can anyone know?
>
> *Config:*
> {
> "name": "CsvSFTP1",
> "config": {
> "tasks.max": "1",
> "connector.class":
> "io.confluent.connect.sftp.SftpCsvSourceConnector",
> "cleanup.policy": "NONE",
> "behavior.on.error": "IGNORE",
> "input.path": "/mnt/c/Users/vmuralidharan/Desktop",
> "error.path": "/mnt/c/Users/vmuralidharan/Desktop/error",
> "finished.path": "/mnt/c/Users/vmuralidharan/Desktop/finished",
> "input.file.pattern": "orm.csv",
> "kafka.topic": "sftp-testing-topic",
> "csv.first.row.as.header": "true",
> "schema.generation.enabled": "false"
> }
> }
>
> *Exception StackTrace:*
> org.apache.kafka.connect.errors.ConnectException: Can not get connection to
> SFTP:
> at
>
> io.confluent.connect.sftp.connection.SftpConnection.init(SftpConnection.java:115)
> at
>
> io.confluent.connect.sftp.connection.SftpConnection.(SftpConnection.java:66)
> at
>
> io.confluent.connect.sftp.source.AbstractSftpSourceConnectorConfig.getSftpConnection(AbstractSftpSourceConnectorConfig.java:557)
> at
>
> io.confluent.connect.sftp.source.AbstractSftpSourceConnectorConfig.(AbstractSftpSourceConnectorConfig.java:242)
> at
>
> io.confluent.connect.sftp.source.SftpSourceConnectorConfig.(SftpSourceConnectorConfig.java:101)
> at
>
> io.confluent.connect.sftp.source.SftpCsvSourceConnectorConfig.(SftpCsvSourceConnectorConfig.java:157)
> at
>
> io.confluent.connect.sftp.SftpCsvSourceConnector.start(SftpCsvSourceConnector.java:47)
> at
>
> org.apache.kafka.connect.runtime.WorkerConnector.doStart(WorkerConnector.java:111)
> at
>
> org.apache.kafka.connect.runtime.WorkerConnector.start(WorkerConnector.java:136)
> at
>
> org.apache.kafka.connect.runtime.WorkerConnector.transitionTo(WorkerConnector.java:196)
> at
> org.apache.kafka.connect.runtime.Worker.startConnector(Worker.java:260)
> at
>
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startConnector(DistributedHerder.java:1183)
> at
>
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1400(DistributedHerder.java:125)
> at
>
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:1199)
> at
>
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:1195)
> at
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at
>
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at
>
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: com.jcraft.jsch.JSchException: java.net.ConnectException:
> Connection refused (Connection refused)
> at com.jcraft.jsch.Util.createSocket(Util.java:394)
> at com.jcraft.jsch.Session.connect(Session.java:215)
> at com.jcraft.jsch.Session.connect(Session.java:183)
> at
>
> io.confluent.connect.sftp.connection.SftpConnection.init(SftpConnection.java:109)
> ... 18 more
> Caused by: java.net.ConnectException: Connection refused (Connection
> refused)
> at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
> at
> java.base/java.net
> .AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
> at
> java.base/java.net
> .AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
> at
> java.base/java.net
> .AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
> at
> java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
> at java.base/java.net.Socket.connect(Socket.java:609)
> at java.base/java.net.Socket.connect(Socket.java:558)
> at java.base/java.net.Socket.(Socket.java:454)
> at java.base/java.net.Socket.(Socket.java:231)
> at com.jcraft.jsch.Util$1.run(Util.java:362)
>


Re: Persist Kafka Topics and ksqldb

2020-05-19 Thread Robin Moffatt
You need to externalise your container data stores. Here's an
example Docker Compose that does that:
https://github.com/confluentinc/demo-scene/blob/master/wifi-fun/docker-compose.yml
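
The key part of that example is mapping the data directories onto named
Docker volumes so that they survive the containers being recreated. As a
minimal sketch (the paths shown are the defaults for the Confluent images -
check the docs for whichever images you are using):

  zookeeper:
    volumes:
      - zk-data:/var/lib/zookeeper/data
      - zk-log:/var/lib/zookeeper/log
  kafka:
    volumes:
      - kafka-data:/var/lib/kafka/data
volumes:
  zk-data:
  zk-log:
  kafka-data:

With the broker data persisted, ksqlDB will replay its command topic on
restart and recreate its streams and tables from the underlying topics.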


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 19 May 2020 at 15:55, Mohammed Ait Haddou <
mohammedaithad...@gmail.com> wrote:

> After a *docker-compose restart. *All topics, ksqldb types are lost.
> Is there any way to safely persist all data ?
> docker-compse :
> ---
> version: "2"
> services:
>   zookeeper:
> image: confluentinc/cp-zookeeper:latest
> container_name: zookeeper
> environment:
>   ZOOKEEPER_CLIENT_PORT: 2181
>   ZOOKEEPER_TICK_TIME: 2000
>
>   kafka:
> image: confluentinc/cp-enterprise-kafka:latest
> container_name: kafka
> depends_on:
>   - zookeeper
> links:
>   - zookeeper
> ports:
>
> # "`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-
>
> # An important note about accessing Kafka from clients on other machines:
>
> # ---
>   #
>
> # The config used here exposes port 9092 for _external_ connections to
> the broker
>
> # i.e. those from _outside_ the docker network. This could be from the
> host machine
>
> # running docker, or maybe further afield if you've got a more
> complicated setup.
>
> # If the latter is true, you will need to change the value 'localhost' in
>
> # KAFKA_ADVERTISED_LISTENERS to one that is resolvable to the docker
> host from those
>   # remote clients
>   #
>
> # For connections _internal_ to the docker network, such as from other
> services
>   # and components, use kafka:29092.
>   #
>   # See https://rmoff.net/2018/08/02/kafka-listeners-explained/
>  for details
>
> # "`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-
>   #
>   - 9092:9092
>   - "29092:29092"
> environment:
>   KAFKA_BROKER_ID: 1
>   KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
>   KAFKA_LISTENER_SECURITY_PROTOCOL_MAP:
> PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
>   KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
>   KAFKA_ADVERTISED_LISTENERS:
> PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
>   KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
>   KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
>   KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
>   KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
>
> # -v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v-v
>
> # Useful settings for development/laptop use - modify as needed for Prod
>
> # This one makes ksqlDB feel a bit more responsive when queries start
> running
>   KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 100
> command:
>   - bash
>   - -c
>   - |
> echo '127.0.0.1 kafka' >> /etc/hosts
> /etc/confluent/docker/run
> sleep infinity
>
>   schema-registry:
> image: confluentinc/cp-schema-registry:5.5.0
> container_name: schema-registry
> depends_on:
>   - zookeeper
>   - kafka
> links:
>   - zookeeper
>   - kafka
> ports:
>   - 8081:8081
> environment:
>   SCHEMA_REGISTRY_HOST_NAME: schema-registry
>   SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: zookeeper:2181
>
>   cassandra:
> image: cassandra:latest
> container_name: cassandra
> ports:
>   - 7000:7000
>
>   kafka-connect-01:
> image: confluentinc/cp-kafka-connect:5.5.0
> container_name: kafka-connect-01
> depends_on:
>   - kafka
>   - schema-registry
>   - cassandra
> links:
>   - schema-registry
>   - kafka
>   - mysql
>   - cassandra
> ports:
>   - 8083:8083
> environment:
>   CONNECT_BOOTSTRAP_SERVERS: "kafka:29092"
>   CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect-01"
>   CONNECT_REST_PORT: 8083
>   CONNECT_GROUP_ID: kafka-connect-01
>   CONNECT_CONFIG_STORAGE_TOPIC: _kafka-connect-01-configs
>   CONNECT_OFFSET_STORAGE_TOPIC: _kafka-connect-01-offsets
>   CONNECT_STATUS_STORAGE_TOPIC: _kafka-connect-01-status
>   CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
>   CONNECT_KEY_

Re: Cannot access to kafka by server domain and port.

2020-05-19 Thread Robin Moffatt
This should help your understanding:
https://rmoff.net/2018/08/02/kafka-listeners-explained/


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 19 May 2020 at 11:48, 深大李展发  wrote:

> Thank you, Ixy, thanks for your reply, it's working now!
> I will only need to try to set it up in Docker. May I ask why this is
> different? I had also tried "OUTSIDE://:9092" before, is there any
> difference between these config? 
> I am always confused by this config, I thought it only means Kafka will
> bind to 9092 port, and accept all connections. 
>
>
> -- Original Message --
> From: "lxy" Sent: Tuesday, 19 May 2020, 6:15 PM
> To: "users"
> Subject: Re:Cannot access to kafka by server domain and port.
>
>
>
>
>
>
> It seems that your OUTSIDE listener is wrong. "OUTSIDE://localhost:9092"
> means 127.0.0.1:9092. Try "OUTSIDE://0.0.0.0:9092"
>
>
>
>
>
>
>
>
>
>
> At 2020-05-19 17:57:41, "深大李展发" 
> Hi, I have been struggling for this connection problem for a whole week.
> I run Kafka  on my server machine which is on Cloud. And I cannot
> manage to connect to Kafka in anyway.
> It always print out `Connection to node -1
> (myServerDomain.ltd/myServerIp:9092) could not be established. Broker may
> not be available.`
>
>
>
>
> Here is what I had done to keep it simple so I can find out why:
> - I stopped using Docker.
> - I configure the Cloud firewall, so it will not block port 9092.
> - I start Kafka standalone(1 broker).
> - I start zookeeper standalone(1 node).
> - Zookeeper and Kafka use JAAS to connect.
> - I configure Kafka to log in TRACE level.
> - SASL and SSL is all turn off.
>
>
> I know advertised.listeners is usually the key of these situation, so here
> is what I had tried:
> SITUATION 1:
> - set advertised.listener to `localhost` on propose
> - use `kafka-topic --list --bootstrap-server localhost:9092`:
>     1. Kafka print out the metadata request
>     2. Kafka print out the returned metadata
>     3. Client connect to the advertised listener(which is
> localhost) successfully
>     4. Topics list is returned, client print out topic
> list. All well.
> - use `kafka-topic --list --bootstrap-server xxx.ltd:9092`
>     1. Kafka even not print out the first metadata request.
>     2. In client, it print out(Notice, it is node -1, not
> node 1): `Connection to node -1 (myServerDomain.ltd/myServerIp:9092) could
> not be established. Broker may not be available.`
>     3. Stop Kafka, start a WWW service on port 9092, can
> access the WWW service by port 9092.
>
>
> SITUATION 2:
> - set advertised.listener to `xxx.ltd`
> - use `kafka-topic --list --bootstrap-server localhost:9092`:
>     1. Kafka print out the metadata request
>     2. Kafka print out the returned metadata
>     3. Client try to connect to the advertised
> listener(which is xxx.ltd)
>     4. Failed, it print out(Notice,it is node 1, not node
> -1 like above, that means, client is try to connect by the metadata
> returned by kafka): `Connection to node 1
> (myServerDomain.ltd/myServerIp:9092) could not be established. Broker may
> not be available.`
> - use `kafka-topic --list --bootstrap-server xxx.ltd:9092`:
>     1. Kafka even not print out the first metadat request.
>     2. In client, it print out(Notice is node -1, not node
> 1): `Connection to node -1 (myServerDomain.ltd/myServerIp:9092) could not
> be established. Broker may not be available.`
>
>
> So, maybe, I think, maybe there is not any TCP request, all request to
> myServerDomain.ltd/myServerIp:9092 is blocked somehow. So I use `tcpdump -p
> 9092` to capture packets, here is what I get:
> -
> ...
> 2020-05-19 17:34:41.428139 IP 172.18.118.28.9092 > 61.140.182.143.5826:
> Flags [R.], seq 0, ack 4281665850, win 0, length 0
> 2020-05-19 17:34:41.842286 IP 61.140.182.143.5828 > 172.18.118.28.9092:
> Flags [S], seq 3141006320, win 64240, options [mss 1400,sackOK,TS val
> 1788286298 ecr 0,nop,wscale 1], length 0
> 2020-05-19 17:34:41.842360 IP 172.18.118.28.9092 > 61.140.182.143.5828:
> Flags [R.], seq 0, ack 3141006321, win 0, length 0
> 2020-05-19 17:34:42.657551 IP 61.140.182.143.5833 > 172.18.118.28.9092:
> Flags [S], seq 44626980, win 64240, options [mss 1400,sackOK,TS val
> 1788287114 ecr 0,nop,wscale 1], length 0
> 2020-05-19 17:34:42.657604 IP 172.18.118.28.9092 > 61.140.182.143.5833:
> Flags [R.], seq 0, ack 44626981, win 0, length
> ...
> 61.140.182.143 is my local laptop. It seems they were communicating.
>
>
> This is my `server.properties` config:
> --

Re: Problem in Docker

2020-05-15 Thread Robin Moffatt
Check this out: https://rmoff.dev/fix-jdbc-driver-video (specifically here
for Docker instructions https://www.youtube.com/watch?v=vI_L9irU9Pc&t=665s)
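
As a rough sketch of the approach in that video: put the driver JAR on the
host and mount it into the folder where the JDBC connector plugin lives
inside the container, then restart the container. The path below is where
the Confluent cp-kafka-connect image keeps the JDBC connector, and the JAR
name is just an example - adjust the service name, path and file to match
your setup:

  kafka-connect:
    volumes:
      # mount the MySQL JDBC driver into the JDBC connector's folder
      - ./mysql-connector-java-8.0.19.jar:/usr/share/java/kafka-connect-jdbc/mysql-connector-java-8.0.19.jar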


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 15 May 2020 at 08:16, vishnu murali 
wrote:

> Hi i am running cp-all-in one docker for the confluent kafka
>
> There i am trying that JDBCSourceConnector.
>
> it is showing this results..
> {
> "error_code": 400,
> "message":
> "Connector configuration is invalid and contains the following 2
> error(s):\nInvalid value java.sql.SQLException: No suitable driver
> found for jdbc:mysql://localhost:3306/sample for configuration
> Couldn't open connection to
> jdbc:mysql://localhost:3306/sample\nInvalid value
> java.sql.SQLException: No suitable driver found for
> jdbc:mysql://localhost:3306/sample for configuration Couldn't open
> connection to jdbc:mysql://localhost:3306/sample\nYou can also find
> the above list of errors at the endpoint
> `/connector-plugins/{connectorType}/config/validate`"
> }
>
>
> How can i add external jar file in that docker?
>
> Because i am new to docker and i am not even know it is possible or not!!!
>
> can anyone know thie solution?
>


Re: JDBC source connector

2020-05-14 Thread Robin Moffatt
If you just want it once then delete the connector once it's processed all
the data
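
For example, once the bulk load has finished you can drop the connector with
the Connect REST API - assuming the worker is listening on localhost:8083 and
the connector is called jdbc-source (substitute your own names):

  curl -X DELETE http://localhost:8083/connectors/jdbc-source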


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 14 May 2020 at 16:14, vishnu murali 
wrote:

> Thanks Liam
>
> But I am asking like assume  I am having 10.
>
> Using JDBC source I need to push that once..
>
> No more additional data will be added in future in that table.
>
> In that case i need to push that only once not more than one...
>
> For this scenario I am asking!!
>
> On Thu, May 14, 2020, 19:20 Liam Clarke-Hutchinson <
> liam.cla...@adscale.co.nz> wrote:
>
> > Why not use autoincrement? It'll only emit new records on subsequent
> polls
> > then.
> >
> > On Thu, 14 May 2020, 11:15 pm vishnu murali,  >
> > wrote:
> >
> > > Hi Guys,
> > >
> > > I am using the mode *bulk  *and poll.interval.ms *10* in the
> Source
> > > connector configuration.
> > >
> > > But I don't need to load data another time.?
> > >
> > > I need to load the data only once ??
> > >
> > > How can I able to do this ?
> > >
> >
>


Re: Jdbc Source Connector Config

2020-05-13 Thread Robin Moffatt
You don't have to use Single Message Transforms (which is what these are) at
all if you don't want to.
However, they do serve a useful purpose where you want to modify data as it
passes through Kafka Connect.

Ref:
-
https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-3/
-
https://www.confluent.io/blog/kafka-connect-deep-dive-jdbc-source-connector/
- http://rmoff.dev/ksldn19-kafka-connect
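
To illustrate what those two transforms actually do, here's a sketch with a
made-up record. ValueToKey copies the named field from the value into the
key (as a struct), and ExtractField$Key then unwraps that struct to leave
just the primitive value as the key:

  # as emitted by the source connector (no key)
  key   : null
  value : {"id": 42, "name": "foo"}

  # after transforms.createKey (ValueToKey, fields=id)
  key   : {"id": 42}
  value : {"id": 42, "name": "foo"}

  # after transforms.extractInt (ExtractField$Key, field=id)
  key   : 42
  value : {"id": 42, "name": "foo"}

Setting a key like this matters for partitioning, and for any sink that uses
the record key to identify the target row.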


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 13 May 2020 at 10:34, vishnu murali 
wrote:

> Hi Guys,
>
> i am having a question.
>
>
> "transforms":"createKey,extractInt",
>
>
> "transforms.createKey.type":"org.apache.kafka.connect.transforms.ValueToKey",
>  "transforms.createKey.fields":"id",
>
>
> "transforms.extractInt.type":"org.apache.kafka.connect.transforms.ExtractField$Key",
>  "transforms.extractInt.field":"id"
>
> what is the need of these configuration in JdbcSourceConnector?
> And without these can we use SourceConnector?
>


Re: JDBC SINK SCHEMA

2020-05-11 Thread Robin Moffatt
Schema Registry is available as part of the Confluent Platform download (
https://www.confluent.io/download/), and can be installed per
https://docs.confluent.io/current/schema-registry/installation/index.html
The difference is that you just run the Schema Registry part of the stack,
and leave the other components as is. When you configure Schema Registry
you point it at your existing Apache Kafka cluster (
https://docs.confluent.io/current/schema-registry/installation/config.html#schemaregistry-config
)
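
A minimal sketch of the Schema Registry configuration pointing at an
existing Apache Kafka cluster - the broker address is a placeholder for
your own:

  # etc/schema-registry/schema-registry.properties
  listeners=http://0.0.0.0:8081
  kafkastore.bootstrap.servers=PLAINTEXT://your-broker:9092
  kafkastore.topic=_schemas

  # start it with:
  bin/schema-registry-start etc/schema-registry/schema-registry.properties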


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Mon, 11 May 2020 at 10:02, vishnu murali 
wrote:

> Hi Robin
>
> Is it possible to integrate Apache Kafka with that confluent schema
> registry like u said ??
>
> I don't know how to do,can u able to give any reference?
>
> On Mon, May 11, 2020, 14:09 Robin Moffatt  wrote:
>
> > You can use Apache Kafka as you are currently using, and just deploy
> Schema
> > Registry alongside it.
> >
> >
> > --
> >
> > Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
> >
> >
> > On Sat, 9 May 2020 at 02:16, Chris Toomey  wrote:
> >
> > > You have to either 1) use one of the Confluent serializers
> > > <
> > >
> >
> https://docs.confluent.io/current/schema-registry/serdes-develop/index.html#
> > > >
> > > when you publish to the topic, so that the schema (or reference to it)
> is
> > > included, or 2) write and use a custom converter
> > > <
> > >
> >
> https://kafka.apache.org/25/javadoc/org/apache/kafka/connect/storage/Converter.html
> > > >
> > > that knows about the data schema and can take the kafka record value
> and
> > > convert it into a kafka connect record (by implementing the
> toConnectData
> > > converter method), which is what the sink connectors are driven from.
> > >
> > > See https://docs.confluent.io/current/connect/concepts.html#converters
> > >
> > > Chris
> > >
> > >
> > >
> > > On Fri, May 8, 2020 at 6:59 AM vishnu murali <
> vishnumurali9...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hey Guys,
> > > >
> > > > I am *using Apache **2.5 *not confluent.
> > > >
> > > > i am trying to send data from topic to database using jdbc sink
> > > connector.
> > > >
> > > > we need to send that data with the appropriate schema also.
> > > >
> > > > i am *not using confluent version* of kafka.
> > > >
> > > > so can anyone explain how can i do this ?
> > > >
> > >
> >
>


Re: Write to database directly by referencing schema registry, no jdbc sink connector

2020-05-11 Thread Robin Moffatt
>  wirite  to target database. I want to use self-written  java code
instead of kafka jdbc sink connector.

Out of interest, why do you want to do this? Why not use the JDBC sink
connector (or a fork of it if you need to amend its functionality)?



-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Sat, 9 May 2020 at 03:38, wangl...@geekplus.com.cn <
wangl...@geekplus.com.cn> wrote:

>
> Using debezium to parse binlog, using avro serialization and send to kafka.
>
> Need to consume the avro serialized  binlog data and  wirite  to target
> database
> I want to use self-written  java code instead of kafka jdbc sink
> connector.
>
> How can i  reference the schema registry, convert a kafka message to
> corresponding table record and write to corresponding table?
> Is there any example code to do this ?
>
> Thanks,
> Lei
>
>
>
> wangl...@geekplus.com.cn
>
>


Re: JDBC SINK SCHEMA

2020-05-11 Thread Robin Moffatt
You can use Apache Kafka as you are currently using, and just deploy Schema
Registry alongside it.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Sat, 9 May 2020 at 02:16, Chris Toomey  wrote:

> You have to either 1) use one of the Confluent serializers
> <
> https://docs.confluent.io/current/schema-registry/serdes-develop/index.html#
> >
> when you publish to the topic, so that the schema (or reference to it) is
> included, or 2) write and use a custom converter
> <
> https://kafka.apache.org/25/javadoc/org/apache/kafka/connect/storage/Converter.html
> >
> that knows about the data schema and can take the kafka record value and
> convert it into a kafka connect record (by implementing the toConnectData
> converter method), which is what the sink connectors are driven from.
>
> See https://docs.confluent.io/current/connect/concepts.html#converters
>
> Chris
>
>
>
> On Fri, May 8, 2020 at 6:59 AM vishnu murali 
> wrote:
>
> > Hey Guys,
> >
> > I am *using Apache **2.5 *not confluent.
> >
> > i am trying to send data from topic to database using jdbc sink
> connector.
> >
> > we need to send that data with the appropriate schema also.
> >
> > i am *not using confluent version* of kafka.
> >
> > so can anyone explain how can i do this ?
> >
>


Re: JDBC Sink Connector

2020-05-11 Thread Robin Moffatt
Schema Registry and its serde libraries are part of Confluent Platform,
licensed under Confluent Community Licence (
https://www.confluent.io/confluent-community-license-faq/)




-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 8 May 2020 at 13:39, vishnu murali 
wrote:

> Thank you so much Robin
>
> It helped me a lot to define sink connector with upsert mode and it is very
> helpful.
>
> For that schema related question i am not getting proper understanding.
>
> Because i am using Normal Apache kafka,i don't know whether those schema
> registry ,kql,avro serializers are present or not in Apache kafka (2.5)
>
> I Suppose these Schema Registry and ksql services  are coming in the
> confluent version of Kafka.
>
> On Thu, May 7, 2020 at 1:47 PM Robin Moffatt  wrote:
>
> > If you don't want to send the schema each time then serialise your data
> > using Avro (or Protobuf), and then the schema is held in the Schema
> > Registry. See https://www.youtube.com/watch?v=b-3qN_tlYR4&t=981s
> >
> > If you want to update a record instead of insert, you can use the upsert
> > mode. See https://www.youtube.com/watch?v=b-3qN_tlYR4&t=627s
> >
> >
> > --
> >
> > Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
> >
> >
> > On Thu, 7 May 2020 at 06:48, vishnu murali 
> > wrote:
> >
> > > Hey Guys,
> > >
> > > i am working on JDBC Sink Conneector to take data from kafka topic to
> > > mysql.
> > >
> > > i am having 2 questions.
> > >
> > > i am using normal Apache Kafka 2.5 not a confluent version.
> > >
> > > 1)For inserting data every time we need to add the schema data also
> with
> > > every data,How can i overcome this situation?i want to give only the
> > data.
> > >
> > > 2)In certain time i need to update the existing record without adding
> as
> > a
> > > new record.How can i achieve this?
> > >
> >
>


Re: JDBC Sink Connector

2020-05-07 Thread Robin Moffatt
If you don't want to send the schema each time then serialise your data
using Avro (or Protobuf), and then the schema is held in the Schema
Registry. See https://www.youtube.com/watch?v=b-3qN_tlYR4&t=981s

If you want to update a record instead of insert, you can use the upsert
mode. See https://www.youtube.com/watch?v=b-3qN_tlYR4&t=627s
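
For reference, a sketch of a JDBC sink configuration using upsert mode,
keyed on an id field in the message value - the connection details and names
here are placeholders, not something to copy verbatim:

  {
    "name": "jdbc-sink",
    "config": {
      "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
      "connection.url": "jdbc:mysql://mysql:3306/sample",
      "connection.user": "user",
      "connection.password": "password",
      "topics": "my-topic",
      "insert.mode": "upsert",
      "pk.mode": "record_value",
      "pk.fields": "id",
      "auto.create": "true"
    }
  }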


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 7 May 2020 at 06:48, vishnu murali 
wrote:

> Hey Guys,
>
> i am working on JDBC Sink Conneector to take data from kafka topic to
> mysql.
>
> i am having 2 questions.
>
> i am using normal Apache Kafka 2.5 not a confluent version.
>
> 1)For inserting data every time we need to add the schema data also with
> every data,How can i overcome this situation?i want to give only the data.
>
> 2)In certain time i need to update the existing record without adding as a
> new record.How can i achieve this?
>


Re: Connector For MirrorMaker

2020-05-01 Thread Robin Moffatt
Are you talking about MirrorMaker 2? That runs as a connector.

If not, perhaps you can clarify your question a bit as to what it is you
are looking for.
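
If MirrorMaker 2 is what you mean, a minimal sketch of running it as a
connector on an existing Kafka Connect cluster looks something like this
(the cluster aliases, bootstrap servers and topic pattern are placeholders):

  {
    "name": "mm2-source",
    "config": {
      "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
      "source.cluster.alias": "source",
      "target.cluster.alias": "target",
      "source.cluster.bootstrap.servers": "source-broker:9092",
      "target.cluster.bootstrap.servers": "target-broker:9092",
      "topics": ".*"
    }
  }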


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 1 May 2020 at 13:57, vishnu murali 
wrote:

> Hi Guys
>
> Previously I asked question about the Mirror maker and it is solved now.
>
> So Now I need to know is there any connectors available for that same.
>
> Like JdbcConnector acts as a source and sink for DB connection is there any
> connector available for performing mirror operations
>
>  or
>
> does some one having own created connectors for this purpose??
>


Re: JDBC source connector to Kafka topic

2020-04-29 Thread Robin Moffatt
How are you running Kafka? Do you mean when you shut it down you have to
reconfigure the connector?


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 29 Apr 2020 at 20:03, vishnu murali 
wrote:

> Hi guys
> I am trying that JDBC source connector to get data from MySQL and send as a
> data in a topic,so here what I am facing is there is more manual here
>
> After starting zookeeper,server, connect-distributed in Apache kafka
>
> I need to give Post request every time to the localhost:8083/connectors
> with the request body of config details when I need data and also all data
> will come again and again..
>
> Is there any way to achieve this CDC?
>


Re: Sybase Connection

2020-04-20 Thread Robin Moffatt
If Sybase supports JDBC then you can probably use the JDBC Source connector
for Kafka Connect.

Ref:
https://www.confluent.io/blog/kafka-connect-deep-dive-jdbc-source-connector/
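
As a rough, untested sketch - the connection URL format and column name are
assumptions (check your Sybase JDBC driver's documentation), and the driver
JAR needs to be available to the Connect worker:

  {
    "name": "sybase-source",
    "config": {
      "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
      "connection.url": "jdbc:sybase:Tds:sybase-host:5000/mydb",
      "connection.user": "user",
      "connection.password": "password",
      "mode": "incrementing",
      "incrementing.column.name": "id",
      "topic.prefix": "sybase-"
    }
  }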


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Mon, 20 Apr 2020 at 17:31, Jonatan Frank 
wrote:

> Hi,
>
>
>
> I  would like to know if Kafka can connect to a sybase Db in order to
> replicate data to another database.
>
>
>
> Thanks.
>
>
>
> [image: Frank-Saucedo]
>
>
>


Confluent Virtual User Group (VUG)

2020-04-17 Thread Robin Moffatt
We may not be able to go to meetups in person at the moment to learn and
chat with our friends in the community about Confluent Platform and Apache
Kafka, but Confluent Virtual User Group (VUG) is the next best thing 😀

Meetups are held online and across a variety of timezones so that everyone
gets a chance to join.

🗣️If you want to find out when the next ones are announced join here (
https://cnfl.io/confluent-vug) or check the #events channel on Confluent
Community Slack (http://cnfl.io/slack)

📆If you'd rather it directly to your inbox (well, calendar) there's an
iCal feed published here (
https://airtable.com/shrzKAKFMcfIgFVK9/iCal?timeZone=Europe%2FLondon&userLocale=en-gb)
or an online view here (https://cnfl.io/online-community-meetups)



-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


New blog post: A quick and dirty way to monitor data arriving on Kafka

2020-04-17 Thread Robin Moffatt
I thought I'd share here a short blog I wrote up recently:
https://rmoff.net/2020/04/16/a-quick-and-dirty-way-to-monitor-data-arriving-on-kafka/

Enjoy :)


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


Re: Adding new connector Jars dynamically in Kafka Connect

2020-04-13 Thread Robin Moffatt
No. You have to restart the worker.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 7 Apr 2020 at 18:24, nitin agarwal  wrote:

> Hi,
>
> I have a use case where new connectors will keep on adding to existing
> running Kafka Connect cluster. Is there any way in Kafka Connect to submit
> the new connector jar dynamically without restarting the Connect process?
>
> Thanks,
> Nitin
>


Re: How to disable OPTIONS method in confluent-control-center

2020-03-31 Thread Robin Moffatt
You probably want the Confluent Platform mailing list for this:
https://groups.google.com/forum/#!forum/confluent-platform (or Confluent
Platform slack group: http://cnfl.io/slack with the #control-center
channel). Or if you have a Confluent support contract, contact support :)


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 31 Mar 2020 at 07:01, Sunil CHAUDHARI
 wrote:

> Hi,
> I don't know whether this question is relevant to this group?
> Sorry If I posted in wrong group.
> I want to disable OPTIONS method in Confluent-control center running on
> port 9091.
> Can someone guide me for required configurations?
>
> Regards,
> Sunil.
>


Re: I'm trying to connect to a kafka broker running in AWS EKS (from outside the EKS cluster).

2020-03-19 Thread Robin Moffatt
You need to make sure you've configured your listeners & advertised
listeners correctly. This should help:
https://rmoff.net/2018/08/02/kafka-listeners-explained/
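
The general pattern is one listener for traffic inside the Kubernetes
cluster and another that advertises an address resolvable from outside it.
A sketch of the relevant broker settings (hostnames and ports are
placeholders for your own):

  listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
  listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:19092
  # clients outside EKS must be able to resolve and reach the EXTERNAL address
  advertised.listeners=INTERNAL://kafka-0.kafka-headless:9092,EXTERNAL://your-external-dns:19092
  inter.broker.listener.name=INTERNAL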


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 19 Mar 2020 at 01:49, Dan Hill  wrote:

> Problem: I'm hitting an error: "no such host" for "
> kafka-0.cluster.local:19092".
>
> Has anyone done this before?  Any help would be appreciated.  Thanks! - Dan
>
> My long-term goal is to get an AWS Lambda to send events to a Kafka running
> in AWS EKS.
>
> I used the following instructions
> <
> https://github.com/helm/charts/tree/master/incubator/kafka#connecting-to-kafka-from-outside-kubernetes
> >
> (linked to the "outside kubernetes" part) to setup up Kafka using the helm
> config.  The only modifications are for the "outside kubernetes" part.
> <
> https://github.com/helm/charts/tree/master/incubator/kafka#connecting-to-kafka-from-outside-kubernetes
> >
>
> I've tried a few variations.  None of them worked.  I still can't connect
> to it.
> - on an Lambda in the same subnet, on an EC2 machine in the same subnet, on
> a
> - with a couple different "outside kubernetes" options.
>
> E.g. if I setup external using LoadBalancer, I'll get something an External
> IP like (fake) afdsafsafsafdas-13412341.us-east-1.elb.amazon.com:19092.
>
> If I try to run a basic command against this domain, it fails saying there
> is no such host for kafka-0.cluster.local:19092.
>


Re: Connect - Membership Protocol

2020-03-18 Thread Robin Moffatt
Are the four brokers across two clusters (B1+B2) / (B3+B4), or one cluster?
If one cluster, are you using the same config/offset/status topics for each
Connect cluster? Because that definitely won't work.

In case it's useful:
https://rmoff.net/2019/11/22/common-mistakes-made-when-configuring-multiple-kafka-connect-workers/
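
If the four brokers are one Kafka cluster and you want two independent
Connect clusters on top of it, each worker config needs its own group.id and
its own set of internal topics - a sketch (the group and topic names are
illustrative):

  # workers in Connect cluster A
  group.id=connect-cluster-a
  config.storage.topic=_connect-a-configs
  offset.storage.topic=_connect-a-offsets
  status.storage.topic=_connect-a-status

  # workers in Connect cluster B
  group.id=connect-cluster-b
  config.storage.topic=_connect-b-configs
  offset.storage.topic=_connect-b-offsets
  status.storage.topic=_connect-b-status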


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 17 Mar 2020 at 17:58, mandeep gandhi 
wrote:

> Gentle reminder.
>
> + users.
>
> On Mon, Mar 16, 2020 at 11:27 PM mandeep gandhi 
> wrote:
>
> > Hi,
> >
> > (PS - I did not use users list as this concerns some internals)
> >
> > I was trying to deploy the following config on a K8s cluster  -
> >
> > Namespace -  X1 Connect group id -  G1Bootstrap servers - B1, B2
> > Namespace -  X2 Connect group id -  G1 Bootstrap servers - B3, B4
> >
> > With this configuration, I am seeing multiple rebalances (logs below). So
> > the question is how is the Membership protocol working?
> > I have tried to follow some KIPs and they mostly say that the Bootstrap
> > server is the coordinator of the group. If yes, then logically this
> > configuration should work just fine as both configs have different
> > bootstrap servers.  But as far as I have tried to understand the code
> (and
> > tried to run Kafka Connect), it seems like the members get added in the
> > group one by one and the head of the list becomes the group
> co-ordinator. (
> > if I change Connect Group id in 2nd config, things work)
> >
> > Also, I wanted some pointers on what is happening at the server-side vs
> > client-side during the protocol.
> >
> >
> > Kindly help.
> >
> > Logs -
> >
> > [2020-03-16 10:27:13,386] INFO [Worker clientId=connect-1,
> groupId=streaming-connect] Current config state offset 12 does not match
> group assignment 9164. Forcing rebalance.
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:942)
> > [2020-03-16 10:27:13,386] INFO [Worker clientId=connect-1,
> groupId=streaming-connect] Rebalance started
> (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator:213)
> > [2020-03-16 10:27:13,386] INFO [Worker clientId=connect-1,
> groupId=streaming-connect] (Re-)joining group
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:505)
> > [2020-03-16 10:27:13,388] INFO [Worker clientId=connect-1,
> groupId=streaming-connect] Successfully joined group with generation 18535
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:469)
> > [2020-03-16 10:27:13,389] INFO [Worker clientId=connect-1,
> groupId=streaming-connect] Joined group at generation 18535 and got
> assignment: Assignment{error=0,
> leader='connect-1-3580fdcc-257e-40e7-8243-f7839021599f', leaderUrl='
> http://10.13.191.32:8083/', offset=9164, connectorIds=[], taskIds=[],
> revokedConnectorIds=[], revokedTaskIds=[], delay=0}
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1397)
> > [2020-03-16 10:27:13,389] WARN [Worker clientId=connect-1,
> groupId=streaming-connect] Catching up to assignment's config offset.
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:909)
> > [2020-03-16 10:27:13,389] INFO [Worker clientId=connect-1,
> groupId=streaming-connect] Current config state offset 12 is behind group
> assignment 9164, reading to end of config log
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:970)
> >
> >
> >
> > Thanks,
> >
> > Mandeep Gandhi
> >
> >
> >
> >
>


Re: Some questions about copyright

2020-02-24 Thread Robin Moffatt
Hi Michael,

You can find the ASF trademark policies here:
https://www.apache.org/foundation/marks/#books


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Sun, 23 Feb 2020 at 19:47, Michi Bertschi 
wrote:

> Hello
> I'm Michael
>
>
> I am writing a thesis for the evaluation of Kafka the right one for our new
> project. I wanted to ask if I can use the Kafkalogo in my diploma thesis?
> If so, which regulations do you have what it should look like? I look
> forward to your answer Regards michael
>


Re: Kafka compatibility metrics

2020-02-18 Thread Robin Moffatt
Hi,

Can you be a bit more specific about what you're looking for?

thanks, Robin.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 18 Feb 2020 at 00:28, Ajith Kumar Rajendran  wrote:

> Hi team,
>
> Need a kafka compatibility metrics details
>
> *Thanks and regards,*
> *Ajith Kumar Rajendran*
> *+91 9080451861*
>


Re: connect to bitnami kafka cluster from djang app-engine

2020-02-07 Thread Robin Moffatt
The error you get puts you on the right lines:

>  Is your advertised.listeners (called advertised.host.name before Kafka
9) correct and resolvable?

This article explains why and how to fix it:
https://rmoff.net/2018/08/02/kafka-listeners-explained/


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 7 Feb 2020 at 13:21, Marcus Engene  wrote:

> Hi,
>
> I tried to use kafka-python 1.4.7 to connect to a bitnami kafka cluster
> using private ip the brokers.
>
> This works great from another Compute Instance.
>
> When i try the same code from django on app-engine (that is setup to be
> able to use stuff on Compute, f,ex some locally installed Redis), I get
> an error:
>
> ...
> 2020-02-07 10:38:30 api[20200207t095208]  WARNING:kafka.conn:DNS lookup
> failed for kafka-cluster-1-kafka-2:9092, exception was [Errno -2] Name
> or service not known. Is your advertised.listeners (called
> advertised.host.name before Kafka 9) correct and resolvable?
> 2020-02-07 10:38:30 api[20200207t095208]  ERROR:kafka.conn:DNS lookup
> failed for kafka-cluster-1-kafka-2:9092 (AddressFamily.AF_UNSPEC)
> 2020-02-07 10:38:30 api[20200207t095208]  WARNING:kafka.conn:DNS lookup
> failed for kafka-cluster-1-kafka-2:9092, exception was [Errno -2] Name
> or service not known. Is your advertised.listeners (called
> advertised.host.name before Kafka 9) correct and resolvable?
> 2020-02-07 10:38:30 api[20200207t095208]  ERROR:kafka.conn:DNS lookup
> failed for kafka-cluster-1-kafka-2:9092 (AddressFamily.AF_UNSPEC)
> ...
>
> Code run is very simple:
>
> // conf only has sasl params and brokers in the form of
> "ip:9092,ip2:9092,ip3:9092"
> cls.producer = KafkaProducer(**conf)
>
> future =cls.producer.send(topic=topic,
> value=bytes(value,encoding='utf8'),
> key=bytes(key,encoding='utf8'),
> headers=None,
> partition=None,
> timestamp_ms=None)
> future.get(timeout=10)  # <- in a try/catch
>
> Where does client side even get to hear about kafka-cluster-1-kafka-2? I
> thought client connects to any of the brokers and the will take it from
> there? Compute Engine DNS resolves kafka-cluster-1-kafka-2
> magnificently, but not in app-engine. Should connecting side know more
> about Kafka rig than the brokers?
> Best regards, Marcus
>
>


Re: User interface for kafka: Management/administration

2020-01-23 Thread Robin Moffatt
There's a good presentation from Stephane Maarek that covers tooling,
including UIs:
https://www.confluent.io/kafka-summit-lon19/show-me-kafka-tools-increase-productivity/

You'll find some projects that need to be built also ship with Docker
images that you can then just run.

Here's a useful list of projects to look at:
https://dev.to/guyshiloizy/kafka-administration-and-monitoring-ui-tools-hf4

-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 23 Jan 2020 at 04:10, Sunil CHAUDHARI
 wrote:

> Hi all,
> Please help me to get some user interface for management and
> administration of my kafka cluster.
> There are some open source available, but they either have some
> dependencies or those need to be built before running.
> Is there any pre-build(ready to use package) which I can just download and
> run?
> Our environment have many restrictions, so its difficult to
> download/install dependencies.
>
> I hope you guys understood my problem.
>
> Regards,
> Sunil.
>
> CONFIDENTIAL NOTE:
> The information contained in this email is intended only for the use of
> the individual or entity named above and may contain information that is
> privileged, confidential and exempt from disclosure under applicable law.
> If the reader of this message is not the intended recipient, you are hereby
> notified that any dissemination, distribution or copying of this
> communication is strictly prohibited. If you have received this message in
> error, please immediately notify the sender and delete the mail. Thank you.
>


Re: Stand alone connector SW deployment on a web server

2020-01-16 Thread Robin Moffatt
Not sure what you mean by "SQ", but Kafka Connect workers can indeed be
deployed on their own, connecting to a remote Kafka cluster.
Standalone would make sense in that case, yes.
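
A sketch of what that looks like on the web/JBOSS host - a worker properties
file pointing at the remote brokers, plus the connector config passed on the
command line (hostnames and file names are placeholders):

  # worker.properties
  bootstrap.servers=remote-broker1:9092,remote-broker2:9092
  key.converter=org.apache.kafka.connect.json.JsonConverter
  value.converter=org.apache.kafka.connect.json.JsonConverter
  offset.storage.file.filename=/tmp/connect.offsets

  # start the standalone worker
  bin/connect-standalone.sh worker.properties my-connector.properties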


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 16 Jan 2020 at 03:42, George  wrote:

> Hi all
>
> Is it possible to deploy just the connector/worker stack SQ onto say a Web
> server. or a JBOSS server.
>
> Guessing the connector is then configured into standalone mode ?
>
> G
>
> --
> You have the obligation to inform one honestly of the risk, and as a person
> you are committed to educate yourself to the total risk in any activity!
>
> Once informed & totally aware of the risk,
> every fool has the right to kill or injure themselves as they see fit!
>


Re: ingesting web server logs, or log4j log files from a boss server

2020-01-15 Thread Robin Moffatt
If spooldir doesn't suit, there's also
https://github.com/streamthoughts/kafka-connect-file-pulse to check out.
Also bear in mind tools like filebeat from Elastic support Kafka as a
target.
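
For example, a minimal Filebeat sketch that ships an Apache access log
straight to a Kafka topic - the paths, broker and topic name are
placeholders, and the exact syntax may vary by Filebeat version:

  # filebeat.yml
  filebeat.inputs:
    - type: log
      paths:
        - /var/log/httpd/access_log

  output.kafka:
    hosts: ["broker1:9092"]
    topic: "http_logs"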


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 15 Jan 2020 at 12:48, George  wrote:

> Hi Tom
>
> will do. for now I have 4 specific file types I need to ingest.
>
> 1. reading apache web server log files, http.log's.
> 2. reading in our custom log files
> 3. reading in log4j log files
> 4. mysql connection as a source
> 5. cassandra connection, as a sink
>
> I can not use NFS mounting the source file system to the Connect cluster,
> we don't allow NFS.
>
> I'm hoping to pull #1-#3 in as each line a the value field of a JSON
> message, then maybe use stream process, or kSQL to unpack into a 2nd
> message which can then be consumed, analysed etc.
>
> bit amazed there is not a predefined connector for http logs files though
>
> G
>
>
> On Wed, Jan 15, 2020 at 12:32 PM Tom Bentley  wrote:
>
> > Hi George,
> >
> > Since you mentioned CDC specifically you might want to check out
> Debezium (
> > https://debezium.io/) which operates as a connector of the sort Robin
> > referred to and does CDC for MySQL and others.
> >
> > Cheers,
> >
> > Tom
> >
> > On Wed, Jan 15, 2020 at 10:18 AM Robin Moffatt 
> wrote:
> >
> > > The integration part of Apache Kafka that you're talking about is
> > > called Kafka Connect. Kafka Connect runs as its own process, known as
> > > a Kafka Connect Worker, either on its own or as part of a cluster.
> Kafka
> > > Connect will usually be deployed on a separate instance from the Kafka
> > > brokers.
> > >
> > > Kafka Connect connectors will usually connect to the external system
> over
> > > the network if that makes sense (e.g. a database) but not always (e.g.
> if
> > > its acting as a syslog endpoint, or maybe processing local files).
> > >
> > > You can learn more about Kafka Connect and its deployment model here:
> > > https://rmoff.dev/crunch19-zero-to-hero-kafka-connect
> > >
> > >
> > > --
> > >
> > > Robin Moffatt | Senior Developer Advocate | ro...@confluent.io |
> @rmoff
> > >
> > >
> > > On Wed, 15 Jan 2020 at 03:43, George  wrote:
> > >
> > > > Hi all.
> > > >
> > > > Please advise, a real noob here still, unpacking how the stack still
> > > > works...
> > > >
> > > > if I have a mySQL server, or a web server, or a 2 node JBOSS cluster.
> > > >
> > > > If I want to use the mysql connector to connect to the MySQL DB to
> pull
> > > > data using CDC... then I need to install the Kafka stack on the DB
> > > server,
> > > > I understand that this will be a stand alone install, assume with no
> > > > zookeeper involved.
> > > >
> > > > Similarly for the apache web server and the 2 JBOSS servers
> > > >
> > > > G
> > > >
> > > > --
> > > > You have the obligation to inform one honestly of the risk, and as a
> > > person
> > > > you are committed to educate yourself to the total risk in any
> > activity!
> > > >
> > > > Once informed & totally aware of the risk,
> > > > every fool has the right to kill or injure themselves as they see
> fit!
> > > >
> > >
> >
>
>
> --
> You have the obligation to inform one honestly of the risk, and as a person
> you are committed to educate yourself to the total risk in any activity!
>
> Once informed & totally aware of the risk,
> every fool has the right to kill or injure themselves as they see fit!
>


Re: Any available cassandra source connector for kafka

2020-01-15 Thread Robin Moffatt
Pretty sure WePay are working on this:
https://github.com/debezium/debezium-incubator


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 15 Jan 2020 at 13:55, Sachin Mittal  wrote:

> Hi,
> I was looking for any cassandra source connector to read data from
> cassandra column family and insert the data into kafka topic.
>
> Are you folks aware of any community supported version of such a tool.
>
> I found one such:
> https://docs.lenses.io/connectors/source/cassandra.html
>
> However I am not aware of its source code and license policy.
>
> Please let me know if something equivalent exist which can be used out of
> box without much changes.
>
> Thanks
> Sachin
>


Re: ingesting web server logs, or log4j log files from a boss server

2020-01-15 Thread Robin Moffatt
The integration part of Apache Kafka that you're talking about is
called Kafka Connect. Kafka Connect runs as its own process, known as
a Kafka Connect Worker, either on its own or as part of a cluster. Kafka
Connect will usually be deployed on a separate instance from the Kafka
brokers.

Kafka Connect connectors will usually connect to the external system over
the network if that makes sense (e.g. a database) but not always (e.g. if
its acting as a syslog endpoint, or maybe processing local files).

You can learn more about Kafka Connect and its deployment model here:
https://rmoff.dev/crunch19-zero-to-hero-kafka-connect


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 15 Jan 2020 at 03:43, George  wrote:

> Hi all.
>
> Please advise, a real noob here still, unpacking how the stack still
> works...
>
> if I have a mySQL server, or a web server, or a 2 node JBOSS cluster.
>
> If I want to use the mysql connector to connect to the MySQL DB to pull
> data using CDC... then I need to install the Kafka stack on the DB server,
> I understand that this will be a stand alone install, assume with no
> zookeeper involved.
>
> Similarly for the apache web server and the 2 JBOSS servers
>
> G
>
> --
> You have the obligation to inform one honestly of the risk, and as a person
> you are committed to educate yourself to the total risk in any activity!
>
> Once informed & totally aware of the risk,
> every fool has the right to kill or injure themselves as they see fit!
>


Re: Public IPs for brokers

2020-01-15 Thread Robin Moffatt
If you use a load balancer bear in mind the importance of the
advertised.listeners setting:
https://rmoff.net/2018/08/02/kafka-listeners-explained/


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 15 Jan 2020 at 08:15, Himanshu Shukla 
wrote:

> is it recommended to use ALB(running all 3 nodes on ec2 instances) or
> something similar?
>
> Thanks and Regards,
> Himanshu Shukla
>
>
>
>
> On Wed, Jan 15, 2020 at 12:52 PM Tom Bentley  wrote:
>
> > Hi Himanshu,
> >
> > Short answer: yes.
> >
> > The way a Kafka client works is to connect to one of the given bootstrap
> > brokers and ask it about the rest of the cluster. The client then
> connects
> > to those brokers as necessary. So in general a client will need to
> connect
> > to all brokers in the cluster and therefore the broker IPs need to be
> > routable from the client.
> >
> > Cheers,
> >
> > Tom
> >
> > On Wed, Jan 15, 2020 at 5:10 AM Himanshu Shukla <
> > himanshushukla...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > do I have to necessarily have public IPs for all the brokers? Since we
> > can
> > > give a few of the IPs in bootstrap.servers config.
> > >
> >
>


Re: Free Kafka Stream Data

2020-01-08 Thread Robin Moffatt
Hi,

This blog shows using Twitter as a source for Kafka:
https://www.confluent.io/blog/stream-processing-twitter-data-with-ksqldb/
The credentials necessary for that Twitter API are created instantly AFAIK.

If you just want messages to be generated in Kafka have a look at
https://www.confluent.io/blog/easy-ways-generate-test-data-kafka/


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 8 Jan 2020 at 20:42, cool girl  wrote:

> Hi ,
>
> I am trying to learn Kafka. Is there any free API which I can use like
> Twitter? I created a Twitter account but it looks like it will take days before
> I can use their streaming data.
>
> Thanks
> Priyanka
>


Re: Which container should you use when deploying on docker ?

2019-12-16 Thread Robin Moffatt
There are various Kafka images available, including:

https://hub.docker.com/r/confluentinc/cp-kafka/
https://hub.docker.com/r/wurstmeister/kafka/

I'm not 100% clear what your doubt is: whether these are legitimate Kafka
images, or something else?


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Sat, 14 Dec 2019 at 04:18, Yu Watanabe  wrote:

> Hello.
>
> I would like to ask question related to kafka on docker engine.
> Which container should you use for kafka when deploying on docker in
> production ?
>
> When I look in docker hub , I do not see neither of below tagged for kafka
> container .
>
> Docker certified
> Verified publisher
> Official Images
>
> Repository "confluent" seems be the closest one since its the creator of
> kafka but it does not have above tag .
>
> Thanks,
> Yu Watanabe
>
> --
> Yu Watanabe
> Weekend Freelancer who loves to challenge building data platform
> yu.w.ten...@gmail.com
> [image: LinkedIn icon] <https://www.linkedin.com/in/yuwatanabe1>  [image:
> Twitter icon] <https://twitter.com/yuwtennis>
>


Re: Powered By: add BookingSync and Smily

2019-12-06 Thread Robin Moffatt
Hi Karol,

Thanks for submitting these - I'll arrange for a PR to be created to add
them to the page.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 6 Dec 2019 at 08:50, Karol Galanciak  wrote:

> Hi,
>
> Could you please add BookingSync (https://www.bookingsync.com <
> https://www.bookingsync.com/>) and Smily (https://www.smily.com <
> https://www.smily.com/>) to Powered By section?. The description for both:
>
> BookingSync: Apache Kafka is used as a backbone of data synchronization
> and propagating changes of entitites/aggregates between multiple
> applications in BookingSync ecosystem.
> Smily: Apache Kafka is used as a backbone of data synchronization and
> propagating changes entitites/aggregates between multiple applications in
> Smily ecosystem.
>
> The logos you could use:
> BookingSync:
> https://cdn.bookingsync.io/libs/clearbox/1.2.11/assets/images/bookingsync.svg
> <
> https://cdn.bookingsync.io/libs/clearbox/1.2.11/assets/images/bookingsync.svg
> >
> Smily:
> https://assets.website-files.com/5c8ca7584e2db8cfcdaee1de/5c8ca7e22f2b6792fdb8af92_smily-logo.svg
> <
> https://assets.website-files.com/5c8ca7584e2db8cfcdaee1de/5c8ca7e22f2b6792fdb8af92_smily-logo.svg
> >
>
>
> Thanks,
> --
> Karol Galanciak
> CTO
> www.BookingSync.com <https://www.bookingsync.com/> - Vacation Rental
> Software
> twitter: twitter.com/BookingSync <https://twitter.com/BookingSync> - fb:
> fb.com/BookingSync <https://www.facebook.com/bookingsync/> - video: in
> English <https://www.youtube.com/watch?v=VKVKBNrH3No>, en Français <
> https://www.youtube.com/watch?v=kV9MTFVTh3I> (1:46min)


Re: kafka connect incoming port

2019-11-26 Thread Robin Moffatt
Kafka Connect workers listen on the port defined in *rest.port* in the
worker configuration - typically 8083.
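
For example - assuming the default out-of-the-box settings rather than your
specific worker config - the relevant worker properties are:

rest.port=8083
# or, equivalently, the newer listeners form:
listeners=http://0.0.0.0:8083

so the inbound firewall exception for the Connect REST API would be TCP 8083
(or whatever you have overridden it to). The connection from the JMS source
connector to IBM MQ itself is made outbound from the Connect worker, on MQ's
own port.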


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Mon, 25 Nov 2019 at 18:41, Johnson, Paul 
wrote:

> Hello,
>
> I am working on setting up a Kafka Connect JMS Source to pull data from an
> external IBM MQ and push to an internal Kafka topic.  We will have a
> firewall between the external MQ and the internal Connect so I need to do a
> firewall exception.  The network people are asking for the incoming port
> for Kafka Connect but I'm not able to find any information.  From what I've
> read it seems like it's done over HTTP (so port 80).  I also see port 49333
> being used by my local Connect.
>
> Is there a port number for incoming traffic to Connect?
>
> Thank you,
> Paul Johnson
>


Re: help for regarding my question

2019-11-15 Thread Robin Moffatt
We can try, but you'll have to tell us what the problem is :)
This is a good checklist for asking a good question:
http://catb.org/~esr/faqs/smart-questions.html#beprecise


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 14 Nov 2019 at 05:04, prashanth sri 
wrote:

> i am new to kafka , can any one help me , regarding my problem.
>


Re: Install kafka-connect-storage-cloud

2019-11-12 Thread Robin Moffatt
Hi Miguel!

If you're using Kafka Connect in standalone mode then you need to pass it a
.properties (key=value) file, not JSON.
JSON is used if you are running Kafka Connect in distributed mode (which
I personally advocate, even on a single node); in that mode you pass the
JSON to the REST API after starting the worker.

See https://rmoff.dev/berlin19-kafka-connect for examples and discussion
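
For illustration, the same connector expressed as a key=value file for
standalone mode would look something like this (just a fragment, using the
values visible in your JSON):

name=meetups-to-s3
connector.class=io.confluent.connect.s3.S3SinkConnector
tasks.max=1
topics=test
s3.bucket.name=test-connector
s3.region=eu-west-1
flush.size=10
storage.class=io.confluent.connect.s3.storage.S3Storage
format.class=io.confluent.connect.s3.format.json.JsonFormat
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false

Alternatively, start a distributed worker (connect-distributed instead of
connect-standalone) and submit your existing JSON to its REST API, e.g.:

curl -X POST -H "Content-Type: application/json" \
  --data @/Users/miguel.silvestre/meetups-to-s3.json \
  http://localhost:8083/connectors

(that assumes the JSON file is shaped as {"name": ..., "config": {...}}).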


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 12 Nov 2019 at 16:40, Miguel Silvestre  wrote:

> Hi,
>
> I'm new to kafka (really newbie) and I'm trying to set this connector on my
> local machine which is a macOS Mojava 10.14.6.
>
> I've downloaded the connector and put the contents on folder:
> /usr/local/share/kafka/plugins
> and update the plugin.path on file
> /usr/local/etc/kafka/connect-standalone.properties to:
> /usr/local/share/kafka/plugins
>
> I'm launching the connector like this:
> /usr/local/Cellar/kafka/2.3.1/bin/connect-standalone
> /usr/local/etc/kafka/connect-standalone.properties
> /Users/miguel.silvestre/meetups-to-s3.json
>
> However I'm always getting the error bellow.
> Any idea on what am I doing wrong?
>
> Thank you
> Miguel Silvestre
>
> PS. I need a sink connector that reads json from kafka topics and writes to
> s3 on parquet files. I need to read several topics and the files are going
> to the same bucket on different paths. Do you anything that can do the
> task? It seems that secor is having building issues right now.
>
>
>
> [2019-11-12 16:24:19,322] INFO Kafka Connect started
> (org.apache.kafka.connect.runtime.Connect:56)
> [2019-11-12 16:24:19,325] ERROR Failed to create job for
> /Users/miguel.silvestre/meetups-to-s3.json
> (org.apache.kafka.connect.cli.ConnectStandalone:110)
> [2019-11-12 16:24:19,326] ERROR Stopping after connector error
> (org.apache.kafka.connect.cli.ConnectStandalone:121)
> java.util.concurrent.ExecutionException:
> org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector
> config {"s3.part.size"="5242880",, "partition.duration.ms"="360",, "
> s3.bucket.name"="test-connector",,
> "value.converter.schemas.enable"="false",, "timezone"="UTC",, },=,
>
> "partitioner.class"="io.confluent.connect.storage.partitioner.TimeBasedPartitioner",,
> "path.format"="'date'=-MM-dd/'hour'=HH",, "rotate.interval.ms
> "="6",,
> "name"="meetups-to-s3",, "flush.size"="10",,
> "key.converter.schemas.enable"="false",,
> "value.converter"="org.apache.kafka.connect.json.JsonConverter",,
> "topics"="test",, "tasks"=[], "config"={,
> "connector.class"="io.confluent.connect.s3.S3SinkConnector",,
> "format.class"="io.confluent.connect.s3.format.json.JsonFormat",,
> "tasks.max"="1",, "s3.region"="eu-west-1",,
> "key.converter"="org.apache.kafka.connect.json.JsonConverter",,
> "timestamp.extractor"="Record", "locale"="en",,
> "schema.compatibility"="NONE",, {=,
> "storage.class"="io.confluent.connect.s3.storage.S3Storage",, }=} contains
> no connector type
> at
>
> org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:79)
> at
>
> org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:66)
> at
>
> org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:118)
> Caused by:
> org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector
> config {"s3.part.size"="5242880",, "partition.duration.ms"="360",, "
> s3.bucket.name"="test-connector",,
> "value.converter.schemas.enable"="false",, "timezone"="UTC",, },=,
>
> "partitioner.class"="io.confluent.connect.storage.partitioner.TimeBasedPartitioner",,
> "path.format"="'date'=-MM-dd/'hour'=HH",, "rotate.interval.ms
> "="6",,
> "name"="meetups-to-s3",, "flush.size"="10",,
> "key.converter.schemas.enable"="false",,
> "value.converter"="org.apache.kafka.connect.json.JsonConverter",,
> "topics"="test",, "tasks"=[], "config"={,
> "connector.class"="io.confluent.connect.s3.S3SinkConnector",,
> "format.class"="io.confluent.connect.s3.format.json.JsonFormat",,
> "tasks.max"="1",, "s3.region"="eu-west-1",,
> "key.converter"="org.apache.kafka.connect.json.JsonConverter",,
> "timestamp.extractor"="Record", "locale"="en",,
> "schema.compatibility"="NONE",, {=,
> "storage.class"="io.confluent.connect.s3.storage.S3Storage",, }=} contains
> no connector type
> at
>
> org.apache.kafka.connect.runtime.AbstractHerder.validateConnectorConfig(AbstractHerder.java:287)
> at
>
> org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(StandaloneHerder.java:192)
> at
>
> org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:115)
> [2019-11-12 16:24:19,329] INFO Kafka Connect stopping
> (org.apache.kafka.connect.runtime.Connect:66)
> --
> Miguel Silvestre
>


Re: Monitor Kafka connect jobs

2019-10-25 Thread Robin Moffatt
You can use the REST API to monitor the status of connectors and their
tasks. As of AK 2.3 it's been improved so you can poll
http://localhost:8083/connectors?expand=info&expand=status and use the
returned data to understand the status. You can parse it and do something
like this:

curl -s "http://localhost:8083/connectors?expand=info&expand=status"; | \
 jq '. | to_entries[] | [ .value.info.type, .key,
.value.status.connector.state,.value.status.tasks[].state,.value.info.config."connector.class"]|join(":|:")'
| \
 column -s : -t| sed 's/\"//g'| sort

Which returns:

source  |  source-debezium-mysql-02  |  RUNNING  |  RUNNING  |
 io.debezium.connector.mysql.MySqlConnector
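
If you want to hook that into alerting, a rough sketch is to flag anything
that isn't RUNNING and pipe it into whatever email/paging tool you use
(adjust the URL, and note this will also flag PAUSED connectors):

curl -s "http://localhost:8083/connectors?expand=status" | \
  jq -r 'to_entries[] | select([.value.status.connector.state, .value.status.tasks[].state] | any(. != "RUNNING")) | .key' | \
  while read -r connector; do
    echo "ALERT: connector $connector is not healthy"
  done

Run that from cron, or feed the same REST data into your existing monitoring
system.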


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 24 Oct 2019 at 15:52, KhajaAsmath Mohammed 
wrote:

> Hi,
>
>
> We are using kafka connect in production and I have few questions about it.
> when we submit kafka connect job using rest api . Job gets continously
> running in the background and due to some issues, lets say we restarted
> kafka cluster. do we need to start manually all the jobs again?
>
>
> Is there a way to monitor these jobs using tools. I know we can use connect
> UI but in case if we have more than 1000 jobs it would become more complex.
>
>
> I am also looking for trigger mechanism to send out email or alert support
> team if the connector was killed due to some reasons.
>
>
> Thanks,
>
> Asmath
>


Re: Ksql with schema registry

2019-10-07 Thread Robin Moffatt
You can specify VALUE_FORMAT='AVRO' to use Avro serialisation and
the Schema Registry. See docs for more details
https://docs.confluent.io/current/ksql/docs/installation/server-config/avro-schema.html
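
As a minimal sketch - assuming the Schema Registry is at its default local
address - you point the KSQL server at it in ksql-server.properties:

ksql.schema.registry.url=http://localhost:8081

and then any stream you derive with Avro gets its schema registered
automatically, e.g. (stream names here are placeholders):

CREATE STREAM mystream_avro WITH (VALUE_FORMAT='AVRO') AS
  SELECT * FROM mystream_json;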


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Sat, 5 Oct 2019 at 20:37, KhajaAsmath Mohammed 
wrote:

> Hi,
>
> What is the configuration that I need to follow to register schema
> registry on ksql client installed in my machine that connects to cluster.
>
> Creating stream with select should create schema automatically in schema
> registry .
>
> Thanks,
> Asmath
>
> Sent from my iPhone


Re: Kafka Connect issues when running in Docker

2019-07-19 Thread Robin Moffatt
Do you get any output from *docker logs*? And does it work if you don't use
authentication?
How about if you try one of the dockerised Kafka Connect examples here?
https://github.com/confluentinc/demo-scene/tree/master/kafka-connect-zero-to-hero


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 19 Jul 2019 at 10:53, Aleksandar Irikov  wrote:

> Hey everyone,
>
> I am having some issues when running Kafka Connect inside of Docker, so
> would really appreciate some feedback on what I'm doing wrong.
>
> I'm able to run locally (by executing `connect-distributed
> config.properties`. However, when running in Docker and passing the same
> configuration as environment variables, after Kafka Connect has started and
> starts assigning tasks to workers, the container just exits without any
> error.
>
> Local setup:
>  - I have Kafka installed locally, so I'm using the `connect-distributed`
> script along with my properties file, which are listed below. This seems to
> work without issues
>
> # Kafka broker IP addresses to connect to
> bootstrap.servers=kafka:9092
> ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
> ssl.truststore.location=truststore.jks
> ssl.truststore.password=some-password
> ssl.protocol=TLS
> security.protocol=SASL_SSL
> ssl.endpoint.identification.algorithm=
> sasl.mechanism=SCRAM-SHA-256
> sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule
> required \
> username="kafka_username" \
> password="kafka_password";
> consumer.ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
> consumer.ssl.truststore.location=truststore.jks
> consumer.ssl.truststore.password=some-password
> consumer.ssl.protocol=TLS
> consumer.security.protocol=SASL_SSL
> consumer.ssl.endpoint.identification.algorithm=
> consumer.sasl.mechanism=SCRAM-SHA-256
> # CONFIGURE USER AND PASSWORD!
>
> consumer.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule
> required \
>
> username="kafka_username" \
> password="kafka_password";
>
>
> group.id=my-group
> config.storage.topic=kconnect.config
> offset.storage.topic=kconnect.offsets
> status.storage.topic=kconnect.status
>
>
> # Path to directory containing the connect jar and dependencies
> plugin.path=path/to/my/plugins/
> # Converters to use to convert keys and values
> key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
> value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
>
> # The internal converters Kafka Connect uses for storing offset and
> configuration data
> internal.key.converter=org.apache.kafka.connect.json.JsonConverter
> internal.value.converter=org.apache.kafka.connect.json.JsonConverter
> internal.key.converter.schemas.enable=false
> internal.value.converter.schemas.enable=false
> offset.storage.file.filename=/tmp/kconnect.offsets
> rest.port=
>
>
> Dockerized setup:
>  - I'm using the Confluent Kafka Connect Base Image (I've also tried with
> the "fat" Confluent image), to which I'm passing my configuration as
> environment variables when I run the container.
> Dockerfile:
>
> FROM confluentinc/cp-kafka-connect-base
>
> COPY file-sink.txt /output/file-sink.txt
>
> COPY truststore.jks /trustores/
>
>
> Docker run with environs:
>
> docker run --rm --name kconnect -p 0.0.0.0:: \
> -e CONNECT_BOOTSTRAP_SERVERS=kafka:9092 \
> -e CONNECT_SSL_PROTOCOL=TLS \
> -e CONNECT_ENABLED_PROTOCOLS=TLSv1.2,TLSv1.1,TLSv1 \
> -e CONNECT_SSL_TRUSTSTORE_LOCATION=/truststores/truststore.jks \
> -e CONNECT_SSL_TRUSTSTORE_PASSWORD=instaclustr \
> -e CONNECT_SECURITY_PROTOCOL=SASL_SSL \
> -e
> CONNECT_SASL_JAAS_CONFIG='org.apache.kafka.common.security.scram.ScramLoginModule
> required username="KAFKA_USERNAME" password="KAFKA_PASSWORD";' \
> -e CONNECT_SASL_MECHANISM=SCRAM-SHA-256 \
> -e CONNECT_GROUP_ID=my-group \
> -e CONNECT_CONFIG_STORAGE_TOPIC="kconnect.config" \
> -e CONNECT_OFFSET_STORAGE_TOPIC="kconnect.offsets" \
> -e CONNECT_STATUS_STORAGE_TOPIC="kconnect.status" \
> -e
> CONNECT_KEY_CONVERTER="org.apache.kafka.connect.converters.ByteArrayConverter"
> \
> -e
> CONNECT_VALUE_CONVERTER="org.apache.kafka.connect.converters.ByteArrayConverter"
> \
> -e
> CONNECT_INTERNAL_KEY_CONVERTER="org.apache.kafka.connect.json.JsonConverter"
> \
> -e
> CONNECT_INTERNAL_VALUE_CONVERTER="org.apache.kafka.connect.json.JsonConverter"
> \
> -e CONNECT_REST_ADVERTISED_HOST_NAME=localhost \

Re: quick start does not work...

2019-05-23 Thread Robin Moffatt
Hi Tom,

Can you link to which quickstart instructions you're following?

thanks.


-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 23 May 2019 at 07:35, Tom Kwong  wrote:

> Hi,
> I’m getting the following error (NoNode for /brokers) on a fresh new
> install of kafka.  There are more similar errors after that.  Is there any
> specific requirements to run Kafka?
>
> [2019-05-23 06:13:58,199] INFO Accepted socket connection from /
> 127.0.0.1:47078 (org.apache.zookeeper.server.NIOServerCnxnFactory)
> [2019-05-23 06:13:58,210] INFO Client attempting to establish new session
> at /127.0.0.1:47078 (org.apache.zookeeper.server.ZooKeeperServer)
> [2019-05-23 06:13:58,213] INFO Creating new log file: log.1
> (org.apache.zookeeper.server.persistence.FileTxnLog)
> [2019-05-23 06:13:58,228] INFO Established session 0x1025f9887e0 with
> negotiated timeout 6000 for client /127.0.0.1:47078
> (org.apache.zookeeper.server.ZooKeeperServer)
> [2019-05-23 06:13:58,322] INFO Got user-level KeeperException when
> processing sessionid:0x1025f9887e0 type:create cxid:0x2 zxid:0x3
> txntype:-1 reqpath:n/a Error Path:/brokers Error:KeeperErrorCode = NoNode
> for /brokers (org.apache.zookeeper.server.PrepRequestProcessor)
>
> I’m on AWS Linux
> Linux ip-172-31-4-54 4.14.88-72.76.amzn1.x86_64 #1 SMP Mon Jan 7 19:47:07
> UTC 2019 x86_64 x86_64 x86_64 GNU/Linux


Re: Kafka Connect - HDFS or FileStream

2019-05-12 Thread Robin Moffatt
Can you explain more about why you're writing a file with the data?
Presumably, this is for another application to consume; could it not take
the data from Kafka directly, whether with a native client or over the REST
proxy?
Oftentimes local files are unnecessary 'duct tape' for integration that can
be done in a better way.


-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


On Mon, 13 May 2019 at 01:35, Vinay Jain  wrote:

> Hi
>
> I would like to consume a Topic and save the AVRO messages in local
> directory in the AVRO file formats. As per Kafka Connect File Stream
> documentation, it is not for production use.
>
> Other Option I am thinking to use Kafka Connect - HDFS Sink but I am not
> sure if it can also write to the Local directory if we pass in the variable
> hdfs.url the URL for local file system instead of HDFS path.
>
> Will this work or are there any other ready made options which can be used
> for the same.
>
> Regards
>
> Vinay
>


Re: Streaming Data

2019-04-10 Thread Robin Moffatt
+1 for looking into Kafka Streams. You can build and maintain state within
your app, and expose it on a REST endpoint for your node/react app to
query.
There's an example of this here:
https://github.com/confluentinc/kafka-streams-examples/blob/5.2.1-post/src/main/java/io/confluent/examples/streams/interactivequeries/kafkamusic/KafkaMusicExample.java

Depending on the processing you want to do KSQL could also be useful,
writing the results of queries (e.g. spotting exceptions in the data) to a
Kafka topic that can be queried by your app.
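
As a sketch of the KSQL route, using the field names from your example
message (the topic name and thresholds here are made up - substitute your
own):

CREATE STREAM readings (plant_equipment_id VARCHAR, sensorvalue DOUBLE)
  WITH (KAFKA_TOPIC='plant-readings', VALUE_FORMAT='JSON');

CREATE STREAM readings_out_of_range AS
  SELECT plant_equipment_id, sensorvalue
  FROM readings
  WHERE sensorvalue > 100.0 OR sensorvalue < 10.0;

Your node.js app can then consume just the readings_out_of_range topic for
real-time alerting, while the full stream carries on into Druid for the
historical views.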


-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 9 Apr 2019 at 21:26, Nick Torenvliet  wrote:

> Hi all,
>
> Just looking for some general guidance.
>
> We have a kafka -> druid pipeline we intend to use in an industrial setting
> to monitor process data.
>
> Our kafka system recieves messages on a single topic.
>
> The messages are {"timestamp": yy:mm:ddThh:mm:ss.mmm, "plant_equipment_id":
> "id_string", "sensorvalue": float}
>
> For our POC there are about 2000 unique plant_equipment ids, this will
> quickly grow to 20,000.
>
> The kafka topic streams into druid
>
> We are building some node.js/react browser based apps for analytics and
> real time stream monitoring.
>
> We are thinking that for visualizing historical data sets we will hit druid
> for data.
>
> For real time streaming we are wondering what our best option is.
>
> One option is to just hit druid semi regularly and update the on screen
> visualization as data arrives from there.
>
> Another option is to stream subset of the topics (somehow) from kafka using
> some streams interface.
>
> With all the stock ticker apps out there, I have to imagine this is a
> really common use case.
>
> Anyone have any thoughts as to what we are best to do?
>
> Nick
>


Re: Kafka Connect Skip Messages

2019-01-25 Thread Robin Moffatt
It should be possible using a custom Single Message Transform (
https://docs.confluent.io/current/connect/javadocs/org/apache/kafka/connect/transforms/Transformation.html).
There is a good talk here about when SMTs are appropriate (and when not):
https://kafka-summit.org/sessions/single-message-transformations-not-transformations-youre-looking/

You might also want to look at Kafka Streams for applying logic to a stream
before you ingest it with Kafka Connect.
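
To give a flavour of the custom SMT route, here's a rough sketch of a
Transformation that drops records matching a condition by returning null.
The package, class and field names are invented for illustration, and it
assumes the record value is a Connect Struct:

package com.example.smt;

import java.util.Map;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.ConnectRecord;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.transforms.Transformation;

public class DropIgnoredRecords<R extends ConnectRecord<R>> implements Transformation<R> {

  @Override
  public R apply(R record) {
    // Only inspect Struct values; pass anything else straight through
    if (record.value() instanceof Struct) {
      Struct value = (Struct) record.value();
      if (value.schema().field("status") != null
          && "IGNORE".equals(value.getString("status"))) {
        return null;  // returning null drops the record from the pipeline
      }
    }
    return record;
  }

  @Override
  public ConfigDef config() {
    return new ConfigDef();  // no config options in this sketch
  }

  @Override
  public void configure(Map<String, ?> configs) {
  }

  @Override
  public void close() {
  }
}

Build that into a jar, put it on the worker's plugin.path, and reference it
from the connector config with transforms=drop and
transforms.drop.type=com.example.smt.DropIgnoredRecords.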


-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 25 Jan 2019 at 13:29, Sönke Liebau
 wrote:

> Hi Martin,
>
> I don't think that Connect offers this functionality as is. However
> you can implement custom SMTs, which should allow you to do what need.
> Alternatively you could write a ConsumerInterceptor that filters out
> messages based on your criteria - but: this would affect all running
> Connectors as this setting can only be set for the entire Connect
> worker.
> Depending on your exact needs one of the two approaches might be
> preferable. If this filter has to be applied to everything that
> Connect reads than the Interceptor might be preferrable, as you'll
> only have to define it once and then not specify it again for every
> connector.
>
> Hope that helps.
>
> Best regards,
> Sönke
>
> On Fri, Jan 25, 2019 at 1:58 PM mbschorer  wrote:
> >
> > Hi!
> >
> > I am evaluating the suitability of Kafka Connect to leverage our
> internal event sourcing system.
> > Since custom deserializers are supported, it looks quite promising so
> far.
> >
> > One open question is, whether Kafka Connect is capable of filtering out,
> i.e. skipping certain messages based on conditions on the value.
> > Or does it pursue a take all or nothing approach?
> >
> > All I could find so far is single message transform, which didn't reveal
> this feature.
> > Does anybody in this mailing list have more knowledge regarding this?
> >
> > Thanks in advance!
> > Martin
> >
>
>
> --
> Sönke Liebau
> Partner
> Tel. +49 179 7940878
> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
>


Re: Kafka servers starting and stopping abnormally

2019-01-14 Thread Robin Moffatt
It would help if you could provide a bit more detail. For example, you mention an
OS upgrade - but what OS? From what version to what version?
Do you see any errors in the broker logs at the time at which they shut
down?


-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


On Mon, 14 Jan 2019 at 04:24, Kaushik Nambiar 
wrote:

> Hello,
> I am having a 3 node -3 zookeeper cluster.
> My Kafka version is 0.11.x.x.
> I am seeing that my Kafka servers start and stop frequently.
> We can see atleast 25 start-stop activity per minute.
>
> This abnormal behaviour started to happen after we had an os upgrade for
> the underlying VMs.
>
> The OS upgrade was performed on multiple similar clusters. But only one of
> the clusters seems to behave in this manner.
>
> Due to the continuous starting and stopping of the machines,I can also see
> increased offline partition count and unclean leader elections.
>
> Any specific reason for this issue ?
> Also can you guys suggest a solution ?
>
> We tried stopping all the Kafka nodes and starting it again.
> It would stop okay,but once they start up,we could see the same problem .
>
> Regards,
> Kaushik Nambiar
>


Re: Kafka data log timestamp

2019-01-10 Thread Robin Moffatt
You can use kafkacat to examine the timestamp (and other metadata). Here's
an example of calling it, and two sample output records:

$ kafkacat -b localhost:9092 -t mysql_users -C -c2 -f '\nKey (%K bytes):
%k\t\nValue (%S bytes): %s\nTimestamp: %T\tPartition: %p\tOffset: %o\n--\n'

Key (1 bytes): 1
Value (79 bytes):
{"uid":1,"name":"Cliff","locale":"en_US","address_city":"St
Louis","elite":"P"}
Timestamp: 1520618381093    Partition: 0    Offset: 0
--

Key (1 bytes): 2
Value (79 bytes):
{"uid":2,"name":"Nick","locale":"en_US","address_city":"Palo
Alto","elite":"G"}
Timestamp: 1520618381093    Partition: 0    Offset: 1
--


-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 10 Jan 2019 at 10:13, Parth Gandhi <
parth.gan...@excellenceinfonet.com> wrote:

> Hi,
>
> Does kafka record the timestamp of the incoming message in its data log? I
> checked one of the partition log and I can see the message without any
> timestamp. Also there are few special characters in the message log. IS
> that normal?
>
> Here is a sample log: pastebin.com/hStyCW13
>
> Thanks,
> Parth Gandhi
> DevOps
>
> Disclaimer
>
> The information contained in this communication from the sender is
> confidential. It is intended solely for use by the recipient and others
> authorized to receive it. If you are not the recipient, you are hereby
> notified that any disclosure, copying, distribution or taking action in
> relation of the contents of this information is strictly prohibited and may
> be unlawful.
>
> This email has been scanned for viruses and malware, and may have been
> automatically archived by Mimecast Ltd, an innovator in Software as a
> Service (SaaS) for business. Providing a safer and more useful place for
> your human generated data. Specializing in; Security, archiving and
> compliance. To find out more visit the Mimecast website.
>


Re: Magic byte error when trying to consume Avro data with Kafka Connect

2018-12-07 Thread Robin Moffatt
> However, we can't figure out a way to turn off key deserialization (if
that is what is causing this) on the kafka connect/connector side.

Set *key.converter* to the correct value for the source message.

https://www.confluent.io/blog/kafka-connect-deep-dive-converters-serialization-explained
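
For example, if the message keys are plain strings (or raw bytes) rather
than Schema-Registry-encoded Avro, the sink connector config would want
something like this - which converter is right depends on how your producer
actually serialised the keys:

key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081

The 'Unknown magic byte!' error is what the AvroConverter throws when it is
pointed at data that wasn't written with the Schema Registry Avro
serialiser - in your case most likely the key.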


-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


On Thu, 6 Dec 2018 at 18:30, Marcos Juarez  wrote:

> We're trying to use Kafka Connect to pull down data from Kafka, but we're
> having issues with the Avro deserialization.
>
> When we attempt to consume data using the kafka-avro-console-consumer, we
> can consume it, and deserialize it correctly.  Our command is similar to
> the following:
>
> *./kafka-avro-console-consumer --topic my_topic --bootstrap-server
> localhost:9092 --property schema.registry.url=http://localhost:8081
> <http://localhost:8081>*
>
> However, if we attempt to use Kafka Connect to consume the data (in our
> case, with an S3SinkConnector), we always get the following error from the
> connector:
>
> "state": "FAILED",
>
> "trace": "org.apache.kafka.connect.errors.DataException:
> cf4_sdp_rest_test_v2
>
> at
>
> io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:97)
>
> at
>
> org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:467)
>
> at
>
> org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:301)
>
> at
>
> org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:205)
>
> at
>
> org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:173)
>
> at
> org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
>
> at
> org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
>
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
> at java.lang.Thread.run(Thread.java:748)
>
> *Caused by: org.apache.kafka.common.errors.SerializationException: Error
> deserializing Avro message for id -1*
>
> *Caused by: org.apache.kafka.common.errors.SerializationException: Unknown
> magic byte!*
>
> We've tried several different configurations, but we are not able to get
> Kafka Connect to properly consume these messages.  For reference, these
> same messages work correctly with Apache Camus, and of course with the
> command-line kafka-avro-console-consumer as described above. So we're
> pretty confident that the messages are built correctly.
>
> We have noticed that we get the same error if we attempt to print the
> messages in the console consumer with the option *--property
> print.key=false*.  However, we can't figure out a way to turn off key
> deserialization (if that is what is causing this) on the kafka
> connect/connector side.
>
> We're using Kafka 1.1.1, and all the packages are Confluent platform 4.1.2.
>
> Any help would be appreciated.
>
> Thanks,
>
> Marcos Juarez
>


Re: Problem with Kafka Connector

2018-12-06 Thread Robin Moffatt
If the properties are not available per-connector, then you will have to
set them on the worker and have independent Kafka Connect clusters
delineated by connector requirements. So long as you configure the ports
not to clash, there's no reason these can't exist on the same host.
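
For reference, producer settings at the worker level just take a producer.
prefix in the worker config - a sketch, where the values and the partitioner
class name are examples only:

producer.acks=all
producer.max.in.flight.requests.per.connection=1
producer.partitioner.class=com.example.MyCustomPartitioner

Each Connect cluster is defined by its worker config (group.id, the three
internal topics, rest.port/listeners), so you would run one cluster per
combination of settings and assign each connector to the cluster whose
settings it needs.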


-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 5 Dec 2018 at 10:19, Федор Чернилин  wrote:

> Hello! I have question. We have cluster with several connect workers. And
> we have many different connectors. We need to set for each connector its
> own settings,  max.in.flight.requests.per.connection , partitioner.class,
> acks. But I have difficulties. How can I do that? Thanks


Re: Add company name on the Powered by page

2018-11-02 Thread Robin Moffatt
Hi Rohan,

Please contact a...@confluent.io and she can arrange this.

thanks, Robin.


-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 26 Oct 2018 at 18:26, Rohan Rasane  wrote:

> Hi,
> What is the email address, if I want to add ServiceNow to the powered by
> page?
>
> -Rohan
>


Re: How to config the servers when Kafka cluster is behind a NAT?

2018-09-17 Thread Robin Moffatt
This should help: https://rmoff.net/2018/08/02/kafka-listeners-explained/
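
As a sketch of the usual pattern (addresses and ports made up): define two
listeners, one advertised with the internal address for clients and brokers
inside the LAN, and one advertised with the NAT address for external
clients. Each broker needs its own external port mapping on the NAT:

listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:19092
advertised.listeners=INTERNAL://broker1.lan:9092,EXTERNAL://{NAT IP}:{NAT port}
inter.broker.listener.name=INTERNAL

The symptom you describe - metadata arrives but consuming fails - is often a
sign that LAN clients are being handed the NAT address in the metadata
instead of an address that works from inside the LAN.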


-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


On Mon, 17 Sep 2018 at 12:18, XinYi Long  wrote:

> Hello, Guys,
>
> I have a Kafka cluster, which is behind a NAT.
>
> I have some devices on the internet, to play as consumers and
> producers. And I also have some applications which are in the same LAN with
> the Kafka cluster, and play as consumers and producers.
>
> I have changed the "advertised.listeners" to "PLAINTEXT://{NAT
> IP}:{NAT port}", and add some routes on the servers. Because I found that
> Kafka brokers also use the "advertised.listeners" to talk with each other.
> Am I right?
>
> When I start a consumer in the same LAN, I found it can receive
> metadata correctly. But it can't consume any message from other producer in
> the same LAN.
>
> Did I miss anything, and how to make it work?
>
>
> Thank you very much!
>
>
> lxyscls
>


Re: Need info

2018-09-14 Thread Robin Moffatt
As a side note from your question, I'd recommend looking into Kafka
Connect. It is another API that ships with Apache Kafka, and it simplifies
building pipelines with Kafka such as the one you are describing.
There are pre-built connectors, including for S3 (
https://www.confluent.io/connector/kafka-connect-s3/). There are also
options for pulling from the mainframe, depending on specific platform and
system.


-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 12 Sep 2018 at 08:32, Chanchal Chatterji <
chanchal.chatte...@infosys.com> wrote:

> Hi,
>
> In the process of mainframe modernization, we are attempting to stream
> Mainframe data to AWS Cloud , using Kafka.  We are planning to use Kafka
> 'Producer API' at mainframe side and 'Connector API' on the cloud side.
> Since our data is processed by a module called 'Central dispatch' located
> in Mainframe and is sent to Kafka.  We want to know what is rate of volume
> Kafka can handle.  The other end of Kafka is connected to an AWS S3 Bucket
> As sink.  Please help us to provide this info or else please help to
> connect with relevant person who can help us to understand this.
>
> Thanks and Regards
>
> Chanchal Chatterji
> Principal Consultant,
> Infosys Ltd.
> Electronic city Phase-1,
> Bangalore-560100
> Contacts : 9731141606/ 8105120766
>
>


Re: Exposing Kafka on WAN

2018-08-31 Thread Robin Moffatt
Shameless plug for an article I wrote on understanding Kafka listeners
better, which might help in this case:
https://rmoff.net/2018/08/02/kafka-listeners-explained/




On Thu, 30 Aug 2018 at 14:14, Andrew Otto  wrote:

> The trouble is that the producer and consumer clients need to discover the
> broker hostnames and address the individual brokers directly.  There is an
> advertised.listeners setting that will allow you to tell clients to connect
> to external proxy hostnames instead of your internal ones, but those
> proxies will need to know how to map directly from an advertised hostname
> to an internal kafka broker hostname.  You’ll need some logic in your proxy
> to do that routing.
>
> P.S.  I’ve not actually set this up before, but this is my understanding :)
>
>
>
> On Thu, Aug 30, 2018 at 7:16 AM Dan Markhasin  wrote:
>
> > Usually for such a use case you'd have a physical load balancer box (F5,
> > etc.) in front of Kafka that would handle the SSL termination, but it
> > should be possible with NGINX as well:
> >
> >
> >
> https://docs.nginx.com/nginx/admin-guide/security-controls/terminating-ssl-tcp/
> >
> > On Fri, 24 Aug 2018 at 18:35, Jack S  wrote:
> >
> > > Thanks Ryanne.
> > >
> > > That's one of the options we had considered. I was hoping to keep
> > solution
> > > simple and efficient. With HTTP proxy, we would have to worry about
> > > configurations, scalability, and operation. This is probably true with
> > > proxy solution as well, but at least my thinking was that deploying
> proxy
> > > would be more standard with less management effort on our side. Also,
> we
> > > are very familiar with Kafka usual producer/consumer usage, operation,
> > etc.
> > > and could re-use much of our producer and consumer infrastructure that
> we
> > > currently use internally.
> > >
> > > Having said that, this is where I was hoping to hear and get feedback
> > from
> > > community - what people have deployed with such use case and any
> > learnings
> > > and suggestions.
> > >
> > > On Fri, Aug 24, 2018 at 7:42 AM Ryanne Dolan 
> > > wrote:
> > >
> > > > Can you use a Kafka HTTP proxy instead of using the Kafka protocol
> > > > directly?
> > > >
> > > > Ryanne
> > > >
> > > > On Thu, Aug 23, 2018, 7:29 PM Jack S 
> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > We have a requirement for opening Kafka on WAN where external
> > producers
> > > > and
> > > > > consumers need to be able to talk to Kafka. I was able to get
> > Zookeeper
> > > > and
> > > > > Kafka working with two way SSL and SASL for authentication and ACL
> > for
> > > > > authorization.
> > > > >
> > > > > However, my concern with this approach was opening up Kafka brokers
> > > > > directly on WAN and also doing SSL termination. Is there a proxy
> > > > solution,
> > > > > where proxies live in front of Kafka brokers, so Kafka brokers are
> > > still
> > > > > hidden and proxies take care of SSL? Has anyone in the community
> have
> > > > > similar use case with Kafka, which is deployed in production? Would
> > > love
> > > > to
> > > > > find out your experience, feedback, or recommendation for this use
> > > case.
> > > > >
> > > > > Thanks in advance.
> > > > >
> > > > > PS: We are using AWS.
> > > > >
> > > >
> > >
> >
>


Re: Any one using oracle event hub - dedicated ?

2018-07-25 Thread Robin Moffatt
Compared to other managed solutions, or compared to running it yourself?

On Wed, 25 Jul 2018 at 12:44, Vaibhav Jain  wrote:

> Hi There, Any pros/cons ?  Regards, Vaibhav

-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


Re: how to use SMT to transform a filed value in kafka connect?

2018-07-10 Thread Robin Moffatt
ReplaceField works with fields themselves, not the values.

You could either write your own SMT to do what you want, or you could use
KStreams or KSQL to process the data in the topic once ingested through
Connect. The simplest way would be with the CASE statement
(https://github.com/confluentinc/ksql/issues/620).
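
As a sketch of the KSQL route - assuming a KSQL version where CASE is
available (it's the feature tracked in that issue), and assuming the topic
from your connector config (topic.prefix nobinlog. plus table t_user) is in
a format KSQL can read, such as Avro:

CREATE STREAM t_user_src WITH (KAFKA_TOPIC='nobinlog.t_user', VALUE_FORMAT='AVRO');

CREATE STREAM t_user_transformed AS
  SELECT CASE WHEN sex = 0 THEN 100
              WHEN sex = 1 THEN 200
              ELSE sex
         END AS sex,
         login_time
  FROM t_user_src;

The transformed topic can then be consumed downstream in place of the raw
one.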


-- 

Robin Moffatt | Developer Advocate | ro...@confluent.io | @rmoff


On Tue, 10 Jul 2018 at 03:48, 刘波 <1091643...@qq.com> wrote:

> Hi all,
> I'm new to Kafka. I'm using Kafka to build an ETL program.
>
>
> Here's my scenario:
> My JDBC source table t_user has a column sex, which is an int in MySQL,
> and uses 0 to represent male and 1 to represent female.
>
>
> I want to change the sex value: when 0 then 100, when 1 then 200.
>
>
> below is my connecter config:
> {
> "name":"trans-sex",
> "config":{
> "connector.class":"io.confluent.connect.jdbc.JdbcSourceConnector",
> "connection.url":"jdbc:mysql://
> 10.4.89.214:3306/test?user=root&password=root",
> "mode":"timestamp",
> "table.whitelist":"t_user",
> "validate.non.null":false,
> "timestamp.column.name":"login_time",
> "topic.prefix":"nobinlog.",
>
> "transforms": "convert_sex",
> "transforms.convert_sex.type":
> "org.apache.kafka.connect.transforms.ReplaceField$Value",
> "transforms.convert_sex.renames": "0:100,1,200"
>
>
> }
> }
>
>
>
> any ideas,
> thanks a lot