2018-12-03 09:12:37 UTC - Ivan Kelly: could you give an example of what a table
topic and domain topic would be?
----
2018-12-03 09:44:30 UTC - Olivier Chicha: Sure,
let's say that each table of our DB has a field named "domainName".
Let's say our database has a table named UserContact; then our table topic
"table/UserContact" would receive an event ("propertyChangeEvent") for each
change performed in the table (i.e. if we update 2 fields of a row of the table
we generate 2 events).
Now let's say that we have an enterprise Acme; then our topic "domain/Acme"
would receive an event each time a row for which domainName = acme is modified.
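A minimal sketch of the fan-out described above. The `PropertyChange` record
and the `route` helper are hypothetical names; only the topic names and the
one-event-per-changed-field rule come from the description, and sending the
same per-field event to the domain topic is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PropertyChange:
    """One changed field of one row (hypothetical event shape)."""
    table: str
    domain_name: str
    field: str
    new_value: str


def route(table: str, domain_name: str,
          changes: Dict[str, str]) -> List[Tuple[str, PropertyChange]]:
    """Return (topic, event) pairs: one event per changed field, per topic."""
    pairs: List[Tuple[str, PropertyChange]] = []
    for field, value in changes.items():
        event = PropertyChange(table, domain_name, field, value)
        pairs.append((f"table/{table}", event))
        # Assumption: the domain topic receives the same per-field event.
        pairs.append((f"domain/{domain_name}", event))
    return pairs


if __name__ == "__main__":
    updates = {"email": "a@acme.test", "phone": "555-0100"}
    for topic, event in route("UserContact", "Acme", updates):
        print(topic, event.field)
```

Updating 2 fields of one row therefore yields 4 publishes: 2 to the table
topic and 2 to the domain topic.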
----
2018-12-03 10:43:48 UTC - Chris Miller: I've seen similar errors when the
namespace doesn't exist. What does `pulsar-admin namespaces list diagnostics`
return? If the namespace is there, what does `pulsar-admin namespaces policies
diagnostics/local.diagnostics.guestnamespace` return?
----
2018-12-03 11:02:12 UTC - Christophe Bornet: Hi all, can Pulsar brokers be
aware of the rack placement of Bookies to perform reads on the closest Bookie ?
----
2018-12-03 11:10:06 UTC - jia zhai: @Christophe Bornet This is not supported
yet.
----
2018-12-03 11:11:07 UTC - Christophe Bornet: Does this mean you intend to
support it someday ?
----
2018-12-03 12:31:07 UTC - Ivan Kelly: how are you generating events from the
table? tailing the journal or something?
----
2018-12-03 12:48:13 UTC - Olivier Chicha: We have a kind of in house proxy that
allows us to control all the change requests on the DB
----
2018-12-03 14:05:33 UTC - Bogdan BUNECI: Hi ! Just a small question: While
trying to use json schema from
<http://json-schema.org/draft-07/schema|json-schema.org/draft-07/schema> we
received invalid schema. What is the meaning of “type”: [“JSON”,“AVRO”] ?
----
2018-12-03 14:21:53 UTC - Ivan Kelly: Does this come with some sort of
monotonically increasing number?
----
2018-12-03 14:22:38 UTC - Ivan Kelly: it seems to me like you should create two
events for each change. the only issue is what happens in failure cases
----
2018-12-03 14:22:56 UTC - Ivan Kelly: I assume clients are consuming either a
domain topic or a table topic, but not both?
----
2018-12-03 14:24:12 UTC - Ivan Kelly: AVRO is a type of schema. how are you
passing in your json schema?
----
2018-12-03 14:27:15 UTC - Bogdan BUNECI: I’ve tested with a simple AVRO schema
and it is working. I’m trying to use JSON schema.
----
2018-12-03 14:27:35 UTC - Bogdan BUNECI: Schema is uploaded with pulsar-admin
schemas upload …
----
2018-12-03 14:31:18 UTC - Bogdan BUNECI: one json per line
----
2018-12-03 14:56:44 UTC - Bogdan BUNECI: I guess I should test with some
records not from the console :slightly_smiling_face:
----
2018-12-03 15:18:24 UTC - Bogdan BUNECI: working very well with AVRO. Records
produced with Apache NiFi.
----
2018-12-03 15:18:56 UTC - Bogdan BUNECI: Thanks !
----
2018-12-03 15:24:27 UTC - Yifan: Hi, I am here with another general question :)
My system is currently very light: I may be processing only a few thousand
messages per topic per hour, and I have probably 10-20 topics. Is there any
suggested configuration for this setup? I don’t want to use more resources than
needed. Currently I can see in values.yaml (deployment/kubernetes/helm) that
memory for zookeeper is 15G, for example, and 4G for grafana, which to me is a
lot. Are there general guidelines for memory and CPU configuration for a Pulsar
cluster?
----
2018-12-03 16:31:16 UTC - 东东: @东东 has joined the channel
----
2018-12-03 17:01:39 UTC - Christophe Bornet: I'm trying to understand the
purpose of the "discovery service". It seems its role is to get the list of
active brokers from ZK that clients can look up. But it seems to me that
connecting a client directly to a broker gives about the same functionality.
What am I missing?
----
2018-12-03 17:45:13 UTC - Matteo Merli: It’s not very useful in practice.
When you expose a Pulsar service, you need to just expose 1 single hostname/IP
to clients. There are several ways to do that. eg.
* DNS cname with multiple IPs
* VIP / Load balancer
* Scheduler specific discovery service (eg: service DNS in Kubernetes)
* …
This was an attempt to create a simple discovery service module, but the above
alternatives are preferable.
----
2018-12-03 17:53:38 UTC - Matteo Merli: @Yifan You can certainly scale down the
memory settings a lot.
For CPU, you can check the usage when running your particular workload and
plan accordingly.
For memory, at your rate, you can reduce a lot from the defaults. 1GB (or even
512MB) should be enough for any of the components.
The only things to be careful about are the sizes of the caches, in the broker
and bookies. There are a few settings that need to be scaled down with the
configured memory. All of them are relative to direct memory.
* `broker.conf`:
- `managedLedgerCacheSizeMB=1024` -> 64
* `bookkeeper.conf`:
- `dbStorage_writeCacheMaxSizeMb=512` -> 16
- `dbStorage_readAheadCacheMaxSizeMb=256` -> 0 (disable read cache)
- `dbStorage_rocksDB_blockCacheSize=268435456` -> 16777216 (16 MB)
These config options will be automated in the next release, 2.3, by having
default values tied to the -Xmx and max direct memory configured in the JVM.
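A minimal sketch of applying those scaled-down values to the text of a
properties-style config file. The keys and values are the ones listed above;
the `apply_overrides` helper and the two dictionaries are hypothetical names,
and the values are a low-traffic starting point rather than a tuned
recommendation.

```python
from typing import Dict

# Values copied from the list above.
BROKER_OVERRIDES = {"managedLedgerCacheSizeMB": "64"}
BOOKKEEPER_OVERRIDES = {
    "dbStorage_writeCacheMaxSizeMb": "16",
    "dbStorage_readAheadCacheMaxSizeMb": "0",  # 0 disables the read cache
    "dbStorage_rocksDB_blockCacheSize": str(16 * 1024 * 1024),  # bytes
}


def apply_overrides(conf_text: str, overrides: Dict[str, str]) -> str:
    """Rewrite matching key=value lines; append keys that are not present."""
    remaining = dict(overrides)
    out = []
    for line in conf_text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_setting = "=" in line and not line.lstrip().startswith("#")
        if is_setting and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)  # comments and unrelated settings pass through
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "# cache size\nmanagedLedgerCacheSizeMB=1024\n"
    print(apply_overrides(sample, BROKER_OVERRIDES), end="")
```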
----
2018-12-03 18:21:03 UTC - Christophe Bornet: Thanks. That's also my conclusion:
as the discovery service itself would be a SPOF, you would need a failover
mechanism in front of it anyway, which would probably be some kind of
VIP/LB/DNS, etc.
----
2018-12-03 18:21:38 UTC - Karthik Palanivelu: Team, I am trying to find a doc
on Synchronous Replication. I ended up here -
<http://pulsar.apache.org/docs/en/administration-geo.html#docsNav>. Can you
please point me to where I should look?
----
2018-12-03 18:24:58 UTC - Christophe Bornet: For a VIP, what would be the best
healthcheck in your opinion ?
----
2018-12-03 18:26:58 UTC - Christophe Bornet: for failover
----
2018-12-03 18:28:12 UTC - Matteo Merli: The check on the VIP health should be
frequent and lightweight, in general.
Eg: you could hit <http://broker:8080/metrics>
----
2018-12-03 18:28:31 UTC - Matteo Merli: There is also a handler called
`/status.html`
----
2018-12-03 18:28:48 UTC - Matteo Merli: (that’s what was used by Yahoo’s
hardware VIPs)
----
2018-12-03 18:30:28 UTC - Matteo Merli: This handler will respond with either
200 or 404 depending on whether a file exists on the broker disk.
The path of that file is configured in `broker.conf` with:
```
statusFilePath=/xxx
```
This can be used to take a broker out of VIP rotation while the process is
running
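A minimal sketch of that rotation mechanism, modelling the behaviour as
described rather than reproducing broker code. All three function names are
hypothetical, and `status_file` stands for whatever path `statusFilePath`
points to.

```python
from pathlib import Path


def status_response(status_file: Path) -> int:
    """Model of the handler as described: 200 if the file exists, else 404."""
    return 200 if status_file.exists() else 404


def remove_from_rotation(status_file: Path) -> None:
    """Delete the status file so the VIP health check starts failing."""
    status_file.unlink(missing_ok=True)  # no error if it is already gone


def add_to_rotation(status_file: Path) -> None:
    """Recreate the status file so the health check passes again."""
    status_file.parent.mkdir(parents=True, exist_ok=True)
    status_file.touch()
```

The broker process keeps running throughout; only the presence of the file
changes what the health check sees.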
----
2018-12-03 18:52:18 UTC - Yifan: Thanks. I am not a Java person. What should I
use for Xmx and Max direct memory in JVM configuration? something like 64MB?
----
2018-12-03 18:59:02 UTC - Karthik Palanivelu: Team, I have Clusters A and B
trying to use the same ZooKeeper. It is failing on creating metadata with the
error below. I am trying to test Synchronous Replication with the same ZooKeeper
shared across two clusters. I am trying to use the same ZK instance as local
and global ZK. If I start a global ZK on the same instance, I get the
wrong number of arguments error. Please advise on how to achieve it.
```
Exception in thread "main" org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /namespace
at org.apache.zookeeper.KeeperException.create(KeeperException.java:122)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:792)
at org.apache.pulsar.PulsarClusterMetadataSetup.main(PulsarClusterMetadataSetup.java:156)
```
----
2018-12-03 19:01:16 UTC - Matteo Merli: I’d start with 256M or 512M and then
check the mem usage
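A minimal sketch of what those sizes look like as JVM flags. `-Xms`, `-Xmx`
and `-XX:MaxDirectMemorySize` are standard JVM options; the `pulsar_mem`
helper is a hypothetical name, and both pinning `-Xms` to `-Xmx` and passing
the result through the `PULSAR_MEM` environment variable are assumptions
about the deployment.

```python
def pulsar_mem(heap_mb: int, direct_mb: int) -> str:
    """Build a JVM memory flag string for the given heap and direct sizes."""
    if heap_mb <= 0 or direct_mb <= 0:
        raise ValueError("memory sizes must be positive")
    # Assumption: initial heap is pinned to the maximum heap.
    return (
        f"-Xms{heap_mb}m -Xmx{heap_mb}m "
        f"-XX:MaxDirectMemorySize={direct_mb}m"
    )


if __name__ == "__main__":
    print(pulsar_mem(256, 256))
```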
----
2018-12-03 19:04:47 UTC - Yifan: Okay, thanks.
----
2018-12-03 19:24:12 UTC - Olivier Chicha: There is indeed an incremental
change Id.
The events are only sent in case of success; otherwise it means that no changes
were committed to the DB.
Yes, clients are consuming one or the other but not both.
For now I agree with you that staying with the 2 events seems to be the best
option for the first version.
----
2018-12-03 19:30:25 UTC - Ivan Kelly: ok. the reason I asked about the
monotonically increasing number is so that you can use idempotent publish for
failure scenarios where the producer crashes
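A toy in-memory model of why the increasing number helps. `DedupTopic` is a
hypothetical class, not the Pulsar client or broker API, and it assumes the
change Id is reused as the sequence id and that the producer keeps a stable
name across restarts.

```python
from typing import Dict, List, Tuple


class DedupTopic:
    """Toy topic that drops replays, keyed by producer name and sequence id."""

    def __init__(self) -> None:
        self.messages: List[Tuple[str, int, str]] = []
        self._last_seq: Dict[str, int] = {}

    def publish(self, producer: str, seq_id: int, payload: str) -> bool:
        """Store the message unless seq_id was already seen; True if stored."""
        if seq_id <= self._last_seq.get(producer, -1):
            return False  # replay after a crash or retry: ignore it
        self._last_seq[producer] = seq_id
        self.messages.append((producer, seq_id, payload))
        return True


if __name__ == "__main__":
    topic = DedupTopic()
    topic.publish("db-proxy", 1, "email changed")
    topic.publish("db-proxy", 2, "phone changed")
    # The producer crashes before seeing the ack for 2 and resends it.
    print(topic.publish("db-proxy", 2, "phone changed"))
    print(len(topic.messages))
```

After a crash the producer can resend from its last known change Id without
creating duplicates, because anything at or below the stored id is dropped.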
----
2018-12-04 07:15:23 UTC - fvelement: @fvelement has joined the channel
----
2018-12-04 08:25:59 UTC - Christophe Bornet: OK. I'll try that. About failing
over to another region, how do I do that ? I would like to be able to failover
the producers and consumers independently. They could be pointing to distinct
VIPs but it seems that the VIP is only used for the first connection and after
that the clients communicate directly with the broker. So how would I force the
producers and consumers to reconnect and use the new VIP ?
----