2020-02-25 09:24:44 UTC - igor dubrowsky: @igor dubrowsky has joined the channel ---- 2020-02-25 09:49:38 UTC - Sijie Guo: it is related to a shading issue, but it is only a warning. it doesn’t impact correctness. also, this class is only used for function package storage, so it is not in the critical path.
We will fix it in a future release. ---- 2020-02-25 12:25:07 UTC - Rolf Arne Corneliussen: *IoT* use case: *getting messages from Pulsar to devices*. Let us say we have an array of stateless gateways that IoT devices connect to. Passing on messages from devices to a Pulsar topic is easy, but how do we get messages from Pulsar to devices? A naive approach is to have a topic for each device that gateways can subscribe to when a device connects. This may work well for a small number of devices, but will it be a good idea if we have 1M+ devices? I have read about support for 1M+ topics, but is it trivial to create that number of topics, or does it require extra hardware, zookeepers, bookies and brokers? ---- 2020-02-25 14:13:55 UTC - Justin Grimes: No, but there is an operator channel here and I think part of their objective is to support not only geo-replication but also creating topics, namespaces, etc. via CRDs ---- 2020-02-25 14:23:52 UTC - eyal leshem: @eyal leshem has joined the channel ---- 2020-02-25 14:35:01 UTC - Ming: @Jon Bennett Here is an example of a producer message that uses a key. We just migrated from the cgo client to the native go client library <https://github.com/kafkaesque-io/pulsar-beam/blob/master/src/db/pulsardb.go#L241> ---- 2020-02-25 14:36:37 UTC - Ming: There are some differences between the two libraries. But the docker image size has gone down significantly. The docker image based on the new native lib is only 28MB in our case. ---- 2020-02-25 15:04:52 UTC - Jon Bennett: @Ming thanks! ---- 2020-02-25 15:05:05 UTC - Jon Bennett: @Ming I’ll give your project a good look over ---- 2020-02-25 15:06:20 UTC - Jon Bennett: @Ming that file looks to be using the cpp client… <https://github.com/kafkaesque-io/pulsar-beam/blob/master/src/db/pulsardb.go#L10> ---- 2020-02-25 15:08:10 UTC - Jon Bennett: sorry, I see, almost the same import path! 
---- 2020-02-25 15:08:12 UTC - Jon Bennett: <https://github.com/apache/pulsar-client-go> ---- 2020-02-25 15:08:13 UTC - Jon Bennett: thanks! ---- 2020-02-25 15:47:08 UTC - Graham: @Graham has joined the channel ---- 2020-02-25 15:58:58 UTC - Fredrick P Eisele: I want to add keys or properties to existing messages. Is this possible? If so how? ---- 2020-02-25 16:12:00 UTC - John Duffie: @John Duffie has joined the channel ---- 2020-02-25 16:27:34 UTC - Ming: No problem. Let me know if you have any questions or any issues about the new native go library. ---- 2020-02-25 17:30:49 UTC - Jon Bennett: will do, working without significant changes in our prototype. ---- 2020-02-25 18:02:57 UTC - Sijie Guo: the cost of 1M+ topics is mostly on metadata. so you might need to plan the resources for zookeeper and potentially also for brokers (memory for topics and subscriptions, as well as for connections). ---- 2020-02-25 18:04:58 UTC - Sijie Guo: you can add `key` and `properties` when sending a message. <http://pulsar.apache.org/api/client/2.5.0-SNAPSHOT/org/apache/pulsar/client/api/TypedMessageBuilder.html#property-java.lang.String-java.lang.String-> <http://pulsar.apache.org/api/client/2.5.0-SNAPSHOT/org/apache/pulsar/client/api/TypedMessageBuilder.html#key-java.lang.String-> ---- 2020-02-25 19:57:26 UTC - John Duffie: Hello. We are evaluating pulsar to use in place of kafka. We’ve hit a roadblock when it comes to Avro support. We have some Avro objects that contain an array. That array is defined as a union of complex objects. With pulsar and its schema registry, we haven’t been successful. A sniffer trace on the producer side shows the schema being pushed supports embedded complex objects but not the scenario I mention above with the array. We’ve been successful in the past with Confluent SR but have yet to find a solution to this case. Anyone have suggestions? 
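For readers following the Avro question above, the schema shape being described (an array whose item type is a union of records) might look roughly like the sketch below. Every name in it (`Order`, `Book`, `Gadget`, and their fields) is hypothetical, since the real schemas were not posted in this log. The Python wrapper uses only the standard `json` module to show where the union sits; it does not call Pulsar or any Avro library.

```python
import json

# Hypothetical Avro schema; none of these names come from the conversation.
# The "items" field is an array whose item type is a union of two records.
ORDER_SCHEMA = """
{
  "type": "record",
  "name": "Order",
  "namespace": "example",
  "fields": [
    {"name": "id", "type": "string"},
    {
      "name": "items",
      "type": {
        "type": "array",
        "items": [
          {
            "type": "record",
            "name": "Book",
            "fields": [{"name": "isbn", "type": "string"}]
          },
          {
            "type": "record",
            "name": "Gadget",
            "fields": [{"name": "sku", "type": "string"}]
          }
        ]
      }
    }
  ]
}
"""


def union_branches(schema_json: str, field_name: str) -> list:
    """Return the record names in the union used as the array's item type."""
    schema = json.loads(schema_json)
    field = next(f for f in schema["fields"] if f["name"] == field_name)
    # In Avro's JSON encoding, a union is written as a JSON list of schemas.
    items = field["type"]["items"]
    return [branch["name"] for branch in items]


print(union_branches(ORDER_SCHEMA, "items"))
```

Comparing a minimal schema like this against what each registry actually stores is one way to narrow down where the array-of-union definition gets lost.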
---- 2020-02-25 20:09:50 UTC - Sijie Guo: @John Duffie Do you mind showing me an example of the error you hit when using Avro objects that contain an array? Then we can look into how to address the problem for you. ---- 2020-02-25 20:12:14 UTC - John Duffie: certainly - I’ll simplify the object and then send the corresponding schemas that are posted to the respective registry ---- 2020-02-25 20:20:38 UTC - Sijie Guo: cool. If you can share the errors you encountered, that would be great as well. ---- 2020-02-25 20:43:03 UTC - matt_innerspace.io: @matt_innerspace.io has joined the channel ---- 2020-02-25 21:37:33 UTC - Greg Gallagher: anyone using filebeat (or similar) to send to Pulsar? I'm trying to figure out whether you can use the same output configuration as kafka or not. any pointers would be welcome ---- 2020-02-25 22:16:23 UTC - Sijie Guo: Ah, replied to your question in the <#CJAH1G25U|elasticsearch> channel ---- 2020-02-25 23:12:15 UTC - Eugen: A question about ordering guarantees in the face of broker failures - when using `sendAsync()` and a broker failover occurs, do we risk losing messages or having them added to the topic in a different order, just like is the case with Kafka? Or is the Pulsar producer able to handle this without loss / different order? ---- 2020-02-25 23:12:53 UTC - Matteo Merli: Yes, the producer can handle it: the client will make sure to replay all the pending messages in order after the reconnection ---- 2020-02-25 23:13:45 UTC - Eugen: Thanks. In the meantime, I've figured out that Kafka's "idempotent producer" offers the same. (Except that it only works for the lifetime of a single producer, in contrast to Pulsar, where the producer can set the sequence id) ---- 2020-02-26 02:31:09 UTC - PJ: @PJ has joined the channel ---- 2020-02-26 02:37:41 UTC - PJ: How do people manage (from a code/devops/release-lifecycle perspective) running a service with pulsar at its core? I've got a web frontend that produces and consumes data for/from pulsar. 
Pulsar thus has several Functions to process and produce the data for the frontend. It seems natural to keep them in the same repo as the app... but then - given that devops is in a completely different repo - how do they get deployed, and when? ---- 2020-02-26 02:39:15 UTC - PJ: I feel like there's a release-lifecycle/version-mismatch problem lurking if I'm not careful. ---- 2020-02-26 04:16:44 UTC - Justin Grimes: I would tend to keep the code that's for the same bounded context/domain/service or whatever hipster term we use now in the same repo for developer simplicity. A lot of CI tools give you the flexibility on how you publish artifacts such that some other release pipeline(or an extension of your build pipeline) can create or update the function with the pulsar admin tools with appropriate creds. If the pulsar function is compiled to a jar, publish the jar, not that much different conceptually than publishing a docker image and running helm/kubectl to apply the change to a cluster downstream ---- 2020-02-26 06:05:06 UTC - Ken Huang: Can Pulsar SQL REST APIs pass the JWT token for authentication? ---- 2020-02-26 08:10:16 UTC - Antti Kaikkonen: Is it possible to change ledger and journal directories when running pulsar standalone? I have set ```ledgerDirectories=/home/anttkaik/Downloads/apache-pulsar-2.5.0-bin/ledger_dir indexDirectories=/home/anttkaik/Downloads/apache-pulsar-2.5.0-bin/index_dir journalDirectories=/home/anttkaik/Downloads/apache-pulsar-2.5.0-bin/journal_dir``` in bookkeeper.conf and standalone.conf but pulsar still uses the "data" directory in the pulsar directory. ---- 2020-02-26 08:10:46 UTC - Eugen: @Antti Kaikkonen try `journalDirectory` (singular) ---- 2020-02-26 08:15:56 UTC - Antti Kaikkonen: @Eugen I think you tagged the wrong person. I tried journalDirectory and that didn't help. 
+1 : Eugen ---- 2020-02-26 08:20:05 UTC - Antti Kaikkonen: Strangely they appear correctly in the output of `./pulsar standalone` ---- 2020-02-26 08:20:05 UTC - Antti Kaikkonen: ```10:14:35.808 [main] INFO org.apache.pulsar.zookeeper.LocalBookkeeperEnsemble - Starting Bookie(s) 10:14:35.838 [main] INFO org.apache.bookkeeper.proto.BookieServer - { "loadBalancerSheddingGracePeriodMinutes" : "30", "backlogQuotaCheckIntervalInSeconds" : "60", "loadBalancerNamespaceBundleMaxBandwidthMbytes" : "100", "bookkeeperClientTimeoutInSeconds" : "30", "exposeTopicLevelMetricsInPrometheus" : "true", "managedLedgerCacheEvictionFrequency" : "100.0", "managedLedgerCursorBackloggedThreshold" : "1000", "autoSkipNonRecoverableData" : "false", "brokerPublisherThrottlingMaxByteRate" : "0", "bookkeeperClientHealthCheckErrorThresholdPerInterval" : "5", "bookkeeperClientHealthCheckEnabled" : "true", "brokerDeduplicationEntriesInterval" : "1000", "authenticateOriginalAuthData" : "false", "anonymousUserRole" : "", "dbStorage_rocksDB_numFilesInLevel0" : "4", "brokerDeleteInactiveTopicsFrequencySeconds" : "60", "dbStorage_rocksDB_numLevels" : "-1", "managedLedgerDigestType" : "CRC32C", "managedLedgerReadEntryTimeoutSeconds" : "0", "loadBalancerBrokerMaxTopics" : "50000", "loadBalancerAutoBundleSplitEnabled" : "true", "managedLedgerAddEntryTimeoutSeconds" : "0", "bookkeeperClientAuthenticationParametersName" : "", "managedLedgerDefaultAckQuorum" : "1", "managedLedgerMaxEntriesPerLedger" : "50000", "loadBalancerNamespaceMaximumBundles" : "128", "managedLedgerCacheCopyEntries" : "false", "webSocketConnectionsPerBroker" : "8", "journalSyncData" : "false", "zkServers" : "127.0.0.1:2181", "dbStorage_rocksDB_maxSizeInLevel1MB" : "256", "brokerServicePort" : "6650", "loadBalancerReportUpdateMaxIntervalMinutes" : "15", "bindAddress" : "0.0.0.0", "transactionMetadataStoreProviderClassName" : "org.apache.pulsar.transaction.coordinator.impl.InMemTransactionMetadataStoreProvider", 
"managedLedgerNumSchedulerThreads" : "4", "allowEphemeralPorts" : "true", "clientLibraryVersionCheckEnabled" : "false", "tokenAuthClaim" : "", "maxProducersPerTopic" : "0", "subscriptionExpirationTimeMinutes" : "0", "bookkeeperClientReorderReadSequenceEnabled" : "false", "numWorkerThreadsForNonPersistentTopic" : "8", "dbStorage_rocksDB_blockCacheSize" : "", "maxConcurrentNonPersistentMessagePerConnection" : "1000", "brokerShutdownTimeoutMs" : "60000", "maxConsumersPerSubscription" : "0", "bookkeeperClientAuthenticationParameters" : "", "authenticationEnabled" : "false", "numIOThreads" : "", "allocatorPoolingPolicy" : "UnpooledHeap", "maxConsumersPerTopic" : "0", "managedLedgerMinLedgerRolloverTimeMinutes" : "10", "bookkeeperTLSTrustCertsFilePath" : "", "clusterName" : "standalone", "superUserRoles" : "", "authenticationProviders" : "", "subscriptionRedeliveryTrackerEnabled" : "true", "dispatchThrottlingRatePerTopicInByte" : "0", "dispatchThrottlingRateRelativeToPublishRate" : "false", "managedLedgerCursorRolloverTimeInSeconds" : "14400", "globalZookeeperServers" : "", "defaultNumberOfNamespaceBundles" : "4", "loadBalancerEnabled" : "false", "dbStorage_readAheadCacheMaxSizeMb" : "", "activeConsumerFailoverDelayTimeMillis" : "1000", "managedLedgerDefaultEnsembleSize" : "1", "dbStorage_readAheadCacheBatchSize" : "1000", "authorizationProvider" : "org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider", "zookeeperServers" : "", "bookiePort" : "3181", "defaultRetentionSizeInMB" : "-1", "defaultNumPartitions" : "1", "managedLedgerCacheSizeMB" : "", "advertisedAddress" : "127.0.0.1", "bookkeeperClientRegionawarePolicyEnabled" : "false", "brokerDeduplicationEnabled" : "true", "bookkeeperTLSProviderFactoryClass" : "org.apache.bookkeeper.tls.TLSContextFactory", "topicPublisherThrottlingTickTimeMillis" : "2", "dispatchThrottlingOnNonBacklogConsumerEnabled" : "true", "bookkeeperTLSTrustCertTypes" : "PEM", "webSocketServiceEnabled" : "true", 
"bookkeeperTLSCertificateFilePath" : "", "numHttpServerThreads" : "", "maxConcurrentLookupRequest" : "50000", "bookkeeperTLSKeyStorePasswordPath" : "", "managedLedgerCacheEvictionWatermark" : "0.9", "defaultRetentionTimeInMinutes" : "-1", "brokerPublisherThrottlingMaxMessageRate" : "0", "loadBalancerNamespaceBundleMaxMsgRate" : "30000", "bookkeeperClientHealthCheckIntervalSeconds" : "60", "replicationProducerQueueSize" : "1000", "loadBalancerNamespaceBundleMaxSessions" : "1000", "webSocketSessionIdleTimeoutMillis" : "300000", "allowAutoTopicCreationType" : "non-partitioned", "managedLedgerDefaultWriteQuorum" : "1", "zooKeeperSessionTimeoutMillis" : "30000", "statusFilePath" : "/usr/local/apache/htdocs", "exposePublisherStats" : "true", "bookkeeperClientSecondaryIsolationGroups" : "", "maxUnackedMessagesPerConsumer" : "50000", "replicationConnectionsPerBroker" : "16", "brokerServicePurgeInactiveFrequencyInSeconds" : "60", "journalMaxGroupWaitMSec" : "1", "brokerPublisherThrPottlingTickTimeMillis" : "50", "bookkeeperTLSTrustStorePasswordPath" : "", "dbStorage_rocksDB_bloomFilterBitsPerKey" : "10", "managedLedgerMetadataOperationsTimeoutSeconds" : "60", "webServicePort" : "8080", "brokerDeduplicationMaxNumberOfProducers" : "10000", "ledgerStorageClass" : "org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage", "diskUsageWarnThreshold" : "0.99", "brokerDeduplicationProducerInactivityTimeoutMinutes" : "360", "gcWaitTime" : "300000", "enableNonPersistentTopics" : "true", "loadBalancerNamespaceBundleMaxTopics" : "1000", "bookkeeperClientAuthenticationPlugin" : "", "managedLedgerCursorMaxEntriesPerLedger" : "50000", "indexDirectories" : "/home/anttkaik/Downloads/apache-pulsar-2.5.0-bin/index_dir", "brokerClientAuthenticationPlugin" : "", "bookkeeperDiskWeightBasedPlacementEnabled" : "false", "bookkeeperClientMinAvailableBookiesInIsolationGroups" : "", "authorizationAllowWildcardsMatching" : "false", "athenzDomainNames" : "", "dbStorage_rocksDB_writeBufferSizeMB" : "4", 
"managedLedgerDefaultMarkDeleteRateLimit" : "0.1", "managedLedgerMaxLedgerRolloverTimeMinutes" : "240", "proxyRoles" : "", "subscriptionExpiryCheckIntervalInMinutes" : "5", "maxUnackedMessagesPerBroker" : "0", "keepAliveIntervalSeconds" : "30", "managedLedgerNumWorkerThreads" : "4", "flushInterval" : "60000", "brokerClientAuthenticationParameters" : "", "loadBalancerReportUpdateThresholdPercentage" : "10", "managedLedgerCacheEvictionTimeThresholdMillis" : "1000", "dbStorage_rocksDB_sstSizeInMB" : "4", "brokerDeleteInactiveTopicsEnabled" : "true", "journalDirectory" : "/home/anttkaik/Downloads/apache-pulsar-2.5.0-bin/journal_dir", "diskUsageThreshold" : "0.99", "bookkeeperClientRackawarePolicyEnabled" : "true", "replicationMetricsEnabled" : "true", "managedLedgerMaxUnackedRangesToPersist" : "10000", "maxConcurrentTopicLoadRequest" : "5000", "enablePersistentTopics" : "true", "bookkeeperClientSpeculativeReadTimeoutInMillis" : "0", "bookkeeperTLSKeyFileType" : "PEM", "loadBalancerAutoUnloadSplitBundlesEnabled" : "true", "ttlDurationDefaultInSeconds" : "0", "backlogQuotaDefaultLimitGB" : "10", "ledgerDirectories" : "data/standalone/bookkeeper0", "dbStorage_rocksDB_blockSize" : "4096", "zooKeeperOperationTimeoutSeconds" : "30", "journalDirectories" : "data/standalone/bookkeeper0", "bookkeeperTLSClientAuthentication" : "false", "bookkeeperTLSKeyFilePath" : "", "managedLedgerMaxUnackedRangesToPersistInZooKeeper" : "1000", "allowLoopback" : "true", "dispatchThrottlingRatePerTopicInMsg" : "0", "maxUnackedMessagesPerSubscription" : "200000", "backlogQuotaCheckEnabled" : "true", "allowAutoTopicCreation" : "true", "failureDomainsEnabled" : "false", "bookkeeperClientIsolationGroups" : "", "managedLedgerUnackedRangesOpenCacheSetEnabled" : "true", "webSocketNumIoThreads" : "8", "configurationStoreServers" : "", "maxUnackedMessagesPerSubscriptionOnBrokerBlocked" : "0.16", "dbStorage_writeCacheMaxSizeMb" : "", "messageExpiryCheckIntervalInMinutes" : "5", "authorizationEnabled" : 
"false", "loadBalancerSheddingIntervalMinutes" : "1", "loadBalancerHostUsageCheckIntervalMinutes" : "1", "loadManagerClassName" : "org.apache.pulsar.broker.loadbalance.NoopLoadManager", "bookkeeperClientHealthCheckQuarantineTimeInSeconds" : "1800", "loadBalancerResourceQuotaUpdateIntervalMinutes" : "15" }``` ---- 2020-02-26 08:25:35 UTC - Antti Kaikkonen: Actually above also contains ```"ledgerDirectories" : "data/standalone/bookkeeper0" "journalDirectories" : "data/standalone/bookkeeper0"``` which replace my configuration. ---- 2020-02-26 08:39:54 UTC - Alex Yaroslavsky: @Sijie Guo <https://github.com/apache/pulsar/issues/6425> Thanks! ---- 2020-02-26 08:41:25 UTC - Alex Yaroslavsky: Cool, I will test this soon. Just all the documentation I could find always mention same tenant and same namespace and the command line allows only one. ---- 2020-02-26 08:46:04 UTC - Sijie Guo: @Antti Kaikkonen: <http://standalone.co|standalone> doesn’t take the bookkeeper configuration. You can change bookie directory by specifying `--bookkeeper-dir` when you start standalone. ---- 2020-02-26 08:46:14 UTC - Sijie Guo: `bin/pulsar standalone --bookkeeper-dir` ---- 2020-02-26 09:09:42 UTC - Antti Kaikkonen: Thanks. Is it possible to set a separate directory for journal and ledger? ----
