2020-09-03 10:18:55 UTC - Enrico Olivelli: Probably you have additional third-party dependencies that are not compatible with the current version?
----
2020-09-03 12:36:52 UTC - Marc Porst: @Marc Porst has joined the channel
----
2020-09-03 13:14:07 UTC - Takahiro Hozumi: Thank you for your information. I
will try it.
----
2020-09-03 13:17:22 UTC - Raghav: Addison - I got this error on the client. There were no errors reported in the broker logs at that point in time. I saw this while trying to check the maximum number of topics a broker can support. I started 1000 consumers on 1000 (non-partitioned) topics and there was no error on the consumers. When I started producing from the client, I kept seeing this error. I tried with various numbers of topics, as low as 1, but it was still the same error. When I stopped the consumers, producing started to work. (I tried producing on different namespaces as well.)
Do lookup requests also happen in the background for topics that are not active (no produce and no consume)? There were many other inactive topics on the brokers at the time I was testing (approximately 8k).
----
2020-09-03 13:30:29 UTC - Takahiro Hozumi: I've switched the client library from `org.apache.pulsar/pulsar-client:2.5.2` to `org.apache.pulsar/pulsar-client-original:2.5.2` and got the attached error. The code is the following, which worked with `pulsar-client:2.5.2`.
```(scala code)
import java.util.concurrent.CompletableFuture
import org.apache.avro.generic.GenericData
import org.apache.pulsar.client.api.{CompressionType, MessageId, Producer, PulsarClient}

val pClient: PulsarClient =
  PulsarClient.builder().serviceUrl("pulsar://localhost:6650").build()
// mySchema is a Schema[GenericData.Record] defined elsewhere
val pProducer: Producer[GenericData.Record] = pClient.newProducer(mySchema)
  .enableBatching(true)
  .compressionType(CompressionType.ZSTD)
  .producerName("myproducer1")
  .topic("persistent://mytenant/mynamespace/mytopic")
  .create()
val key: String = ...
val r: GenericData.Record = ...
val f: CompletableFuture[MessageId] = pProducer
  .newMessage()
  .key(key)
  .value(r)
  .sendAsync()```
What is the cause of the exception?
----
2020-09-03 14:34:13 UTC - Ravi Shah: Failed to create producer: Namespace is
being unloaded
----
2020-09-03 14:34:21 UTC - Ravi Shah: Anyone have an idea on this?
----
2020-09-03 14:34:21 UTC - Ravi Shah: ?
----
2020-09-03 14:34:31 UTC - Ravi Shah: When does it happen, and why?
----
2020-09-03 14:46:47 UTC - Addison Higham: This error occurs periodically when Pulsar is balancing load across brokers. It does this by "unloading" a namespace bundle so that the bundle can be reassigned to another broker.
It should be transparent in most cases and only last a few seconds. If it persists, it is likely either an issue with your cluster or a bug in Pulsar.
There have been numerous fixes in the latest releases of Pulsar that should eliminate that issue, but if you are still seeing it on the latest 2.6.1, please let us know!
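If upgrading the clients right away isn't possible, retrying producer creation with a short backoff should ride out the unload in most cases. A rough sketch with the Python client (service URL and topic are placeholders):
```import time
import pulsar

# Retry producer creation: "Namespace is being unloaded" is normally transient
# while a bundle is reassigned to another broker.
client = pulsar.Client('pulsar://localhost:6650')

producer = None
for attempt in range(5):
    try:
        producer = client.create_producer('persistent://public/default/my-topic')
        break
    except Exception as e:  # narrow this to the client's exception types in real code
        print(f"create_producer failed (attempt {attempt + 1}): {e}")
        time.sleep(2 ** attempt)  # back off while the bundle moves

if producer is None:
    raise RuntimeError("producer creation kept failing; check the broker logs")```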
----
2020-09-03 14:52:10 UTC - Ravi Shah: Thanks for the response. We are on 2.4.2
----
2020-09-03 15:01:21 UTC - Addison Higham: are you seeing it just momentarily?
or does it persist? if it is persisting, upgrading will certainly help
----
2020-09-03 15:01:49 UTC - Ravi Shah: This is what we have been seeing for the last two days
----
2020-09-03 17:04:28 UTC - Nicolas Lelouche: @Nicolas Lelouche has joined the
channel
----
2020-09-03 18:11:52 UTC - Nicolas Lelouche: hey all! Quick question: I am using the pulsar-spark-connector with PySpark to read structured Pulsar streams into a Spark application, but I am seeing some bizarre behavior.
In the code posted below, the spark-submit command used to start the Spark job fails with a NoSuchElementException if the createDataFrame line is removed.
Does anyone know what could cause this behavior? Huge thanks!
----
2020-09-03 18:11:59 UTC - Nicolas Lelouche: ```import sys

from pyspark.sql import SparkSession

if __name__ == "__main__":
    if len(sys.argv) != 4:
        print("""
        Usage: TestPyspark.py <service_url> <admin_url> <topics>
        """, file=sys.stderr)
        sys.exit(-1)

    serviceUrl = sys.argv[1]
    adminUrl = sys.argv[2]
    topics = sys.argv[3]

    spark = SparkSession\
        .builder\
        .appName("StructuredPulsarWordCount")\
        .getOrCreate()

    dt = spark.createDataFrame(
        [(1, 1.0), (1, 2.0), (2, 3.0), (2, 5.0), (2, 10.0), (5, 12.0)],
        ("id", "v"))

    # Create DataSet representing the stream of input lines from pulsar
    lines = spark\
        .readStream\
        .format("pulsar")\
        .option("service.url", serviceUrl)\
        .option("admin.url", adminUrl)\
        .option("topics", topics)\
        .load()\
        .selectExpr("CAST(value AS STRING)")

    query = lines\
        .writeStream\
        .outputMode('append')\
        .format('console')\
        .start()

    query.awaitTermination()```
----
2020-09-03 19:22:27 UTC - Stepan Mazurov: Has there been any effort toward automating tenant/namespace configuration? I know there is a REST API, but I was thinking of something higher level: Ansible, Terraform, something along those lines.
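For now I guess a thin script over the admin REST API would do the job; here is a rough sketch (assuming an unauthenticated broker at localhost:8080 and a cluster named "standalone", both placeholders):
```import requests

ADMIN = "http://localhost:8080/admin/v2"

def ensure_tenant(tenant, clusters):
    # PUT is close to idempotent here: 204 on create, 409 if the tenant already exists.
    r = requests.put(f"{ADMIN}/tenants/{tenant}",
                     json={"adminRoles": [], "allowedClusters": clusters})
    if r.status_code not in (204, 409):
        r.raise_for_status()

def ensure_namespace(tenant, namespace):
    r = requests.put(f"{ADMIN}/namespaces/{tenant}/{namespace}")
    if r.status_code not in (204, 409):
        r.raise_for_status()

ensure_tenant("my-tenant", ["standalone"])
ensure_namespace("my-tenant", "my-namespace")
# Namespace policies (retention, backlog quotas, ...) can be applied the same way
# through their respective endpoints.```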
----
2020-09-03 19:40:35 UTC - Evan Furman: @Sijie Guo do we need to explicitly
disable namespace metrics in order to get topic metrics?
----
2020-09-03 19:43:50 UTC - Thomas O'Neill: A terraform provider would be cool
----
2020-09-03 20:20:58 UTC - Evan Furman: @Addison Higham it looks like a bug in
`2.6.0` `pulsar-perf`. I only had to run for about 2-3 minutes to reproduce the
issue. However, when running `2.6.1` I was not able to reproduce.
----
2020-09-03 20:29:52 UTC - Addison Higham: interesting... I am curious what the
exact problem would be there
----
2020-09-03 20:35:13 UTC - Addison Higham: FYI, it isn't in pulsar-perf itself but a bug in the client in general (not quite sure which one yet...), so you will want to make sure to use 2.6.1 for your clients
----
2020-09-03 20:36:17 UTC - Evan Furman: Alright, will do.
----
2020-09-03 21:34:13 UTC - Sijie Guo: No I don’t think you need to disable
namespace metrics.
----
2020-09-03 21:35:31 UTC - Sijie Guo: Can you curl the `/metrics` endpoint on the broker? If I can get a copy of the metrics, I can help figure out the problem.
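For example, a quick Python equivalent of that curl, assuming the default web service port 8080 (adjust host/port for your deployment):
```import requests

# Brokers expose Prometheus-format metrics on the admin web port (8080 by default).
metrics = requests.get("http://localhost:8080/metrics").text
print(metrics[:2000])  # or write the full text to a file to share```
Topic-level samples only show up if `exposeTopicLevelMetricsInPrometheus` is enabled in broker.conf (true by default, if I remember right).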
----
2020-09-03 21:36:12 UTC - Evan Furman: I had to tear down the cluster, but I'll post one here when I have it
----
2020-09-03 22:36:33 UTC - Yarden Arane: *MESSAGE RETENTION*: Topic with no
subscriptions
Hi all, consider the following use case:
• A topic, _my_topic,_ has no current subscriptions
```./pulsar-admin topics stats public/default/my_topic
{
"msgRateIn" : 0.0,
"msgThroughputIn" : 0.0,
"msgRateOut" : 0.0,
"msgThroughputOut" : 0.0,
"bytesInCounter" : 0,
"msgInCounter" : 0,
"bytesOutCounter" : 0,
"msgOutCounter" : 0,
"averageMsgSize" : 0.0,
"msgChunkPublished" : false,
"storageSize" : 0,
"backlogSize" : 0,
"publishers" : [ ],
"subscriptions" : { },
"replication" : { },
"deduplicationStatus" : "Disabled"
}```
• The retention policy for the topic's namespace is set to unlimited
```./pulsar-admin namespaces get-retention public/default
{
"retentionTimeInMinutes" : -1,
"retentionSizeInMB" : -1
}```
• Produce a message to _my_topic_
```./pulsar-admin topics stats public/default/my_topic
{
"msgRateIn" : 0.0,
"msgThroughputIn" : 0.0,
"msgRateOut" : 0.0,
"msgThroughputOut" : 0.0,
"bytesInCounter" : 583,
"msgInCounter" : 1,
"bytesOutCounter" : 0,
"msgOutCounter" : 0,
"averageMsgSize" : 0.0,
"msgChunkPublished" : false,
"storageSize" : 583,
"backlogSize" : 0,
"publishers" : [ ],
"subscriptions" : { },
"replication" : { },
"deduplicationStatus" : "Disabled"
}```
• Once a message is produced, create a consumer and subscribe to _my_topic_
```./pulsar-client consume -s name -n 0 public/default/my_topic```
*Desired outcome:* once a subscription is made, the message stored on the broker should move to the backlog and subsequently be consumed.
*Actual outcome:* the message remains in storage and is not moved to the backlog; thus it is never seen by the consumer.
Am I missing something? *How do you deliver messages that are already in storage, and have never been acknowledged, to newly created subscriptions?*
----
2020-09-03 22:45:01 UTC - Ryan Nowacoski: subscriptionInitialPosition is set to Latest by default, so the consumer won't read any messages written before it starts. Set it to Earliest to get the behavior you are looking for.
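For example, with the Python client (a minimal sketch; the service URL and topic are placeholders):
```import pulsar

# Subscribe from the earliest available message so messages produced before the
# subscription existed are delivered, instead of the default Latest position.
client = pulsar.Client('pulsar://localhost:6650')
consumer = client.subscribe(
    'persistent://public/default/my_topic',
    subscription_name='name',
    initial_position=pulsar.InitialPosition.Earliest,
)
msg = consumer.receive()
print(msg.data())
consumer.acknowledge(msg)
client.close()```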
----
2020-09-03 23:08:21 UTC - Sijie Guo: ok
----
2020-09-03 23:57:08 UTC - Addison Higham: I saw you opened an issue, thank you, that does seem weird and unexpected :confused:
----