Slack digest for #general - 2020-06-19

Apache Pulsar Slack Fri, 19 Jun 2020 02:12:17 -0700

2020-06-18 09:14:48 UTC - Pedro Cardoso: Will the talk be recorded?
----
2020-06-18 09:15:09 UTC - Pushkar Sawant: The bookie still has 31468 ledgers on 
it
----
2020-06-18 09:51:58 UTC - Gilles Barbier: Hi! Apparently Pulsar-Manager uses a 
HerdDB database using Pulsar's BookKeeper and ZooKeeper. Does that mean we 
could use this database and fill it with Pulsar's data using a JDBC sink 
connector right away, given the right setup? Poke @Enrico Olivelli. Any link 
describing how to do it?
----
2020-06-18 10:31:16 UTC - Enrico Olivelli: The latest released version of 
PulsarManager comes with HerdDB and you can use it in standalone mode.
PulsarManager itself is not a distributed application, so there is no way to 
use HerdDB in embedded flavour + replication, because you would have a 
replicated WAL on BookKeeper but only 1 HerdDB node.


You can still set up a HerdDB cluster that uses Pulsar Bookies and Pulsar 
ZooKeeper. Then you configure PulsarManager to connect to that HerdDB cluster.

For the next version of PulsarManager we will upgrade the HerdDB dependency to 
0.16.0, which supports diskless-cluster mode, that is, storing the whole 
dataset (WAL + data pages) on BookKeeper.
This way the DB will be fully stored and replicated on existing Bookies and 
Zookeeper servers, no need for local persistent storage on PulsarManager 
machine/pod and you will get the benefits of high availability.

If you are interested in trying it out I will send a PR soon for the upgrade of 
PM to 0.16 .
----
2020-06-18 10:34:59 UTC - Gilles Barbier: Thx @Enrico Olivelli, is there some 
documentation somewhere describing how to set up a HerdDB cluster using Pulsar 
BookKeeper/ZooKeeper?
----
2020-06-18 10:36:20 UTC - Gilles Barbier: (Great for the 0.16 )
----
2020-06-18 10:37:44 UTC - Enrico Olivelli: 
<https://medium.com/streamnative/how-to-use-apache-pulsar-manager-with-herddb-dd265c955ca4>
----
2020-06-18 10:37:53 UTC - Enrico Olivelli: Hope that helps.
----
2020-06-18 10:38:01 UTC - Enrico Olivelli: Are you running on k8s / docker ?
----
2020-06-18 10:39:23 UTC - Enrico Olivelli: It is better to try it out in 
dev/staging env before doing it in production :slightly_smiling_face:
----
2020-06-18 10:40:04 UTC - Gilles Barbier: We are developing on a docker 
standalone currently. Not yet in production 
----
2020-06-18 10:42:30 UTC - Enrico Olivelli: I see.
Unfortunately we still do not provide a docker image for HerdDB.
But we will be happy to accept contributions.
Btw if you are going to use diskless cluster mode you will only have the 
container with PulsarManager; no need for additional containers for HerdDB
----
2020-06-18 11:03:40 UTC - Pushkar Sawant: I am slowly losing BookKeeper nodes. 
The number of underreplicated ledgers is going up. Right now I'm down to 2 
available nodes
----
2020-06-18 11:21:08 UTC - Pushkar Sawant: Only thing i see in other bookie logs 
is 
org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: 
Bookie handle is not available.
----
2020-06-18 11:43:40 UTC - Pushkar Sawant: There are timeouts. 3 bookies have 
started but are in an inactive state
----
2020-06-18 12:07:53 UTC - Matej Šaravanja: Hi, does anyone have a problem with 
this? I'm deploying the default helm chart to a kubernetes cluster. PVCs are 
created just fine, but bookies are instantly crashing and restarting every 
20-30 seconds due to this:
```ERROR org.apache.bookkeeper.bookie.Bookie - There are directories without a 
cookie, and this is neither a new environment, nor is storage expansion 
enabled. Empty directories are [/pulsar/data/bookkeeper/journal/current, 
/pulsar/data/bookkeeper/ledgers/current]```
btw, disk is almost empty
----
2020-06-18 12:44:13 UTC - Matej Šaravanja: And how is this possible?
```12:27:10.456 [LedgerDirsMonitorThread] WARN  
org.apache.bookkeeper.bookie.LedgerDirsMonitor - LedgerDirsMonitor check 
process: All ledger directories are non writable
12:27:10.465 [LedgerDirsMonitorThread] ERROR 
org.apache.bookkeeper.util.DiskChecker - Space left on device 
/pulsar/data/bookkeeper/ledgers/current : 9223371978995400704, Used space 
fraction: 2.0 > threshold 0.95.```
----
2020-06-18 12:54:34 UTC - Enrico Olivelli: Hey @Gilles Barbier please take a 
look at this PR, I hope it will be delivered with the next version of Pulsar 
Manager
<https://github.com/apache/pulsar-manager/pull/303>
----
2020-06-18 12:54:56 UTC - Gilles Barbier: I will, thx @Enrico Olivelli
----
2020-06-18 13:09:58 UTC - Gilles Barbier: Indeed, the ability to run HerdDB on 
a stateless pod seems very appealing in this context :slightly_smiling_face:
----
2020-06-18 13:39:41 UTC - Enrico Olivelli: yep.
In my company (emailsuccess.com) it is not very useful, but in other contexts 
we saw that it will greatly ease the deployment of simple applications like 
Pulsar Manager.
So we released this first version.
Any testing will be very much appreciated :slightly_smiling_face:
Feel free to open issues on github <https://github.com/diennea/herddb>
for whatever you need or find wrong
+1 : Gilles Barbier
----
2020-06-18 13:54:56 UTC - Penghui Li: Looks like your disk is almost full. You 
can check whether your topics have large backlogs or whether you have set data 
retention for some namespace.
----
2020-06-18 13:58:11 UTC - Matej Šaravanja: I've checked, my disk that 
bookkeeper is running on has usage of 1%
----
2020-06-18 13:58:28 UTC - Matej Šaravanja: and there are no topics, this is 
the first time Pulsar has been deployed
----
2020-06-18 13:59:12 UTC - Penghui Li: interesting
----
2020-06-18 13:59:23 UTC - Penghui Li: `2.0 > threshold 0.95`
----
2020-06-18 13:59:49 UTC - Matej Šaravanja: yes, I'm aware of that, but don't 
understand how that could be true
----
2020-06-18 13:59:58 UTC - Matej Šaravanja: I've set 100gb for both ledgers and 
journal
----
2020-06-18 14:00:18 UTC - Matej Šaravanja: is there some hidden variable that 
causes problem when creating new ledger or smth like that?
----
2020-06-18 14:01:42 UTC - Penghui Li: ```float checkDiskFull(File dir) throws DiskOutOfSpaceException, DiskWarnThresholdException {
        if (null == dir) {
            return 0f;
        }
        if (dir.exists()) {
            long usableSpace = dir.getUsableSpace();
            long totalSpace = dir.getTotalSpace();
            float free = (float) usableSpace / (float) totalSpace;
            float used = 1f - free;
            if (used > diskUsageThreshold) {
                LOG.error("Space left on device {} : {}, Used space fraction: {} > threshold {}.",
                        dir, usableSpace, used, diskUsageThreshold);
                throw new DiskOutOfSpaceException("Space left on device "
                        + usableSpace + " Used space fraction:" + used + " > threshold " + diskUsageThreshold, used);
            }
            // Warn should be triggered only if disk usage threshold doesn't trigger first.
            if (used > diskUsageWarnThreshold) {
                LOG.warn("Space left on device {} : {}, Used space fraction: {} > WarnThreshold {}.",
                        dir, usableSpace, used, diskUsageWarnThreshold);
                throw new DiskWarnThresholdException("Space left on device:"
                        + usableSpace + " Used space fraction:" + used + " > WarnThreshold:" + diskUsageWarnThreshold,
                        used);
            }
            return used;
        } else {
            return checkDiskFull(dir.getParentFile());
        }
    }```
----
2020-06-18 14:03:18 UTC - Penghui Li: Looks like `free` came out as `-1` 
(since `used = 1 - free` and the log shows `used` as 2.0)
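
For reference, the arithmetic in `checkDiskFull` can be reproduced outside the 
JVM. This is only a sketch in Python, not the BookKeeper code: `used_fraction` 
is a hypothetical helper, and the negative total is an assumed value chosen to 
match the log (the log prints the usable space but never the total).

```python
def used_fraction(usable_space: int, total_space: int) -> float:
    """Same calculation as the Java snippet: used = 1 - usable / total."""
    free = usable_space / total_space
    return 1.0 - free


# Usable space exactly as printed in the DiskChecker log line.
usable = 9223371978995400704

# Assumption: the reported total is negative with the same magnitude.
# The real value is not in the log; this only shows one way to reach 2.0.
assumed_total = -usable

print(used_fraction(usable, assumed_total))    # free is -1.0, so used is 2.0
print(used_fraction(99 * 2**30, 100 * 2**30))  # healthy disk, about 0.01
```

A used fraction above 1 requires `usable / total` to be negative, i.e. one of 
the two values reported for the volume is negative, so the numbers the volume 
reports are worth checking rather than the actual disk usage.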
----
2020-06-18 14:04:03 UTC - Matej Šaravanja: I'll check persistent volume claims 
once more
----
2020-06-18 14:04:04 UTC - Penghui Li: Which java version are you using?
----
2020-06-18 14:04:41 UTC - Penghui Li: Ok
----
2020-06-18 14:05:35 UTC - Matej Šaravanja: Funny thing happened. While we were 
discussing this, I've deployed it to another namespace and now everything seems 
to work fine
+1 : Penghui Li
----
2020-06-18 14:05:49 UTC - Matej Šaravanja: thanks for the help, I'll try this 
and keep you posted :slightly_smiling_face:
----
2020-06-18 14:07:15 UTC - Alex Yaroslavsky: Hi,
----
2020-06-18 14:08:44 UTC - Alex Yaroslavsky: A question, running Pulsar 2.5.2 in 
EKS, with two function workers (StatefulSet). After restarting the workers, 95% 
of functions are running on the first one. Shouldn't there be some balancing of 
functions between workers?
----
2020-06-18 14:11:35 UTC - Matej Šaravanja: And about java version:

openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)
----
2020-06-18 14:20:29 UTC - Penghui Li: 2.6.0 is released. There are several 
fixes and enhancements to the KeyShared subscription. @Ankur Jain you can check 
the release notes here <http://pulsar.apache.org/release-notes/#2.6.0> and 
download from <http://pulsar.apache.org/en/download/>
tada : Konstantinos Papalias
+1 : Ankur Jain
----
2020-06-18 14:34:44 UTC - Konstantinos Papalias: nice one, thanks for sharing 
@Penghui Li, it may be worth sharing on the main channel as well!
----
2020-06-18 15:14:12 UTC - David Kjerrumgaard: Error 1 looks like a DNS problem, 
where the Pulsar connector cannot resolve the URL of the Kinesis source
+1 : Renault
----
2020-06-18 15:44:09 UTC - Alexander Ursu: Hi, was just taking a look at the 
REST API 
(<http://pulsar.apache.org/admin-rest-api/?version=2.5.2&apiversion=v2#operation/grantPermissionOnNamespace>)
for granting permissions on a namespace specifically, how might the desired 
actions be passed in?
for reference I'm doing this in Python 3
```r = requests.post(base_url + f"namespaces/{ns}/permissions/my-role",
                  data="consume", headers=headers)```
this doesn't seem to have any effect
is there also any chance that the documentation can be improved to cover what 
and how stuff is passed in the request body?
----
2020-06-18 15:46:54 UTC - Matt Mitchell: Are there any test utilities available 
for java? Specifically, I’d like to test interactions with PulsarClient + 
consumers/producers. I was going down the path of just mocking everything via 
mockito, but hoping there’s something already pre-baked or an alternative 
similar to zookeepers testing/embedded server?
----
2020-06-18 15:47:48 UTC - Pushkar Sawant: The entire cluster failed. Had to 
rebuild the cluster
----
2020-06-18 15:48:28 UTC - Pushkar Sawant: Second time this has happened. When 
one of the bookies has an issue and is in recovery, the rest of the cluster 
crashes
----
2020-06-18 16:01:27 UTC - Ebere Abanonu: @Sijie Guo @Penghui Li why is this 
`public void getMessageByID`? I'm asking about the `void` specifically: isn't 
it meant to return a message? The REST API has a couple of other methods that 
return void.
----
2020-06-18 16:12:06 UTC - Sijie Guo: There is a setting in broker to delete 
inactive topic. If there are no producers, consumers and subscriptions, broker 
will treat a topic as inactive and delete it after a certain time. 
----
2020-06-18 16:12:43 UTC - Jesse Anderson: yes
----
2020-06-18 16:12:49 UTC - Sijie Guo: In case B, there is a subscription created 
by the function. So the topic is not inactive 
----
2020-06-18 16:13:29 UTC - Pavels Sisojevs: In case B, there is a producer 
created by the function. Can I stop it somehow?
----
2020-06-18 16:13:44 UTC - Sijie Guo: Are you reusing any existing pvcs?
----
2020-06-18 16:37:42 UTC - Prashanth Tirupachur Vasanthakrishnan: This helped me 
get a response using Python requests:

```import json
import requests

ns = "public/default"
r = requests.post(base_url + f"/admin/v2/namespaces/{ns}/permissions/my-role",
                  data=json.dumps(["consume"]),
                  headers={'content-type': 'application/json'})```

----
2020-06-18 16:40:23 UTC - Prashanth Tirupachur Vasanthakrishnan: You can use 
the status code (`r.status_code`) and the response body (`r.text`) to look 
into it further.
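
The request shape that worked above (a JSON array of actions in the body, with 
a JSON content type) can be sketched as a small builder. This is an 
illustrative sketch only: `build_grant_request` is a hypothetical helper, the 
`http://localhost:8080` base URL is an assumed value, and the standard-library 
`urllib` is swapped in for `requests` so the snippet runs without extra 
packages. It builds the request without sending it.

```python
import json
import urllib.request


def build_grant_request(base_url, tenant, namespace, role, actions):
    """Build (but do not send) the POST that grants `actions` to `role`."""
    url = f"{base_url}/admin/v2/namespaces/{tenant}/{namespace}/permissions/{role}"
    # The body is a JSON array of action names, e.g. ["consume"].
    body = json.dumps(list(actions)).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed values, for illustration only.
req = build_grant_request("http://localhost:8080", "public", "default",
                          "my-role", ["consume"])
print(req.get_method(), req.full_url)
print(req.data)

# To send it against a running broker:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

The first attempt in this thread sent the bare string `consume` with no JSON 
content type, which is the main difference from the request built here.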
----
2020-06-18 16:44:07 UTC - Matteo Merli: Yes
----
2020-06-18 16:46:36 UTC - Matteo Merli: @Sankararao Routhu Take a look at 
<https://github.com/apache/pulsar/blob/master/pulsar-proxy/src/main/java/org/apache/pulsar/proxy/server/ProxyConnection.java#L123>

That's where the connection gets accepted. That's where it should get closed if 
it's not allowed.
----
2020-06-18 16:48:37 UTC - Sijie Guo: producer will be stopped if the function 
is stopped.
----
2020-06-18 16:53:39 UTC - Logan B: The link for track 3 on the pulsar website 
is incorrect.

The link is for meeting # 8521331335, but it should be 85213313359
+1 : Julius S
----
2020-06-18 16:54:57 UTC - Alexander Ursu: I also saw the raw swagger.json which 
generates the docs, and I don't think there even is a definition for the 
request body, so now I doubt this endpoint even works and accepts it
----
2020-06-18 16:58:38 UTC - Ebere Abanonu: You can find tools to generate a 
client for your language from the swagger file. Look at the function used to 
create tenants and you will learn how to add a body to your request.
----
2020-06-18 17:01:33 UTC - Alexander Ursu: Ah I see. It's also more about just 
knowing what key/values are accepted for the namespace permissions endpoint. 
This could just be a lack of documentation
----
2020-06-18 17:08:31 UTC - Endre Karlson: Hi guys, is the PulsarCon recorded to 
Youtube?
----
2020-06-18 17:14:47 UTC - Oleg Kozlov: Perfect, thank you , that should work 
for us
----
2020-06-18 17:18:29 UTC - Pavels Sisojevs: exactly. In my case I have a router 
function which receives a message, fans out the message to a few other topics 
and never produces any messages to those topics again. Is there a way to force 
removal of the producer in the function? Of course, alternatively I could 
create a pulsar client inside the function (once, on initialisation), then 
create a producer on every message and close it myself. Not sure it's an 
efficient way of doing this though
----
2020-06-18 17:28:32 UTC - Jim Smith: @Jim Smith has joined the channel
----
2020-06-18 17:32:56 UTC - Sijie Guo: It is fixed now.
+1 : Logan B
----
2020-06-18 17:33:24 UTC - Sijie Guo: Yes it is recorded and will be uploaded to 
Youtube after the summit.
----
2020-06-18 17:34:21 UTC - Sijie Guo: which one are you mentioning here?
----
2020-06-18 17:34:49 UTC - anbutech17: Could someone please help me set up a 
pulsar project in PyCharm (I'm looking for a python + pulsar dev environment) 
to learn the pulsar APIs? I can see a lot of java maven project setups with 
the IntelliJ IDEA IDE. Please share some suggestions
heavy_plus_sign : Caito Scherr
----
2020-06-18 17:35:52 UTC - Sijie Guo: Are most of the functions using 
parallelism 1?
----
2020-06-18 17:38:07 UTC - Sijie Guo: I see. Can you create a github issue for 
it? We can see how to add the support.
+1 : Tamer
----
2020-06-18 17:38:18 UTC - Pavels Sisojevs: ok, will do
----
2020-06-18 17:41:01 UTC - Alex Yaroslavsky: @Sijie Guo yes, most are at the 
moment 
----
2020-06-18 18:00:13 UTC - Ebere Abanonu: `public void getMessageByID`
----
2020-06-18 18:00:48 UTC - Ebere Abanonu: What happens when I call that method?
----
2020-06-18 18:06:52 UTC - Sijie Guo: I mean in which class?
----
2020-06-18 18:06:58 UTC - Sijie Guo: sorry which library?
----
2020-06-18 18:25:28 UTC - Ebere Abanonu: `broker\admin\v2\PersistentTopics.java`
----
2020-06-18 18:25:56 UTC - Ebere Abanonu: ```
    @GET
    @Path("/{tenant}/{namespace}/{topic}/ledger/{ledgerId}/entry/{entryId}")
    @ApiOperation(value = "Get message by its messageId.")
    @ApiResponses(value = {
            @ApiResponse(code = 307, message = "Current broker doesn't serve the namespace of this topic"),
            @ApiResponse(code = 401, message = "Don't have permission to administrate resources on this tenant or" +
                    "subscriber is not authorized to access this operation"),
            @ApiResponse(code = 403, message = "Don't have admin permission"),
            @ApiResponse(code = 404, message = "Topic, subscription or the message position does not exist"),
            @ApiResponse(code = 405, message = "Skipping messages on a non-persistent topic is not allowed"),
            @ApiResponse(code = 412, message = "Topic name is not valid"),
            @ApiResponse(code = 500, message = "Internal server error"),
            @ApiResponse(code = 503, message = "Failed to validate global cluster configuration")})
    public void getMessageById(
            @Suspended final AsyncResponse asyncResponse,
            @ApiParam(value = "Specify the tenant", required = true)
            @PathParam("tenant") String tenant,
            @ApiParam(value = "Specify the namespace", required = true)
            @PathParam("namespace") String namespace,
            @ApiParam(value = "Specify topic name", required = true)
            @PathParam("topic") @Encoded String encodedTopic,
            @ApiParam(value = "The ledger id", required = true)
            @PathParam("ledgerId") long ledgerId,
            @ApiParam(value = "The entry id", required = true)
            @PathParam("entryId") long entryId,
            @ApiParam(value = "Is authentication required to perform this operation")
            @QueryParam("authoritative") @DefaultValue("false") boolean authoritative)```
----
2020-06-18 18:42:20 UTC - Sijie Guo: ```AsyncResponse is used for sending the 
response.```
----
2020-06-18 19:09:41 UTC - Pedro Cardoso: please share the link when it's 
available!
----
2020-06-18 19:29:08 UTC - Jesse Anderson: I think they'll be sending out an 
email once they're up on YouTube
----
2020-06-18 19:33:09 UTC - Daniel Kopeinig: @Daniel Kopeinig has joined the 
channel
----
2020-06-18 19:41:12 UTC - Addison Higham: huh, random question that I realize I 
don't know: how do `-Xmx` and `-XX:MaxDirectMemorySize` interact? I know if 
you don't set MaxDirectMemorySize it is a fraction of your heap, but if you set 
MaxDirectMemorySize is the total process size heap + direct memory? or does it 
carve that chunk out of the heap?
----
2020-06-18 19:43:44 UTC - Matteo Merli: If `-XX:MaxDirectMemorySize` is not 
set, the JVM will set it to the same as `-Xmx`.

The 2 limits are independent. So, if you have `-Xmx2G 
-XX:MaxDirectMemorySize=1G`, your process can take up to 3G. Well, actually 
that doesn't account for JVM internal memory usage (eg: compiler, class cache, 
GC state, etc.)
----
2020-06-18 19:45:48 UTC - Addison Higham: okay that is what I thought
----
2020-06-18 19:45:54 UTC - Addison Higham: but was struggling to find direct 
confirmation
----
2020-06-18 19:46:44 UTC - Addison Higham: which BTW, the old helm charts were 
really confusing about, they would have 16 GB of heap AND 16 GB of direct 
memory, but only asked for 16gb of memory from k8s, which is why I was suddenly 
questioning
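
Since the two limits are independent, the memory a container has to cover is 
at least their sum. A minimal sketch of that check follows; the function names 
are hypothetical, and the figures are the ones mentioned in this thread (2G 
heap + 1G direct, and the 16 GB + 16 GB helm chart case). JVM-internal 
overhead is deliberately left out, so the real requirement is higher.

```python
GIB = 1024 ** 3


def heap_plus_direct(heap_bytes: int, direct_bytes: int) -> int:
    """Memory the heap and direct pools can use together (JVM overhead excluded)."""
    return heap_bytes + direct_bytes


def fits_in_request(heap_bytes: int, direct_bytes: int, request_bytes: int) -> bool:
    """True if the container memory request covers heap + direct memory."""
    return heap_plus_direct(heap_bytes, direct_bytes) <= request_bytes


# -Xmx2G -XX:MaxDirectMemorySize=1G -> up to 3 GiB for the two pools.
print(heap_plus_direct(2 * GIB, 1 * GIB) // GIB)

# 16 GiB heap and 16 GiB direct memory against a 16 GiB request does not fit.
print(fits_in_request(16 * GIB, 16 * GIB, 16 * GIB))
```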
----
2020-06-18 19:48:28 UTC - Addison Higham: @Matteo Merli do you have more 
guidance on how much Pulsar should actually use in direct memory? I assume it 
is just netty? what about bookkeeper?
----
2020-06-18 19:52:19 UTC - Matteo Merli: Brokers --> netty pooling. If you 
have very high throughput, or many connections,  a larger amount of direct mem 
will help

Bookies --> BK is not relying on page cache, rather we prefer to allocate a 
"real" chunk of mem so that it's guaranteed to be mem and not get blocked on 
disk IO.

For that, BK uses direct for Netty IO, plus Write-Cache and ReadCache.

By default, these caches are set relative to the direct mem size (eg: 25% each)
----
2020-06-18 19:58:34 UTC - Markus Steininger: @Markus Steininger has joined the 
channel
----
2020-06-18 20:00:15 UTC - Matteo Merli: Again, if your bookie has a high IO 
rate, use sensible values for direct mem.

eg: at 300 MB/s of writes, you might want to have a write cache big enough to 
buffer for at least several seconds.

eg: for ~5 secs  that would be 2 GB. That would mean a direct mem size of 8G at 
least.
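
The sizing rule of thumb above can be written out as a short calculation. This 
is a sketch with hypothetical function names; the 25% cache fraction is the 
example default quoted earlier in the thread. Note that 300 MB/s for 5 seconds 
is 1500 MB, which the message rounds up to 2 GB before deriving 8G.

```python
def write_cache_mb(throughput_mb_per_s: float, buffer_seconds: float) -> float:
    """Write cache needed to buffer `buffer_seconds` of writes."""
    return throughput_mb_per_s * buffer_seconds


def direct_memory_mb(cache_mb: float, cache_fraction: float = 0.25) -> float:
    """Direct memory size for which `cache_mb` is `cache_fraction` of the total."""
    return cache_mb / cache_fraction


exact = write_cache_mb(300, 5)    # 1500 MB of buffered writes
print(exact, direct_memory_mb(exact))

rounded = 2048                    # the ~2 GB figure used in the message
print(direct_memory_mb(rounded))  # 8192 MB, i.e. 8G of direct memory
```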
----
2020-06-18 20:22:40 UTC - Jesse Anderson: 
tada : Shivji Kumar Jha, Sijie Guo, Caito Scherr
+1 : Karthik Ramasamy
----
2020-06-19 03:32:42 UTC - Addison Higham: Forgot to mention that was really 
helpful
----
2020-06-19 03:38:10 UTC - Addison Higham: Observed some interesting things 
today as I added a new workload: it wasn't a huge volume of messages compared 
to where we have peaked (about 30k msgs/sec) but it came from a lot more 
producers (~900). We routinely handle burst workloads of 25-30k msgs/sec but 
from only a handful of producers. This load from many producers was much much 
more stressful on the cluster. Which I would expect, but just surprised to the 
degree
----
2020-06-19 03:38:57 UTC - Matteo Merli: Probably due to much less batching on 
the producers side
----
2020-06-19 03:39:09 UTC - Addison Higham: Which is making me wonder why. 
Certainly servicing all those sockets is part of it (and I wonder if I need to 
tune that)
----
2020-06-19 03:39:40 UTC - Addison Higham: I will need to look, maybe I might 
need to increase batch timeout a bit
----
2020-06-19 03:40:23 UTC - Matteo Merli: You can compare the message publish 
rate with bk entry write (for a given topic/namespace)
----
2020-06-19 03:40:46 UTC - Matteo Merli: the ratio between the 2 would be the 
avg number of messages per batch
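
As a sketch of that comparison: the helper name and the entry-write rates 
below are made up for illustration; only the 30k msgs/sec publish rate comes 
from this thread.

```python
def avg_messages_per_batch(publish_rate: float, entry_write_rate: float) -> float:
    """Messages published per second divided by BK entries written per second."""
    if entry_write_rate <= 0:
        raise ValueError("entry_write_rate must be positive")
    return publish_rate / entry_write_rate


# Hypothetical entry rates for the same 30k msgs/sec publish rate.
print(avg_messages_per_batch(30_000, 1_500))   # few producers, large batches: 20.0
print(avg_messages_per_batch(30_000, 25_000))  # many producers, tiny batches: 1.2
```

A ratio close to 1 would mean nearly every message becomes its own BookKeeper 
entry, which is consistent with the "much less batching" explanation.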
----
2020-06-19 03:42:42 UTC - Addison Higham: Good idea, will compare the two. I 
think part of the problem was also that the hashing on my 5 partitions seemed 
somewhat unlucky: it split all the way up to 96 bundles, and then I added more 
partitions and it smoothed out :shrug:
----
2020-06-19 04:06:41 UTC - Jeff Schneller: Are there binaries for the c++ client 
on windows?  I only see Linux and MacOS.  Looking to save time by not having 
to build it myself.
----
2020-06-19 06:02:40 UTC - Joe Francis: @Matteo Merli it won't be a bad idea to 
have a bulk API for addentry - if striping is not in use
----
2020-06-19 06:07:16 UTC - Matteo Merli: it's not that easy. Batching is very 
efficient because we offload all the work to the clients. brokers and bookies 
are not looking into it.

Having a bulk add-entry that broker triggers vs sending multiple RPC 
(pipelined) won't be saving a huge amount of CPU (my expectation)
----
2020-06-19 06:27:58 UTC - Joe Francis: Not batching - bulk add API for BK  all 
the way to device
----
