2019-03-15 09:52:51 UTC - Kev Jackson: @Sijie Guo that sounds like a plan: 
start in bk and add consul support in the same vein as etcd; then, if that 
looks like it will work, move up the stack
----
2019-03-15 10:19:35 UTC - jia zhai: @Sanjeev Kulkarni Does any log output 
from the class `org.apache.bookkeeper.stream.server.StorageServer` also 
show up in bookie1.log?
----
2019-03-15 10:29:40 UTC - Shivji Kumar Jha: When I run a test, it fails with 
this exception
Caused by: com.github.dockerjava.api.exception.NotFoundException: 
{"message":"pull access denied for apachepulsar/pulsar-test-latest-version, 
repository does not exist or may require 'docker login'"}
----
2019-03-15 10:30:32 UTC - Sijie Guo: pulsar doesn’t publish the 
pulsar-test-latest-version image.

you can run `mvn clean install -DskipTests -Pdocker` to produce a test image 
locally.
----
2019-03-15 10:31:23 UTC - Shivji Kumar Jha: ack, thanks!
----
2019-03-15 10:33:12 UTC - Shivji Kumar Jha: I have a patch but I need a quick 
way to start the server from source and run my test. Ideas on quick hacks to 
start broker server from source?
----
2019-03-15 10:41:22 UTC - Alexandre DUVAL: Do not hesitate to ask if I can 
help. @Sijie Guo this patch doesn't add the header :p
----
2019-03-15 10:47:23 UTC - Shivji Kumar Jha: @Sijie Guo
----
2019-03-15 11:04:27 UTC - Sijie Guo: are you testing a change on client side or 
server side?
----
2019-03-15 11:05:13 UTC - Sijie Guo: @Alexandre DUVAL :slightly_smiling_face:
----
2019-03-15 11:05:24 UTC - Sijie Guo: is your admin talking to a broker or a 
proxy?
----
2019-03-15 11:05:50 UTC - Alexandre DUVAL: I tried directly to one broker
----
2019-03-15 11:06:06 UTC - Alexandre DUVAL: Should I go through proxy node?
----
2019-03-15 11:17:23 UTC - Shivji Kumar Jha: broker side - 
SchemaUpdateStrategyTest
----
2019-03-15 11:18:44 UTC - Shivji Kumar Jha: I am introducing an option to 
disable the schema update compatibility check on the broker side. The test for 
it will go in this test file, and then I have to run this test.
----
2019-03-15 11:22:45 UTC - Shivji Kumar Jha: @Sijie Guo
----
2019-03-15 11:53:05 UTC - Sijie Guo: oh i see. if you are reusing 
SchemaUpdateStrategyTest, unfortunately you have to generate a test image 
locally :-)
----
2019-03-15 11:54:16 UTC - Sijie Guo: oh you can write a separate test under 
pulsar-broker module, so you can use some mocking classes
----
2019-03-15 12:26:11 UTC - Darragh: all our commands were run with -b 0 from the 
start
----
2019-03-15 12:37:41 UTC - Alexandre DUVAL: Same issue.
----
2019-03-15 12:57:48 UTC - Darragh: although I've managed to get rid of most 
tail latencies, there are still some spikes going to 200+ms
----
2019-03-15 15:23:33 UTC - Matteo Merli: how frequent are the spikes? What 
percentile do they affect?
----
2019-03-15 15:24:51 UTC - Matteo Merli: Typically there are 2 sources of 
latency spikes:
 1. Disk writes stalling for 100 ms or so (this happens on SSDs with or without 
fsyncs)
 2. JVM GC pauses
----
2019-03-15 15:25:16 UTC - Matteo Merli: for 1. doing w=3 a=2 will be able to 
smooth the latency
----
2019-03-15 15:26:13 UTC - Matteo Merli: for 2. same as above, w=3 a=2 will 
smooth the latency for bookie GC pauses, although broker (and client) pauses 
will still be there
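
A rough sketch of why acking on 2 out of 3 writes helps the tail. The numbers are assumptions for illustration only: each bookie is taken to stall independently for 100 ms on 1% of writes and to answer in 1 ms otherwise.

```python
import random

def ack_latency(bookie_latencies_ms, ack_quorum):
    """An add completes once `ack_quorum` bookies have acknowledged,
    so its latency is the ack_quorum-th fastest bookie response."""
    return sorted(bookie_latencies_ms)[ack_quorum - 1]

def simulate_p99(write_quorum, ack_quorum, n=100_000, stall_prob=0.01, seed=42):
    """99th percentile add latency under the assumed stall model."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Hypothetical model: 100 ms stall with probability stall_prob, else 1 ms.
        latencies = [100.0 if rng.random() < stall_prob else 1.0
                     for _ in range(write_quorum)]
        samples.append(ack_latency(latencies, ack_quorum))
    samples.sort()
    return samples[n * 99 // 100]

p99_wait_all = simulate_p99(write_quorum=3, ack_quorum=3)
p99_wait_two = simulate_p99(write_quorum=3, ack_quorum=2)
print(p99_wait_all, p99_wait_two)
```

With a=3 any single stalled bookie delays the add, while with a=2 two bookies have to stall at the same time, which is much rarer under this model. It says nothing about broker or client pauses.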
----
2019-03-15 15:31:49 UTC - Darragh: it's for the 99th percentile range
----
2019-03-15 15:31:54 UTC - Darragh: 50 etc is fine
----
2019-03-15 15:32:27 UTC - Matteo Merli: (for improving GC pauses you should 
consider using Shenandoah or ZGC in Java 11)
----
2019-03-15 15:32:44 UTC - Darragh: hm ok currently we've been using java8
----
2019-03-15 15:33:11 UTC - Matteo Merli: if you’re on RHEL/CentOS that would 
come with Shenandoah
----
2019-03-15 15:33:30 UTC - Darragh: we're using amazon linux 2
----
2019-03-15 15:33:35 UTC - Darragh: so it's rhel based iirc
----
2019-03-15 15:33:56 UTC - Matteo Merli: Yes, I think it does have it by default
----
2019-03-15 15:34:13 UTC - Darragh: ok I'll try that then
----
2019-03-15 15:34:22 UTC - Darragh: we already are using w=3 a=2 as the default
----
2019-03-15 15:34:29 UTC - Darragh: in our broker conf
----
2019-03-15 15:34:45 UTC - Matteo Merli: Ok, can you then correlate the latency 
spikes with the GC pauses?
----
2019-03-15 15:35:03 UTC - Darragh: I'll have to recheck with grafana
----
2019-03-15 15:35:44 UTC - Darragh: and these would be gc pauses on the bookies 
right ?
----
2019-03-15 15:38:21 UTC - Darragh: I don't really see any spike showing up in 
the GC pauses graph
----
2019-03-15 15:41:04 UTC - Darragh: 
----
2019-03-15 15:41:49 UTC - Darragh: this is with -r 10000 -b 0
----
2019-03-15 15:42:06 UTC - Darragh: I'll try with java11 next week I guess
----
2019-03-15 15:44:03 UTC - Darragh: broker seems to have had some gc pause spikes
----
2019-03-15 15:44:51 UTC - Darragh: just 2 though and I've seen more latency 
spikes
----
2019-03-15 15:48:05 UTC - Darragh: yeah I see a pattern in the broker's GC 
spikes: roughly every ~1.40 minutes they reach 300/400ms
----
2019-03-15 15:48:16 UTC - Shivji Kumar Jha: Hi, I am running a test 
(SchemaUpdateStrategyTest) in debug mode and while the server is running I wish 
to check something using the REST APIs. Let's say
curl -X GET <http://localhost:32783/admin/v2/brokers/:cluster>

I can't get the curl working... wrong broker url? Please help!
----
2019-03-15 15:48:41 UTC - Shivji Kumar Jha: I see the test starts pulsar using 
docker, here is my docker ps
----
2019-03-15 15:48:47 UTC - Shivji Kumar Jha: 
----
2019-03-15 15:49:46 UTC - Shivji Kumar Jha: The docker thing worked well 
actually, thank you very much :slightly_smiling_face:
----
2019-03-15 15:55:50 UTC - Alexandre DUVAL: @Matteo Merli did you have time to 
work on it? Can I help you?
----
2019-03-15 15:56:29 UTC - Matteo Merli: Started but don’t have solution yet. 
Trying to get this completed today
----
2019-03-15 15:56:59 UTC - Alexandre DUVAL: Cool, do not hesitate to ping me for 
anything or when it's done :stuck_out_tongue:.
----
2019-03-15 15:57:00 UTC - Alexandre DUVAL: Thanks
----
2019-03-15 15:58:10 UTC - Matteo Merli: Yes, publishing without batching is 
more intensive on GC. If you use a 1ms batching time, it would reduce that 
since the broker will only deal in “batches”.

Another option is to increase the JVM heap size to make the pauses less frequent
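
Back-of-the-envelope arithmetic for the batching suggestion, as a sketch only. It assumes a single producer at a steady rate (10000 msg/s, matching the `-r 10000` run mentioned earlier) and ignores any cap on messages or bytes per batch; the function name is made up for illustration.

```python
def broker_entries_per_second(msg_rate, batch_delay_ms):
    """Approximate number of entries per second the broker handles for one
    producer. batch_delay_ms == 0 stands for batching disabled, i.e. one
    entry per message. Batch size limits are deliberately ignored."""
    if batch_delay_ms == 0:
        return msg_rate
    batches_per_second = 1000 / batch_delay_ms
    # A producer cannot send more batches than it has messages.
    return min(msg_rate, batches_per_second)

unbatched = broker_entries_per_second(10_000, 0)
batched = broker_entries_per_second(10_000, 1)
print(unbatched, batched, unbatched / batched)
```

Under these assumptions a 1 ms group time packs about 10 messages into each entry, so the broker allocates and tracks roughly a tenth as many objects per second.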
----
2019-03-15 15:59:44 UTC - Sanjeev Kulkarni: Hey @jia zhai i resolved the issue. 
it had to do with specifying the right quorum
+1 : jia zhai
----
2019-03-15 16:10:08 UTC - Maarten Tielemans: Thanks for the feedback @Matteo 
Merli We'll look into Java11, prob Monday. We will also do some testing with 
smaller msg/sec rate and bigger msg size. Hopefully those changes will resolve 
the last spikes
----
2019-03-15 16:28:48 UTC - Matteo Merli: I’d say the easiest fix might be to 
enable batching with 1ms group time
----
2019-03-15 18:18:40 UTC - Joe Francis: :+1: A few points to note though --> 
Pulsar use of ZK is very different from Kafka. Pulsar does not allow clients to 
access ZK. --> It would also be good to split the metadata storage and 
cluster management functions so that they can use separate services.
----
2019-03-15 18:48:07 UTC - Ali Ahmed: <http://localhost:32783/> is the bookie 
url, you need to connect to localhost:32788
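
One way to avoid guessing the host port is to read it from the PORTS column of `docker ps`. A small sketch; the sample PORTS string below is hypothetical (only 32788 comes from this thread), and 8080 is assumed to be the broker's HTTP port inside the container.

```python
import re

def host_port_for(ports_column, container_port):
    """Return the host port mapped to `container_port` in a `docker ps`
    PORTS string, or None if there is no such mapping."""
    for mapping in ports_column.split(","):
        match = re.search(r":(\d+)->(\d+)/tcp", mapping)
        if match and int(match.group(2)) == container_port:
            return int(match.group(1))
    return None

# Hypothetical PORTS value for the broker container.
ports = "0.0.0.0:32788->8080/tcp, 0.0.0.0:32787->6650/tcp"
admin_port = host_port_for(ports, 8080)
admin_url = f"http://localhost:{admin_port}/admin/v2/brokers/health"
print(admin_url)
```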
----
2019-03-15 19:49:33 UTC - JAYARAM NAGARAJAN: Hello:
Facing a couple of issues with S3 offloading, as follows:

Setup broker.conf:
managedLedgerOffloadDriver=aws-s3
s3ManagedLedgerOffloadBucket=pulsar-topic-offload=<ourBucket>/temp/pulsar
s3ManagedLedgerOffloadRegion=us-east-1

 1) Auto offload issue
Set 1M as size for Threshold
        bin/pulsar-admin namespaces set-offload-threshold --size 1M nest/cicd

Sent a file more than 1Mb to the topic (went fine):
        bin/pulsar-client produce <persistent://nest/cicd/coaf_sql_revup>  -f 
licenseNew

Tried checking the offload status, but the offload did not run
        bin/pulsar-admin topics offload-status nest/cicd/coaf_sql_revup
        Offload has not been run for <persistent://nest/cicd/coaf_sql_revup> 
since broker startup

Repeated the process many times so the size increased to more than 15 MB, 
still no luck....

2) Manual offload issue
bin/pulsar-admin topics offload --size-threshold 1M nest/cicd/coaf_sql_revup
Offload triggered for <persistent://nest/cicd/coaf_sql_revup> for messages 
before 65:0:-1
[root@ip-10-207-192-140 apache-pulsar-2.3.0]# bin/pulsar-admin topics 
offload-status nest/cicd/coaf_sql_revup
Offload was a success

  Though I get success here, we do not see anything in the S3 path 
s3://<ourBucket>/test/pulsar. We expected some offloaded file to show up

  Please let me know what needs to be done....
----
2019-03-15 20:02:49 UTC - David Kjerrumgaard: @JAYARAM NAGARAJAN How are you 
providing your AWS credentials to Pulsar?
----
2019-03-15 20:02:52 UTC - David Kjerrumgaard: "To be able to access AWS S3, you 
need to authenticate with AWS S3. Pulsar does not provide any direct means of 
configuring authentication for AWS S3, but relies on the mechanisms supported 
by the DefaultAWSCredentialsProviderChain."
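
A quick, simplified way to see which of the common credential sources is visible on the broker host. This is only a sketch: it checks environment variables and the shared credentials file and otherwise assumes a container or instance role; it skips the other sources the real chain consults (such as Java system properties), and the function name is made up.

```python
import os

def likely_credential_source(env, home):
    """Guess which credential source would be picked up first.
    Simplified on purpose; not a reimplementation of the AWS SDK chain."""
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment variables"
    if os.path.isfile(os.path.join(home, ".aws", "credentials")):
        return "profile file"
    return "container or instance role"

# Run as the same user, and with the same environment, as the Pulsar process.
print(likely_credential_source(os.environ, os.path.expanduser("~")))
```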
----
2019-03-15 20:21:44 UTC - JAYARAM NAGARAJAN: @David Kjerrumgaard Our EC2 
instance has an IAM role with full access to my S3 bucket, and since the status 
it gave me was "Offload was a success" I am guessing that it got the required 
authentication... otherwise I should have seen an error...
----
2019-03-15 21:02:49 UTC - David Kjerrumgaard: @JAYARAM NAGARAJAN Are there any 
messages in the log file?
----
2019-03-15 21:04:32 UTC - David Kjerrumgaard: @JAYARAM NAGARAJAN Also, does the 
bucket path in S3 already exist? i.e. is there a 
`<ourBucket>/test/pulsar` inside S3 currently?
----
2019-03-15 21:10:23 UTC - JAYARAM NAGARAJAN: @David Kjerrumgaard We started 
this in standalone pulsar mode, and when we ran the CLI commands as above we 
did not see any log file. Is there a place to see the logs getting written? 
Also YES, the S3 path exists; we have some sample files in the path 
<ourBucket>/test/pulsar currently
----
2019-03-15 21:11:10 UTC - Ali Ahmed: @JAYARAM NAGARAJAN is this a standalone 
container ?
----
2019-03-15 21:13:26 UTC - JAYARAM NAGARAJAN: @Ali Ahmed This is not a 
container but a bare metal EC2 single node, and I am running pulsar as 
standalone on it
----
2019-03-15 21:14:56 UTC - Ali Ahmed: if you used the pulsar standalone can you 
check the stdout from the process
----
2019-03-15 21:15:25 UTC - Ali Ahmed: also can you try with a root level empty 
s3 bucket
----
2019-03-15 21:15:37 UTC - Ali Ahmed: so that we can better isolate the problem
----
2019-03-15 22:30:49 UTC - JAYARAM NAGARAJAN: This is what I am getting in the 
logs now. We updated the S3 bucket to just the root bucket 
<our_bucket> and re-ran ... now offloading does not work
----
2019-03-15 22:34:11 UTC - Ali Ahmed: @JAYARAM NAGARAJAN this makes more sense. 
There may still be a permission issue; I can’t tell from the logs and will have 
to check the code.
----
2019-03-16 00:25:26 UTC - David Kjerrumgaard: Based on the log output, it looks 
like you are using the 
`org.apache.bookkeeper.mledger.impl.NullLedgerOffloader`, which only happens if 
the offload driver property isn't set
----
2019-03-16 00:34:26 UTC - David Kjerrumgaard: So for some reason, Pulsar is not 
seeing the `managedLedgerOffloadDriver` setting in your broker.conf file.
----
2019-03-16 00:37:14 UTC - David Kjerrumgaard: @JAYARAM NAGARAJAN Make sure you 
only have one setting for the `managedLedgerOffloadDriver` property in your 
broker.conf and restart. Look for the following error message in the log file, 
which indicates that Pulsar was unable to find a value for that property: `No 
ledger offloader configured, using NULL instance`
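
A small sketch for spotting duplicate or empty settings of a property in a conf file. The sample text is made up; in practice you would read the file the process actually loads (a standalone process normally reads `conf/standalone.conf` rather than `conf/broker.conf`, which is worth checking here).

```python
def find_settings(conf_text, key):
    """Return (line_number, value) for every uncommented `key=value` line."""
    hits = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = stripped.partition("=")
        if sep and name.strip() == key:
            hits.append((lineno, value.strip()))
    return hits

# Hypothetical file content with the property set twice, once empty.
sample = """\
# offload settings
managedLedgerOffloadDriver=
s3ManagedLedgerOffloadRegion=us-east-1
managedLedgerOffloadDriver=aws-s3
"""
hits = find_settings(sample, "managedLedgerOffloadDriver")
print(hits)
```

More than one hit, or a single hit with an empty value, would be worth fixing before restarting.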
----
2019-03-16 07:16:51 UTC - naga: Guys...how about making this aws managed service
----
2019-03-16 07:17:27 UTC - naga: I can volunteer to manage this project
----
2019-03-16 07:51:29 UTC - Ali Ahmed: @naga the focus is to make pulsar work 
well with kubernetes, as that seems to be where the cloud providers are moving
----
2019-03-16 08:06:09 UTC - naga: Yeah... good then... 
----
2019-03-16 08:06:22 UTC - naga: Any idea of when this would be available
----
2019-03-16 08:18:05 UTC - Shivji Kumar Jha: Doesn't work either. There is 
something that's been eluding me since yesterday :thinking_face:
----
2019-03-16 09:06:46 UTC - Shivji Kumar Jha: Though I can go inside docker and 
get a response:
root@pulsar-broker-1:/pulsar# curl  
<http://pulsar-broker-1:8080/admin/v2/brokers/health>
ok
----