xuesongxs opened a new issue, #15559:
URL: https://github.com/apache/pulsar/issues/15559

   **Describe the bug**
   Pulsar v2.8.1
   Pulsar cluster: 3 brokers
   The producer can't continue sending messages after all brokers are restarted.
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Create producer
   ```
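   import java.util.HashMap;
   import java.util.Map;
   
   import org.apache.pulsar.client.api.Producer;
   import org.apache.pulsar.client.api.PulsarClient;
   import org.apache.pulsar.client.api.Schema;
   
   public class PulsarProducerDemo3 {
       // Connect to the cluster brokers
       private static String localClusterUrl = "pulsar://127.0.0.1:6650,127.0.0.1:6651,127.0.0.1:6652";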
   
       public static void main(String[] args) {
           try {
               Producer<String> producer = getProducer();
   
               Long start = System.currentTimeMillis();
               int i = 0;
               while (i < 50000) {
                   producer.send(i + "");
                   System.out.println("send msg:" + i + "");
                   i++;
               }
           } catch (Exception e) {
               System.err.println("send fail:" + e);
           }
       }
   
       public static Producer<String> getProducer() throws Exception {
           PulsarClient client;
           Map<String, Object> prop = new HashMap<>();
           prop.put("topicName", "persistent://public/default/test-string3");
           client = PulsarClient.builder().serviceUrl(localClusterUrl).build();
           Producer<String> producer = client.newProducer(Schema.STRING)
                   .loadConf(prop)
                   .create();
           return producer;
       }
   }
   ```
   2. Run producer
   3. After sending 100 messages, stop all brokers
   4. See producer's log
   ```
   [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/test-string3-partition-2] [pulsar-cluster-10-0] Reconnecting after connection was closed
   [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to 127.0.0.1:6650 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:6650
   [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/test-string3-partition-2] [pulsar-cluster-10-0] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:6650
   [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/test-string3-partition-2] [pulsar-cluster-10-0] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:6650 -- Will try again in 1.496 s
   [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/test-string3-partition-2] [pulsar-cluster-10-0] Reconnecting after connection was closed
   [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to 172.32.149.123:16650 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /172.32.149.123:16650
   [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/test-string3-partition-2] [pulsar-cluster-10-0] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:6651
   [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/test-string3-partition-2] [pulsar-cluster-10-0] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:6651 -- Will try again in 3.193 s
   [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/test-string3-partition-1] [pulsar-cluster-11-0] Reconnecting after connection was closed
   [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to 127.0.0.1:6652 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:6652
   [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/test-string3-partition-1] [pulsar-cluster-11-0] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:6652
   [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/test-string3-partition-1] [pulsar-cluster-11-0] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:6652 -- Will try again in 2.97 s
   ```
   5. Start all brokers
   6. See producer's log
   ```
   [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConnectionPool - [[id: 0x7fa30072, L:/127.0.0.1:50232 - R:/127.0.0.1:6650]] Connected to server
   [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ProducerImpl - [persistent://public/default/test-string3-partition-2] [pulsar-cluster-14-0] Creating producer on cnx [id: 0x7fa30072, L:/127.0.0.1:50232 - R:/127.0.0.1:16651]
   [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/test-string3-partition-0] [pulsar-cluster-14-1] Reconnecting after connection was closed
   [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConnectionPool - [[id: 0xa49808e4, L:/127.0.0.1:58935 - R:/127.0.0.1:6651]] Connected to server
   [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ProducerImpl - [persistent://public/default/test-string3-partition-0] [pulsar-cluster-14-1] Creating producer on cnx [id: 0xa49808e4, L:/127.0.0.1:58935 - R:/127.0.0.1:6652]
   ```
   The producer is created successfully again, but it can't continue sending messages after all brokers are restarted.
   7. Run jstack on the producer's PID
   ```
   
   Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.251-b08 mixed mode):
   
   "Attach Listener" #13 daemon prio=9 os_prio=0 tid=0x00007fc510001000 nid=0x4dc2 waiting on condition [0x0000000000000000]
      java.lang.Thread.State: RUNNABLE
   
   "DestroyJavaVM" #12 prio=5 os_prio=0 tid=0x00007fc53c009800 nid=0x4c40 waiting on condition [0x0000000000000000]
      java.lang.Thread.State: RUNNABLE
   
   "pulsar-external-listener-3-1" #11 prio=5 os_prio=0 tid=0x00007fc4f40b3800 nid=0x4c95 waiting on condition [0x00007fc5185f5000]
      java.lang.Thread.State: WAITING (parking)
           at sun.misc.Unsafe.park(Native Method)
           - parking to wait for  <0x000000078f4085b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
           at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
           at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
           at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)
           at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
           at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at org.apache.pulsar.shade.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
           at java.lang.Thread.run(Thread.java:748)
   
   "pulsar-timer-5-1" #10 prio=5 os_prio=0 tid=0x00007fc4f4065800 nid=0x4c4c waiting on condition [0x00007fc519217000]
      java.lang.Thread.State: TIMED_WAITING (sleeping)
           at java.lang.Thread.sleep(Native Method)
           at org.apache.pulsar.shade.io.netty.util.HashedWheelTimer$Worker.waitForNextTick(HashedWheelTimer.java:566)
           at org.apache.pulsar.shade.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:462)
           at org.apache.pulsar.shade.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
           at java.lang.Thread.run(Thread.java:748)
   
   "pulsar-client-io-1-1" #8 prio=5 os_prio=0 tid=0x00007fc53c40b800 nid=0x4c4b runnable [0x00007fc52c18a000]
      java.lang.Thread.State: RUNNABLE
           at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
           at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
           at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
           at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
           - locked <0x000000078f300d50> (a org.apache.pulsar.shade.io.netty.channel.nio.SelectedSelectionKeySet)
           - locked <0x000000078f300d68> (a java.util.Collections$UnmodifiableSet)
           - locked <0x000000078f300d08> (a sun.nio.ch.EPollSelectorImpl)
           at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
           at org.apache.pulsar.shade.io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:62)
           at org.apache.pulsar.shade.io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:814)
           at org.apache.pulsar.shade.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:457)
           at org.apache.pulsar.shade.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
           at org.apache.pulsar.shade.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
           at org.apache.pulsar.shade.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
           at java.lang.Thread.run(Thread.java:748)
   
   "Service Thread" #7 daemon prio=9 os_prio=0 tid=0x00007fc53c0d4000 nid=0x4c49 runnable [0x0000000000000000]
      java.lang.Thread.State: RUNNABLE
   
   "C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007fc53c0b7800 nid=0x4c48 waiting on condition [0x0000000000000000]
      java.lang.Thread.State: RUNNABLE
   
   "C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007fc53c0b4800 nid=0x4c47 waiting on condition [0x0000000000000000]
      java.lang.Thread.State: RUNNABLE
   
   ```
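
A possible reading of the dump above, offered as a hypothesis rather than a confirmed diagnosis: there is no `main` thread in it and a `DestroyJavaVM` thread is present, which normally means `main()` has already returned. In the reproducer the `try`/`catch` wraps the whole `while` loop, so the first exception thrown by `producer.send(...)` during the outage would end the loop for good. The sketch below shows one way to keep sending across transient failures. `Sender`, `sendWithRetry`, and the backoff values are hypothetical names and numbers chosen for illustration; `Sender` stands in for `producer::send` so that the example runs without a Pulsar dependency.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RetryingSendSketch {

    /** Hypothetical stand-in for a blocking send such as producer::send. */
    @FunctionalInterface
    public interface Sender<T> {
        void send(T message) throws Exception;
    }

    /**
     * Tries to send one message, retrying on failure with a doubling pause.
     * Returns the number of attempts used; rethrows the last failure once
     * maxAttempts is exhausted.
     */
    public static <T> int sendWithRetry(Sender<T> sender, T message,
                                        int maxAttempts, long initialBackoffMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sender.send(message);
                return attempt;
            } catch (InterruptedException e) {
                // Do not retry when the thread is asked to stop.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                System.err.println("send failed (attempt " + attempt + "): " + e);
            }
            if (attempt < maxAttempts) {
                Thread.sleep(backoffMs);
                backoffMs = Math.min(backoffMs * 2, 30_000L); // arbitrary illustrative cap
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulated outage: the first two sends fail, the third succeeds.
        AtomicInteger calls = new AtomicInteger();
        Sender<String> flaky = msg -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("broker unavailable");
            }
        };
        int attempts = sendWithRetry(flaky, "0", 5, 10L);
        System.out.println("delivered after " + attempts + " attempts");
    }
}
```

In the reproducer this would amount to calling something like `sendWithRetry(producer::send, i + "", ...)` inside the loop, so that a failure during the broker outage delays one message instead of terminating the program. Whether this matches the behavior reported here would need to be confirmed against the `send fail:` output of the original run.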
   
   **Expected behavior**
   After the brokers are restarted, the producer should reconnect and continue sending messages.
   

