Oh, one other thought: it could have been a client of another process
holding that port, say HDFS.  You could use the Linux command lsof to
verify whether that is the problem.  If it's a normal occurrence you
could use random ports on the shard servers, so they always find an
open port when they start.
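
For example, assuming the port in question is the 40020 shown in the
log below (substitute whatever port your shard server binds), something
like this should show which process, if any, still owns it:

    # list any process still holding TCP port 40020
    # (run as root to also see sockets owned by other users)
    lsof -i TCP:40020

    # check for lingering sockets (e.g. TIME_WAIT) on that port
    netstat -an | grep 40020

If lsof shows nothing but netstat still reports the port in TIME_WAIT,
the kernel simply hasn't released the socket yet and it should clear on
its own shortly.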

On Tue, Apr 26, 2016 at 8:21 AM, Aaron McCurry <[email protected]> wrote:

> I'm guessing that the process was in an OOM state and, when you killed
> it, the proc turned into a zombie process for a moment.  I have seen
> that happen with Java processes before.  There's not a lot that can be
> done except wait for the process to be terminated.
>
> Anyone else with other thoughts?
>
> On Tue, Apr 26, 2016 at 2:50 AM, Ravikumar Govindarajan <
> [email protected]> wrote:
>
>> A shard-server was heavily loaded yesterday & ultimately crashed with an
>> OOM.
>>
>> I tried to restart the shard-server but it quit with the following error
>>
>> INFO  20160425_05:10:02:556_PDT [main] thrift.ThriftBlurShardServer:
>> Setting up Shard Server
>> INFO  20160425_05:10:02:581_PDT [main] thrift.ThriftServer: ulimit:
>> core file size          (blocks, -c) unlimited
>> INFO  20160425_05:10:02:581_PDT [main] thrift.ThriftServer: ulimit:
>> data seg size           (kbytes, -d) unlimited
>> INFO  20160425_05:10:02:581_PDT [main] thrift.ThriftServer: ulimit:
>> scheduling priority             (-e) 0
>> INFO  20160425_05:10:02:581_PDT [main] thrift.ThriftServer: ulimit:
>> file size               (blocks, -f) unlimited
>> INFO  20160425_05:10:02:581_PDT [main] thrift.ThriftServer: ulimit:
>> pending signals                 (-i) 1031474
>> INFO  20160425_05:10:02:582_PDT [main] thrift.ThriftServer: ulimit:
>> max locked memory       (kbytes, -l) 64
>> INFO  20160425_05:10:02:582_PDT [main] thrift.ThriftServer: ulimit:
>> max memory size         (kbytes, -m) unlimited
>> INFO  20160425_05:10:02:582_PDT [main] thrift.ThriftServer: ulimit:
>> open files                      (-n) 65536
>> INFO  20160425_05:10:02:582_PDT [main] thrift.ThriftServer: ulimit:
>> pipe size            (512 bytes, -p) 8
>> INFO  20160425_05:10:02:582_PDT [main] thrift.ThriftServer: ulimit:
>> POSIX message queues     (bytes, -q) 819200
>> INFO  20160425_05:10:02:582_PDT [main] thrift.ThriftServer: ulimit:
>> real-time priority              (-r) 0
>> INFO  20160425_05:10:02:582_PDT [main] thrift.ThriftServer: ulimit:
>> stack size              (kbytes, -s) 10240
>> INFO  20160425_05:10:02:582_PDT [main] thrift.ThriftServer: ulimit:
>> cpu time               (seconds, -t) unlimited
>> INFO  20160425_05:10:02:582_PDT [main] thrift.ThriftServer: ulimit:
>> max user processes              (-u) 3047
>> INFO  20160425_05:10:02:582_PDT [main] thrift.ThriftServer: ulimit:
>> virtual memory          (kbytes, -v) unlimited
>> INFO  20160425_05:10:02:582_PDT [main] thrift.ThriftServer: ulimit:
>> file locks                      (-x) unlimited
>> INFO  20160425_05:10:02:686_PDT [main] utils.GCWatcherJdk7:
>> GCWatcherJdk7 was setup.
>> ERROR 20160425_05:10:02:719_PDT [main]
>> concurrent.SimpleUncaughtExceptionHandler: Unknown error in thread
>> [Thread[main,5,main]]
>> org.apache.blur.thirdparty.thrift_0_9_0.transport.TTransportException:
>> Could not create ServerSocket on address /0.0.0.0:40020.
>>         at
>> org.apache.blur.thirdparty.thrift_0_9_0.transport.TNonblockingServerSocket.<init>(TNonblockingServerSocket.java:91)
>>         at
>> org.apache.blur.thirdparty.thrift_0_9_0.transport.TNonblockingServerSocket.<init>(TNonblockingServerSocket.java:73)
>>         at
>> org.apache.blur.thrift.ThriftServer.getTNonblockingServerSocket(ThriftServer.java:246)
>>         at
>> org.apache.blur.thrift.ThriftBlurShardServer.createServer(ThriftBlurShardServer.java:155)
>>         at
>> org.apache.blur.thrift.ThriftBlurShardServer.main(ThriftBlurShardServer.java:139)
>>
>> The shard-server process was killed but the Thrift port seemed to be
>> hanging on (guess it was in TIME_WAIT) & not released.
>>
>> I also saw HttpJettyServer reporting the same issue when I attempted a
>> restart for the second time...
>>
>> *Caused by: java.net.BindException: Address already in use*
>> at java.net.PlainSocketImpl.socketBind(Native Method)
>> at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376)
>> at java.net.ServerSocket.bind(ServerSocket.java:376)
>> at java.net.ServerSocket.&lt;init&gt;(ServerSocket.java:237)
>> at java.net.ServerSocket.&lt;init&gt;(ServerSocket.java:181)
>> at
>>
>> org.mortbay.jetty.bio.SocketConnector.newServerSocket(SocketConnector.java:80)
>> at org.mortbay.jetty.bio.SocketConnector.open(SocketConnector.java:73)
>> at org.mortbay.jetty.AbstractConnector.doStart(AbstractConnector.java:283)
>> at org.mortbay.jetty.bio.SocketConnector.doStart(SocketConnector.java:147)
>> at
>> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>> at org.mortbay.jetty.Server.doStart(Server.java:235)
>> at
>> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>> at org.apache.blur.gui.HttpJettyServer.&lt;init&gt;(HttpJettyServer.java:93)
>>
>> Under heavy load, will ports be hanging around even after the process
>> gets killed?
>>
>> Any known work-arounds for this?
>>
>> --
>> Ravi
>>
>
>
