Startup failure (Core dump) in Solaris 11 + JDK 1.8.0
Hi all,

I'm trying to install Cassandra 2.1.2 on Solaris 11, but I'm getting a core dump at startup. Any help is appreciated, since I can't change the operating system...

*My setup is:*
- Solaris 11
- JDK build 1.8.0_25-b17

*The error:*

appserver02:/opt/apache-cassandra-2.1.2/bin$ ./cassandra
appserver02:/opt/apache-cassandra-2.1.2/bin$
CompilerOracle: inline org/apache/cassandra/db/AbstractNativeCell.compareTo (Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline org/apache/cassandra/db/composites/AbstractSimpleCellNameType.compareUnsigned (Lorg/apache/cassandra/db/composites/Composite;Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare (Ljava/nio/ByteBuffer;[B)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare ([BLjava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compareUnsigned (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/lang/Object;JI)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
INFO  14:08:07 Hostname: appserver02.local
INFO  14:08:07 Loading settings from file:/opt/apache-cassandra-2.1.2/conf/cassandra.yaml
INFO  14:08:08 Node configuration:[authenticator=AllowAllAuthenticator; authorizer=AllowAllAuthorizer; auto_snapshot=true; batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; cas_contention_timeout_in_ms=1000; client_encryption_options=; cluster_name=Test Cluster; column_index_size_in_kb=64; commit_failure_policy=stop; commitlog_segment_size_in_mb=32; commitlog_sync=periodic; commitlog_sync_period_in_ms=1; compaction_throughput_mb_per_sec=16; concurrent_counter_writes=32; concurrent_reads=32; concurrent_writes=32; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=5000; cross_node_timeout=false; disk_failure_policy=stop; dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=60; dynamic_snitch_update_interval_in_ms=100; endpoint_snitch=SimpleSnitch; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; incremental_backups=false; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; inter_dc_tcp_nodelay=false; internode_compression=all; key_cache_save_period=14400; key_cache_size_in_mb=null; listen_address=localhost; max_hint_window_in_ms=1080; max_hints_delivery_threads=2; memtable_allocation_type=heap_buffers; native_transport_port=9042; num_tokens=256; partitioner=org.apache.cassandra.dht.Murmur3Partitioner; permissions_validity_in_ms=2000; range_request_timeout_in_ms=1; read_request_timeout_in_ms=5000; request_scheduler=org.apache.cassandra.scheduler.NoScheduler; request_timeout_in_ms=1; row_cache_save_period=0; row_cache_size_in_mb=0; rpc_address=localhost; rpc_keepalive=true; rpc_port=9160; rpc_server_type=sync; seed_provider=[{class_name=org.apache.cassandra.locator.SimpleSeedProvider, parameters=[{seeds=127.0.0.1}]}]; server_encryption_options=; snapshot_before_compaction=false; ssl_storage_port=7001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; start_rpc=true; storage_port=7000; thrift_framed_transport_size_in_mb=15; tombstone_failure_threshold=10; tombstone_warn_threshold=1000; trickle_fsync=false; trickle_fsync_interval_in_kb=10240; truncate_request_timeout_in_ms=6; write_request_timeout_in_ms=2000]
INFO  14:08:09 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
INFO  14:08:09 Global memtable on-heap threshold is enabled at 1004MB
INFO  14:08:09 Global memtable off-heap threshold is enabled at 1004MB
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0xa) at pc=0x7cc5f100, pid=823, tid=2
#
# JRE version: Java(TM) SE Runtime Environment (8.0_25-b17) (build 1.8.0_25-b17)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.25-b02 mixed mode solaris-sparc compressed oops)
# Problematic frame:
# V  [libjvm.so+0xd5f100]  Unsafe_GetInt+0x174
#
# Core dump written. Default location: /opt/apache-cassandra-2.1.2/bin/core or core.823
#
# An error report file with more information is saved as:
# /opt/apache-cassandra-2.1.2/bin/hs_err_pid823.log
Re: Startup failure (Core dump) in Solaris 11 + JDK 1.8.0
Hi,

Yes, with JDK 1.7 it works, but only in 32-bit mode. The problem seems to be with the 64-bit versions of both JDK 8 and JDK 7; I didn't try other, older versions. Unfortunately, with 32 bits I'm more limited in the memory I can make available to the JVM...

Looking around the web, others have been complaining about the same problem for a while, but so far I haven't found a solution. Interestingly, many redirect the problem to the JVM (on Solaris). I don't think waiting for a JVM update that may or may not resolve this is a solution. As a kind of request :-) it would be great if some change in the Cassandra source code could resolve this.

On 01/12/2015 04:05 PM, Asit KAUSHIK wrote:
> Probably a bad answer, but I was able to run on JDK 1.7. So if possible, downgrade your JDK version and try. I hit the block on Red Hat Enterprise...
>
> On Jan 12, 2015 9:31 PM, "Bernardino Mota" <mailto:bernardino.m...@inovaworks.com> wrote:
>> Hi all, I'm trying to install Cassandra 2.1.2 on Solaris 11 but I'm getting a core dump at startup. Any help is appreciated, since I can't change the operating system...
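If downgrading to a 32-bit VM is the only thing that works, one way to sketch that without changing scripts everywhere is `conf/cassandra-env.sh`. This is only a workaround sketch, assuming a Solaris JDK that still ships both data models (selectable with `-d32`/`-d64`; verify with `java -d32 -version` first); the heap values are illustrative, not tuned:

```shell
# conf/cassandra-env.sh -- workaround sketch, not a confirmed fix:
# force the 32-bit HotSpot VM (the SIGBUS above is in the 64-bit VM's
# Unsafe_GetInt) and size the heap within 32-bit limits.
JVM_OPTS="$JVM_OPTS -d32"
MAX_HEAP_SIZE="2G"      # the 32-bit VM caps the usable heap
HEAP_NEWSIZE="400M"
```

`MAX_HEAP_SIZE` and `HEAP_NEWSIZE` are the standard knobs cassandra-env.sh already reads in 2.1; only the `-d32` flag is added here.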
Nodes fail to reconnect after several hours of network failure.
Using Cassandra 2.2.4 on Ubuntu.

We have a cluster with two nodes that, for several hours, failed to connect with each other due to network problems. The database continued to be used on one of the nodes, with writes being stored in the hints file as expected.

But now that the network is OK again and the machines can communicate, each node reports the other as DOWN and does not replicate.

When the network came back up we started to see in the log files:

Convicting /192.168.1.102 with status NORMAL - alive false

It seems each node convicts the other and then fails to reconnect. Is there some configuration we might be missing? Any help would be much appreciated.

- NODE 192.168.1.10 - "nodetool status"

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
DN  192.168.1.102  12.02 MB  256     ?     ff906882-8224-40ac-8cdb-98f5e725814d  rack1

Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
UN  192.168.1.10   41.87 MB  256     ?     51650afd-84dd-4e25-a6f0-13627858d5dc  rack1

- NODE 192.168.1.102 - "nodetool status"

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
UN  192.168.1.102  12.4 MB   256     ?     ff906882-8224-40ac-8cdb-98f5e725814d  rack1

Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
DN  192.168.1.10   26.31 MB  256     ?     51650afd-84dd-4e25-a6f0-13627858d5dc  rack1
Re: Nodes fail to reconnect after several hours of network failure.
Nothing strange in the logs, and "nodetool gossipinfo" seems OK:

./nodetool gossipinfo
/192.168.1.10
  generation:1453316804
  heartbeat:206518
  STATUS:18:NORMAL,-1003341236369672970
  LOAD:206420:4.3533596E7
  SCHEMA:14:6f97097b-45ce-3479-8b2f-af2fef4967e7
  DC:8:DC2
  RACK:10:rack1
  RELEASE_VERSION:4:2.2.4
  INTERNAL_IP:6:192.168.1.10
  RPC_ADDRESS:3:127.0.0.1
  SEVERITY:206517:0.0
  NET_VERSION:1:9
  HOST_ID:2:51650afd-84dd-4e25-a6f0-13627858d5dc
  RPC_READY:49:true
  TOKENS:17:
/192.168.1.102
  generation:1453316986
  heartbeat:84622
  STATUS:28:NORMAL,-1085177681742913545
  LOAD:84535:1.2606418E7
  SCHEMA:14:6f97097b-45ce-3479-8b2f-af2fef4967e7
  DC:8:DC1
  RACK:10:rack1
  RELEASE_VERSION:4:2.2.4
  INTERNAL_IP:6:10.0.2.10
  RPC_ADDRESS:3:127.0.0.1
  SEVERITY:84624:0.0
  NET_VERSION:1:9
  HOST_ID:2:ff906882-8224-40ac-8cdb-98f5e725814d
  RPC_READY:98:true
  TOKENS:27:

> On 21 Jan 2016, at 13:17, Adil wrote:
>
> Hi,
> do you see any message related to gossip info?
>
> 2016-01-21 14:09 GMT+01:00 Bernardino Mota <mailto:bernardino.m...@knowledgeworks.pt>:
>> Using Cassandra 2.2.4 on Ubuntu.
>>
>> We have a cluster with two nodes that during several hours failed to connect with each other due to network problems. [full post and nodetool output quoted above]
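Not an authoritative fix, but two things worth trying in this situation. First, bounce gossip on each node ("nodetool disablegossip" followed by "nodetool enablegossip", then recheck "nodetool status"). Second, when links stay flaky for hours, the phi accrual failure detector in cassandra.yaml can be made slower to convict peers; the value below is only the commonly suggested range, not a setting tested on this cluster:

```
# cassandra.yaml -- sketch, assuming the stock 2.2 failure detector:
# the default phi_convict_threshold is 8; raising it (8-12 is the
# usual advice) makes nodes slower to mark flapping peers as DOWN.
phi_convict_threshold: 12
```

As far as I know this setting is only read at startup, so it takes a node restart to have any effect.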