Startup failure (Core dump) in Solaris 11 + JDK 1.8.0

2015-01-12 Thread Bernardino Mota

Hi all,

I'm trying to install Cassandra 2.1.2 on Solaris 11, but I'm getting a 
core dump at startup.


Any help is appreciated, since I can't change the operating system...

*My setup is:*
- Solaris 11
- JDK build 1.8.0_25-b17


*The error:*

appserver02:/opt/apache-cassandra-2.1.2/bin$ ./cassandra
appserver02:/opt/apache-cassandra-2.1.2/bin$ CompilerOracle: inline 
org/apache/cassandra/db/AbstractNativeCell.compareTo 
(Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline 
org/apache/cassandra/db/composites/AbstractSimpleCellNameType.compareUnsigned 
(Lorg/apache/cassandra/db/composites/Composite;Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare 
(Ljava/nio/ByteBuffer;[B)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare 
([BLjava/nio/ByteBuffer;)I
CompilerOracle: inline 
org/apache/cassandra/utils/ByteBufferUtil.compareUnsigned 
(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
CompilerOracle: inline 
org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo 
(Ljava/lang/Object;JILjava/lang/Object;JI)I
CompilerOracle: inline 
org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo 
(Ljava/lang/Object;JILjava/nio/ByteBuffer;)I
CompilerOracle: inline 
org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo 
(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I

INFO  14:08:07 Hostname: appserver02.local
INFO  14:08:07 Loading settings from 
file:/opt/apache-cassandra-2.1.2/conf/cassandra.yaml
INFO  14:08:08 Node configuration:[authenticator=AllowAllAuthenticator; 
authorizer=AllowAllAuthorizer; auto_snapshot=true; 
batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; 
cas_contention_timeout_in_ms=1000; client_encryption_options=; 
cluster_name=Test Cluster; column_index_size_in_kb=64; 
commit_failure_policy=stop; commitlog_segment_size_in_mb=32; 
commitlog_sync=periodic; commitlog_sync_period_in_ms=1; 
compaction_throughput_mb_per_sec=16; concurrent_counter_writes=32; 
concurrent_reads=32; concurrent_writes=32; 
counter_cache_save_period=7200; counter_cache_size_in_mb=null; 
counter_write_request_timeout_in_ms=5000; cross_node_timeout=false; 
disk_failure_policy=stop; dynamic_snitch_badness_threshold=0.1; 
dynamic_snitch_reset_interval_in_ms=60; 
dynamic_snitch_update_interval_in_ms=100; endpoint_snitch=SimpleSnitch; 
hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; 
incremental_backups=false; index_summary_capacity_in_mb=null; 
index_summary_resize_interval_in_minutes=60; inter_dc_tcp_nodelay=false; 
internode_compression=all; key_cache_save_period=14400; 
key_cache_size_in_mb=null; listen_address=localhost; 
max_hint_window_in_ms=1080; max_hints_delivery_threads=2; 
memtable_allocation_type=heap_buffers; native_transport_port=9042; 
num_tokens=256; partitioner=org.apache.cassandra.dht.Murmur3Partitioner; 
permissions_validity_in_ms=2000; range_request_timeout_in_ms=1; 
read_request_timeout_in_ms=5000; 
request_scheduler=org.apache.cassandra.scheduler.NoScheduler; 
request_timeout_in_ms=1; row_cache_save_period=0; 
row_cache_size_in_mb=0; rpc_address=localhost; rpc_keepalive=true; 
rpc_port=9160; rpc_server_type=sync; 
seed_provider=[{class_name=org.apache.cassandra.locator.SimpleSeedProvider, 
parameters=[{seeds=127.0.0.1}]}]; server_encryption_options=; 
snapshot_before_compaction=false; ssl_storage_port=7001; 
sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; 
start_rpc=true; storage_port=7000; 
thrift_framed_transport_size_in_mb=15; 
tombstone_failure_threshold=10; tombstone_warn_threshold=1000; 
trickle_fsync=false; trickle_fsync_interval_in_kb=10240; 
truncate_request_timeout_in_ms=6; write_request_timeout_in_ms=2000]
INFO  14:08:09 DiskAccessMode 'auto' determined to be mmap, 
indexAccessMode is mmap

INFO  14:08:09 Global memtable on-heap threshold is enabled at 1004MB
INFO  14:08:09 Global memtable off-heap threshold is enabled at 1004MB
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0xa) at pc=0x7cc5f100, pid=823, tid=2
#
# JRE version: Java(TM) SE Runtime Environment (8.0_25-b17) (build 1.8.0_25-b17)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.25-b02 mixed mode solaris-sparc compressed oops)
# Problematic frame:
# V  [libjvm.so+0xd5f100]  Unsafe_GetInt+0x174
#
# Core dump written. Default location: /opt/apache-cassandra-2.1.2/bin/core or core.823
#
# An error report file with more information is saved as:
# /opt/apache-cassandra-2.1.2/bin/hs_err_pid823.log
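For anyone triaging the same crash: a SIGBUS in Unsafe_GetInt on
solaris-sparc usually points to an unaligned memory access, since SPARC,
unlike x86, faults on unaligned loads. The faulting frame is recorded in
the hs_err file named above, and on Solaris it's worth confirming which
data model the JVM is actually running in. A quick check, using standard
Solaris/JDK tools and the paths from the log output above:

    # Pull the faulting frame out of the crash report
    grep -n "Problematic frame" /opt/apache-cassandra-2.1.2/bin/hs_err_pid823.log

    # Solaris: bitness of the native instruction set (64 on this box)
    isainfo -b

    # Does this JDK also ship a 32-bit VM? (-d32 selects it on Solaris)
    java -d32 -version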



Re: Startup failure (Core dump) in Solaris 11 + JDK 1.8.0

2015-01-13 Thread Bernardino Mota

Hi,

Yes, with JDK 1.7 it works, but only in 32-bit mode. The problem seems to 
be with the 64-bit versions of both JDK 8 and JDK 7. I didn't try older 
versions.


Unfortunately, with 32 bits I'm limited in how much memory I can make 
available to the JVM...
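
If it helps anyone else stuck on the 32-bit workaround, this is roughly
the change, sketched against the stock conf/cassandra-env.sh (the heap
numbers are just what fits under a 32-bit address space, not a
recommendation):

    # conf/cassandra-env.sh -- force the 32-bit HotSpot VM on Solaris
    MAX_HEAP_SIZE="2G"    # must stay well below the ~4GB 32-bit ceiling
    HEAP_NEWSIZE="400M"   # young generation; set together with MAX_HEAP_SIZE

    # -d32 selects the 32-bit VM when both data models are installed
    JVM_OPTS="$JVM_OPTS -d32"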


Searching the Web, others have been reporting the same problem for a 
while, but so far I haven't found a solution.


Interestingly, many redirect the problem to the JVM (on Solaris). I think 
waiting for a possible JVM update that may or may not fix this is not a 
solution.
As a kind of request :-) it would be great if some change in the 
Cassandra source code could work around this.

On 01/12/2015 04:05 PM, Asit KAUSHIK wrote:


Probably a bad answer, but I was able to run it on JDK 1.7. So if 
possible, downgrade your JDK version and try. I hit the same block on 
Red Hat Enterprise...
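
On Solaris, the simplest way to try that is to point JAVA_HOME at the
older JDK before launching, since bin/cassandra picks it up. A sketch
(the JDK 7 install path below is only an example; adjust to your machine):

    # Use a JDK 7 install instead of the default JDK 8
    export JAVA_HOME=/usr/jdk/instances/jdk1.7.0   # example path
    export PATH="$JAVA_HOME/bin:$PATH"
    java -version      # confirm 1.7.x is now selected
    ./cassandra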


On Jan 12, 2015 9:31 PM, "Bernardino Mota" 
<bernardino.m...@inovaworks.com> wrote:

    [quoted original message clipped; the full text appears above]


Nodes fail to reconnect after several hours of network failure.

2016-01-21 Thread Bernardino Mota
Using Cassandra 2.2.4 on Ubuntu.

We have a two-node cluster whose nodes were unable to reach each other for 
several hours due to network problems. The database continued to be used on 
one of the nodes, with writes being stored as hints, as expected.

But now that the network is OK again and the machines can reach each other, 
each node still marks the other as DOWN and does not replicate.

When the network came back up, we started to see "Convicting 
/192.168.1.102 with status NORMAL - alive false" in the log files.

It seems the nodes convict each other and then fail to reconnect.

Is there some configuration that we might be missing? Any help would be much 
appreciated.
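
A sequence worth trying on each node once the link is back, using standard
nodetool commands; note that hints are only retained for
max_hint_window_in_ms (3 hours by default), so an outage longer than that
needs a repair regardless:

    # Force the node to drop and re-announce its gossip state
    nodetool disablegossip
    nodetool enablegossip

    # Verify both nodes now agree on membership and schema
    nodetool status
    nodetool describecluster

    # The outage exceeded the hint window, so sync the missed writes
    nodetool repair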

 

- NODE 192.168.1.10 - "nodetool status"

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
DN  192.168.1.102  12.02 MB  256     ?     ff906882-8224-40ac-8cdb-98f5e725814d  rack1
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
UN  192.168.1.10   41.87 MB  256     ?     51650afd-84dd-4e25-a6f0-13627858d5dc  rack1



- NODE 192.168.1.102 - "nodetool status"

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
UN  192.168.1.102  12.4 MB   256     ?     ff906882-8224-40ac-8cdb-98f5e725814d  rack1
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
DN  192.168.1.10   26.31 MB  256     ?     51650afd-84dd-4e25-a6f0-13627858d5dc  rack1




Re: Nodes fail to reconnect after several hours of network failure.

2016-01-21 Thread Bernardino Mota
Nothing strange in the logs, and "nodetool gossipinfo" seems OK:

 ./nodetool gossipinfo
/192.168.1.10
  generation:1453316804
  heartbeat:206518
  STATUS:18:NORMAL,-1003341236369672970
  LOAD:206420:4.3533596E7
  SCHEMA:14:6f97097b-45ce-3479-8b2f-af2fef4967e7
  DC:8:DC2
  RACK:10:rack1
  RELEASE_VERSION:4:2.2.4
  INTERNAL_IP:6:192.168.1.10
  RPC_ADDRESS:3:127.0.0.1
  SEVERITY:206517:0.0
  NET_VERSION:1:9
  HOST_ID:2:51650afd-84dd-4e25-a6f0-13627858d5dc
  RPC_READY:49:true
  TOKENS:17:
/192.168.1.102
  generation:1453316986
  heartbeat:84622
  STATUS:28:NORMAL,-1085177681742913545
  LOAD:84535:1.2606418E7
  SCHEMA:14:6f97097b-45ce-3479-8b2f-af2fef4967e7
  DC:8:DC1
  RACK:10:rack1
  RELEASE_VERSION:4:2.2.4
  INTERNAL_IP:6:10.0.2.10
  RPC_ADDRESS:3:127.0.0.1
  SEVERITY:84624:0.0
  NET_VERSION:1:9
  HOST_ID:2:ff906882-8224-40ac-8cdb-98f5e725814d
  RPC_READY:98:true
  TOKENS:27:
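
Since gossipinfo looks sane, the other thing worth grepping is whether the
nodes are even attempting to re-handshake after the partition; convictions
come from the failure detector, while reconnects are logged by the
internode connection code. A sketch (the log path assumes a default
package install; adjust for a tarball):

    # Default package-install log location
    LOG=/var/log/cassandra/system.log

    # Conviction events from the failure detector (the message quoted earlier)
    grep "Convicting" "$LOG"

    # Reconnect attempts after the partition healed
    grep "Handshaking version with" "$LOG"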

> On 21 Jan 2016, at 13:17, Adil wrote:
> 
> Hi,
> do you see any message related to gossip info?
> 
> 
> 2016-01-21 14:09 GMT+01:00 Bernardino Mota <bernardino.m...@knowledgeworks.pt>:
> [quoted original message clipped; the full text appears above]