Hi,

I am having a very strange problem with the Cassandra bulk loader and would
appreciate any explanation.

I am using a local Cassandra 2.0.5 server with default settings.
1. I created a table A and loaded 108 rows into it using a Hadoop program
with "org.apache.cassandra.hadoop.BulkOutputFormat" (a rough sketch of the
job setup is included after these steps).
2. I run "truncate A" to remove all the records in cqlsh. Now 0 row is
returned when run "select * from A".
3. I used the same Hadoop program to load only the first 12 rows into A.
4. Run "select * from A". Now all 108 rows are back.
5. I stopped the Cassandra server by pressing ^C, removed all files in
/var/log/cassandra, and started the server again using "./cassandra -f".
6. I repeated steps 3-4. All 108 rows are back again.
7. In cqlsh, I ran "delete from A where a='100'". Then I used the same
program to load the first 12 rows into A. This time, the rows with a='100'
never appear when I run "select * from A".
8. The rows with a='100' reappear after I truncate the table and repeat
steps 3-4. Again, all 108 rows are back.
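
For reference, the Hadoop job is configured roughly along these lines. The
class names, keyspace name, and the tab-separated input handling below are
placeholders, not my exact code; only the BulkOutputFormat/ConfigHelper setup
matches what I actually use:

import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.List;

import org.apache.cassandra.hadoop.BulkOutputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.Mutation;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;

public class BulkLoadSketch {

    // The reducer emits (row key, list of thrift mutations). With COMPACT
    // STORAGE and PRIMARY KEY (a, b), the thrift column name is the value of
    // b and the column value is the "value" column.
    public static class LoadReducer extends Reducer<Text, Text, ByteBuffer, List<Mutation>> {
        @Override
        protected void reduce(Text a, Iterable<Text> rows, Context context)
                throws java.io.IOException, InterruptedException {
            for (Text row : rows) {
                // placeholder: assume each value is "b<TAB>value"
                String[] parts = row.toString().split("\t", 2);
                Column col = new Column();
                col.setName(ByteBufferUtil.bytes(parts[0]));
                col.setValue(ByteBufferUtil.bytes(parts[1]));
                col.setTimestamp(System.currentTimeMillis() * 1000); // microseconds
                Mutation m = new Mutation();
                m.setColumn_or_supercolumn(new ColumnOrSuperColumn().setColumn(col));
                context.write(ByteBufferUtil.bytes(a.toString()),
                              Collections.singletonList(m));
            }
        }
    }

    // Point the output format at the local node and the target table.
    public static void configure(Job job) {
        Configuration conf = job.getConfiguration();
        job.setOutputFormatClass(BulkOutputFormat.class);
        ConfigHelper.setOutputColumnFamily(conf, "mykeyspace", "A");
        ConfigHelper.setOutputInitialAddress(conf, "127.0.0.1");
        ConfigHelper.setOutputRpcPort(conf, "9160");
        ConfigHelper.setOutputPartitioner(conf, "org.apache.cassandra.dht.Murmur3Partitioner");
    }
}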

There are too many strange things here; each one seems inexplicable.

My local Cassandra server and the Hadoop program both run on a Mac.

The table A is defined as:
CREATE TABLE A (
  a text,
  b text,
  value text,
  PRIMARY KEY (a, b)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

Thanks,
Huiliang
