[jira] [Commented] (CASSANDRA-4174) Unnecessary compaction happens when streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257852#comment-13257852 ] Yuki Morishita commented on CASSANDRA-4174: --- bq. starting compaction as soon as I have one sstable to work on might smooth out the workload more. The current version of Cassandra adds sstables and submits compaction when it has finished streaming all files, not when it has finished streaming just one file. On my laptop, I bulk loaded 72 sstables into an empty, single-node Cassandra and triggered compaction 9 times without the patch, in contrast to 3 times with the patch applied. Unnecessary compaction happens when streaming - Key: CASSANDRA-4174 URL: https://issues.apache.org/jira/browse/CASSANDRA-4174 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.0.0 Reporter: Yuki Morishita Assignee: Yuki Morishita Priority: Minor Fix For: 1.0.10 Attachments: 4174-1.0.txt When a streaming session finishes, streamed sstables are added to the CFS one by one using ColumnFamilyStore#addSSTable (https://github.com/apache/cassandra/blob/cassandra-1.0.9/src/java/org/apache/cassandra/streaming/StreamInSession.java#L141). This method submits compaction in the background (https://github.com/apache/cassandra/blob/cassandra-1.0.9/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L946), and we end up with unnecessary compaction tasks behind. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
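The batching described in the comment can be sketched as follows. This is a minimal, hypothetical Java sketch, not Cassandra's actual StreamInSession API: streamed sstables are collected during the session and a single background compaction is submitted when the session closes, rather than one per addSSTable call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: batch streamed sstables and submit one background
// compaction per session instead of one per received file. Class and
// method names are illustrative, not Cassandra's streaming API.
public class StreamInSessionSketch {
    private final List<String> received = new ArrayList<>();
    private int compactionSubmissions = 0;

    // Called once per streamed file: just remember it, no compaction yet.
    public void fileReceived(String sstablePath) {
        received.add(sstablePath);
    }

    // Called once when the whole session finishes: the batch would be added
    // to the ColumnFamilyStore here, followed by a single compaction check.
    public void closeSession() {
        if (!received.isEmpty())
            compactionSubmissions++; // one submission for the whole batch
    }

    public int receivedCount() { return received.size(); }
    public int submissions() { return compactionSubmissions; }
}
```

With 72 streamed files, the per-file scheme would submit 72 compaction checks; the batched sketch submits one per session.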
[jira] [Commented] (CASSANDRA-4146) sstableloader should detect and report failures
[ https://issues.apache.org/jira/browse/CASSANDRA-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254799#comment-13254799 ] Yuki Morishita commented on CASSANDRA-4146: --- +1 sstableloader should detect and report failures --- Key: CASSANDRA-4146 URL: https://issues.apache.org/jira/browse/CASSANDRA-4146 Project: Cassandra Issue Type: Improvement Components: Tools Affects Versions: 1.0.9 Reporter: Manish Zope Assignee: Brandon Williams Priority: Minor Labels: sstableloader, tools Fix For: 1.1.1 Attachments: 4146.txt Original Estimate: 48h Remaining Estimate: 48h There are three cases where we have observed abnormal termination: 1) An exception occurs while loading. 2) The user terminates the loading process. 3) Some node is down or unreachable, in which case sstableloader gets stuck and the user has to terminate the process. In case of abnormal termination, the sstables added in that session remain as they are on the cluster. If the user fixes the problem and starts the process all over again, the data is duplicated until a major compaction is triggered. sstableloader could maintain a session while loading the sstables into the cluster, so that on abnormal termination it triggers an event that deletes the sstables loaded in that session. It would also be great to have a configurable timeout, so that if the sstableloader process gets stuck for longer than the timeout, it can terminate itself.
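The configurable timeout requested above could look something like this hypothetical sketch (the class name and the progress-tracking scheme are assumptions, not part of any patch here): the loader records the time of the last streaming progress and declares itself stuck once no progress has been seen for longer than the timeout.

```java
// Hypothetical sketch of a configurable loader timeout: record the time of
// the last streaming progress and report "stuck" once no progress has been
// seen for longer than the timeout. Times are passed in explicitly so the
// logic is deterministic; a real loader would use the system clock.
public class LoaderTimeoutSketch {
    private final long timeoutMillis;
    private long lastProgressMillis;

    public LoaderTimeoutSketch(long timeoutMillis, long startMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastProgressMillis = startMillis;
    }

    // Called whenever bytes are streamed successfully.
    public void onProgress(long nowMillis) {
        lastProgressMillis = nowMillis;
    }

    // True once the session has made no progress for longer than the timeout.
    public boolean isStuck(long nowMillis) {
        return nowMillis - lastProgressMillis > timeoutMillis;
    }
}
```

A loader loop would call `isStuck` periodically and, when it returns true, abort the session and clean up the sstables loaded so far.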
[jira] [Commented] (CASSANDRA-4045) BOF fails when some nodes are down
[ https://issues.apache.org/jira/browse/CASSANDRA-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254861#comment-13254861 ] Yuki Morishita commented on CASSANDRA-4045: --- +1 BOF fails when some nodes are down -- Key: CASSANDRA-4045 URL: https://issues.apache.org/jira/browse/CASSANDRA-4045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Labels: hadoop Fix For: 1.1.1 Attachments: 4045.txt As the summary says, we should allow jobs to complete when some targets are unavailable.
[jira] [Commented] (CASSANDRA-4157) Allow KS + CF names up to 48 characters
[ https://issues.apache.org/jira/browse/CASSANDRA-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255318#comment-13255318 ] Yuki Morishita commented on CASSANDRA-4157: --- lgtm. Allow KS + CF names up to 48 characters --- Key: CASSANDRA-4157 URL: https://issues.apache.org/jira/browse/CASSANDRA-4157 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.1.0 Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Minor Fix For: 1.1.0 Attachments: 4157.txt CASSANDRA-2749 imposed a 32-character limit on KS and CF names. We can be a little more lenient than that and still be safe for path names (see CASSANDRA-4110).
[jira] [Commented] (CASSANDRA-3617) Clean up and optimize Message
[ https://issues.apache.org/jira/browse/CASSANDRA-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250928#comment-13250928 ] Yuki Morishita commented on CASSANDRA-3617: --- Updated patches are at https://github.com/yukim/cassandra/branches/3617-2. All unit tests/dtests pass. Clean up and optimize Message - Key: CASSANDRA-3617 URL: https://issues.apache.org/jira/browse/CASSANDRA-3617 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Yuki Morishita Fix For: 1.2 The Message class has grown largely by accretion, and it shows. There are several problems: - Outbound and inbound messages aren't really the same thing and should not be conflated - We pre-serialize message bodies to byte[], then copy those bytes onto the socket buffer, instead of just keeping a reference to the object being serialized and writing it out directly to the socket - MessagingService versioning is poorly encapsulated, scattering version variables and references to things like CachingMessageProducer across the codebase
[jira] [Commented] (CASSANDRA-4100) Make scrub and cleanup operations throttled
[ https://issues.apache.org/jira/browse/CASSANDRA-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251101#comment-13251101 ] Yuki Morishita commented on CASSANDRA-4100: --- v3 looks good to me, but as Sylvain said, I'm +1 on putting this in 1.1.1 instead of 1.0.10. Make scrub and cleanup operations throttled --- Key: CASSANDRA-4100 URL: https://issues.apache.org/jira/browse/CASSANDRA-4100 Project: Cassandra Issue Type: Bug Components: Core Reporter: Vijay Assignee: Vijay Priority: Minor Labels: compaction Fix For: 1.0.10 Attachments: 0001-CASSANDRA-4100-v2.patch, 0001-CASSANDRA-4100-v3.patch, 0001-CASSANDRA-4100.patch Looks like scrub and cleanup operations are not throttled; it would be nice to throttle them, else we are likely to run into IO issues while running them on a live cluster.
[jira] [Commented] (CASSANDRA-4100) Make scrub and cleanup operations throttled
[ https://issues.apache.org/jira/browse/CASSANDRA-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250188#comment-13250188 ] Yuki Morishita commented on CASSANDRA-4100: --- OK, so a static Throttle is fine here since one compaction_throughput_mb_per_sec is used for all compactions. Then, do we need to divide that by the number of active compactions? I'm referring to the code inside the implementation of ThroughputFunction: {code} totalBytesPerMS / Math.max(1, CompactionManager.instance.getActiveCompactions()); {code} Make scrub and cleanup operations throttled --- Key: CASSANDRA-4100 URL: https://issues.apache.org/jira/browse/CASSANDRA-4100 Project: Cassandra Issue Type: Bug Components: Core Reporter: Vijay Assignee: Vijay Priority: Minor Labels: compaction Fix For: 1.0.10 Attachments: 0001-CASSANDRA-4100.patch Looks like scrub and cleanup operations are not throttled; it would be nice to throttle them, else we are likely to run into IO issues while running them on a live cluster.
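The quoted ThroughputFunction expression divides the configured total throughput evenly across active compactions, with Math.max guarding against division by zero when none are running. A minimal standalone sketch of that arithmetic (the class name is illustrative, not Cassandra's):

```java
// Standalone sketch of the quoted throttle arithmetic: the configured total
// throughput budget is split evenly among active compactions; Math.max(1, n)
// prevents division by zero when no compaction is running.
public class ThrottleSketch {
    public static long bytesPerMsPerCompaction(long totalBytesPerMs, int activeCompactions) {
        return totalBytesPerMs / Math.max(1, activeCompactions);
    }
}
```

For example, with a total budget of 16,384 bytes/ms and 4 active compactions, each compaction is throttled to 4,096 bytes/ms; with none active, the full budget applies.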
[jira] [Commented] (CASSANDRA-3617) Clean up and optimize Message
[ https://issues.apache.org/jira/browse/CASSANDRA-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245695#comment-13245695 ] Yuki Morishita commented on CASSANDRA-3617: --- Still in progress, but I rebased, implemented serializedSize for all IVersionedSerializers, and added unit tests for those. Current work is at https://github.com/yukim/cassandra/branches/3617. CliTest and RemoveTest still fail, but all other unit tests, including SerializationsTests, pass. Clean up and optimize Message - Key: CASSANDRA-3617 URL: https://issues.apache.org/jira/browse/CASSANDRA-3617 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Yuki Morishita Fix For: 1.2 The Message class has grown largely by accretion, and it shows. There are several problems: - Outbound and inbound messages aren't really the same thing and should not be conflated - We pre-serialize message bodies to byte[], then copy those bytes onto the socket buffer, instead of just keeping a reference to the object being serialized and writing it out directly to the socket - MessagingService versioning is poorly encapsulated, scattering version variables and references to things like CachingMessageProducer across the codebase
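The serializedSize idea mentioned above, knowing a message's exact on-wire size without first serializing it to an intermediate byte[], can be illustrated with a hedged sketch. The Header type and its field layout are invented for illustration; Cassandra's IVersionedSerializer interface differs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hedged sketch of the serializedSize idea: compute a message's exact
// on-wire size up front so it could be written straight to a socket,
// instead of pre-serializing to a byte[] and copying. The Header type is
// invented for illustration only.
public class SizeAwareSerializer {
    public static class Header {
        final int id;
        final String verb; // assumed ASCII, so the size math below is exact
        public Header(int id, String verb) { this.id = id; this.verb = verb; }
    }

    public static void serialize(Header h, DataOutputStream out) throws IOException {
        out.writeInt(h.id);   // 4 bytes
        out.writeUTF(h.verb); // 2-byte length prefix + UTF-8 bytes
    }

    // Exact size without serializing: mirrors the writes above.
    public static long serializedSize(Header h) {
        return 4 + 2 + h.verb.getBytes(StandardCharsets.UTF_8).length;
    }

    // Helper for checking the size math against an actual serialization.
    public static byte[] toBytes(Header h) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            serialize(h, new DataOutputStream(bos));
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e); // in-memory streams should not fail
        }
    }
}
```

With the size known up front, the sender can write the message directly to the socket stream instead of materializing and copying a temporary byte[].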
[jira] [Commented] (CASSANDRA-4078) StackOverflowError when upgrading to 1.0.8 from 0.8.10
[ https://issues.apache.org/jira/browse/CASSANDRA-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242817#comment-13242817 ] Yuki Morishita commented on CASSANDRA-4078: --- I don't have a clue about the cause, but since the corrupted files are index column families, I think the workaround is to remove all the corrupted index sstables, upgrade C*, then rebuild the indexes using 'nodetool rebuild_index'. StackOverflowError when upgrading to 1.0.8 from 0.8.10 -- Key: CASSANDRA-4078 URL: https://issues.apache.org/jira/browse/CASSANDRA-4078 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.10 Environment: OS: Linux xps.openfin 2.6.35.13-91.fc14.i686 #1 SMP Tue May 3 13:36:36 UTC 2011 i686 i686 i386 GNU/Linux Java: JVM vendor/version: Java HotSpot(TM) Server VM/1.6.0_31 Reporter: Wenjun Assignee: paul cannon Priority: Critical Fix For: 0.8.10 Attachments: 4078.add-asserts.txt, cassandra.yaml.1.0.8, cassandra.yaml.8.10, keycheck.txt, system.log, system.log.0326, system.log.0326-02 Hello, I am trying to upgrade our 1-node setup from 0.8.10 to 1.0.8 and am seeing the following exceptions when starting up 1.0.8. We have been running 0.8.10 without any issues. Attached is the entire log file from the startup of 1.0.8. There are 2 exceptions: 1. StackOverflowError (line 2599) 2. InstanceAlreadyExistsException (line 3632) I tried running scrub under 0.8.10 first; it did not help. I also tried dropping the column family that caused the exception, but I just got the same exceptions from another column family. Thanks
[jira] [Commented] (CASSANDRA-3776) Streaming task hangs forever during repair after unexpected connection reset by peer
[ https://issues.apache.org/jira/browse/CASSANDRA-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240781#comment-13240781 ] Yuki Morishita commented on CASSANDRA-3776: --- I have not been able to reproduce this myself yet, but it should happen when FileStreamTask gets an Exception. I would like to fix this with CASSANDRA-4051, which is marked to be fixed in v1.1. Streaming task hangs forever during repair after unexpected connection reset by peer Key: CASSANDRA-3776 URL: https://issues.apache.org/jira/browse/CASSANDRA-3776 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.7 Environment: Windows Server 2008 R2 Sun Java 7u2 64bit Reporter: Viktor Jevdokimov Assignee: Yuki Morishita Priority: Minor Fix For: 1.0.9 During streaming (repair), a stream-receiving node threw exceptions: ERROR [Streaming:1] 2012-01-24 10:17:03,828 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Streaming:1,1,main] java.lang.RuntimeException: java.net.SocketException: Connection reset by peer: socket write error at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:689) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: java.net.SocketException: Connection reset by peer: socket write error at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(Unknown Source) at java.net.SocketOutputStream.write(Unknown Source) at com.ning.compress.lzf.LZFChunk.writeCompressedHeader(LZFChunk.java:77) at com.ning.compress.lzf.ChunkEncoder.encodeAndWriteChunk(ChunkEncoder.java:132) at com.ning.compress.lzf.LZFOutputStream.writeCompressedBlock(LZFOutputStream.java:203) at com.ning.compress.lzf.LZFOutputStream.write(LZFOutputStream.java:97) at 
org.apache.cassandra.streaming.FileStreamTask.write(FileStreamTask.java:181) at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:145) at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) ... 3 more ERROR [Streaming:1] 2012-01-24 10:17:03,891 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Streaming:1,1,main] java.lang.RuntimeException: java.net.SocketException: Connection reset by peer: socket write error at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:689) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: java.net.SocketException: Connection reset by peer: socket write error at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(Unknown Source) at java.net.SocketOutputStream.write(Unknown Source) at com.ning.compress.lzf.LZFChunk.writeCompressedHeader(LZFChunk.java:77) at com.ning.compress.lzf.ChunkEncoder.encodeAndWriteChunk(ChunkEncoder.java:132) at com.ning.compress.lzf.LZFOutputStream.writeCompressedBlock(LZFOutputStream.java:203) at com.ning.compress.lzf.LZFOutputStream.write(LZFOutputStream.java:97) at org.apache.cassandra.streaming.FileStreamTask.write(FileStreamTask.java:181) at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:145) at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) ... 3 more After which streaming hung forever. 
A few seconds later, the sending node had an exception (which may not be related): ERROR [Thread-17224] 2012-01-24 10:17:07,817 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-17224,5,main] java.lang.ArrayIndexOutOfBoundsException Other than that, the nodes behave normally, communicating with each other.
[jira] [Commented] (CASSANDRA-4051) Stream sessions can only fail via the FailureDetector
[ https://issues.apache.org/jira/browse/CASSANDRA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239614#comment-13239614 ] Yuki Morishita commented on CASSANDRA-4051: --- Since CASSANDRA-3216 added IEndpointStateChangeSubscriber and IFailureDetectionEventListner to StreamOutSession, we need to keep that functionality. I proposed a modified version of CASSANDRA-3112 (minus the retry-limiting part) on CASSANDRA-3817; I would like to rebase that patch and add retry, and will post it here soon. Stream sessions can only fail via the FailureDetector - Key: CASSANDRA-4051 URL: https://issues.apache.org/jira/browse/CASSANDRA-4051 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Labels: streaming Fix For: 1.1.0 Attachments: 4051.txt If for some reason FileStreamTask itself fails more than the number of retry attempts but gossip continues to work, the stream session will never be closed. This is unlikely to happen in practice since it requires blocking the storage port from new connections while keeping the existing ones; however, for the bulk loader this is especially problematic since it doesn't have access to a failure detector and thus has no way of knowing whether a session failed.
[jira] [Commented] (CASSANDRA-4087) Improve out-of-the-box cache settings
[ https://issues.apache.org/jira/browse/CASSANDRA-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239849#comment-13239849 ] Yuki Morishita commented on CASSANDRA-4087: --- +1 with a nit: I prefer auto.equalsIgnoreCase(conf.key_cache_size_in_mb) to avoid an NPE in case someone misconfigures the value. Improve out-of-the-box cache settings - Key: CASSANDRA-4087 URL: https://issues.apache.org/jira/browse/CASSANDRA-4087 Project: Cassandra Issue Type: Improvement Affects Versions: 1.1.0 Reporter: Jonathan Ellis Assignee: Pavel Yaskevich Fix For: 1.1.0 Attachments: CASSANDRA-4087.patch The default key cache of 2MB is significantly smaller than in <= 1.0 (200 rows per CF) and much smaller than most production uses. How about min(5% of the heap, 100MB)?
[jira] [Commented] (CASSANDRA-4087) Improve out-of-the-box cache settings
[ https://issues.apache.org/jira/browse/CASSANDRA-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239856#comment-13239856 ] Yuki Morishita commented on CASSANDRA-4087: --- You can (accidentally) override it by setting an empty value in cassandra.yaml. In that case, you get null. Improve out-of-the-box cache settings - Key: CASSANDRA-4087 URL: https://issues.apache.org/jira/browse/CASSANDRA-4087 Project: Cassandra Issue Type: Improvement Affects Versions: 1.1.0 Reporter: Jonathan Ellis Assignee: Pavel Yaskevich Fix For: 1.1.0 Attachments: CASSANDRA-4087.patch The default key cache of 2MB is significantly smaller than in <= 1.0 (200 rows per CF) and much smaller than most production uses. How about min(5% of the heap, 100MB)?
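The null-safe comparison suggested in the review comment relies on calling equalsIgnoreCase on the string literal, so a null configured value (e.g. an empty setting in cassandra.yaml) simply compares unequal instead of throwing. A minimal sketch (isAuto is a hypothetical helper, not code from the patch):

```java
// Sketch of the null-safe comparison from the review comment: calling
// equalsIgnoreCase on the "auto" literal means a null configured value
// compares unequal rather than throwing a NullPointerException, as
// configuredValue.equalsIgnoreCase("auto") would.
public class CacheSizeConfigSketch {
    public static boolean isAuto(String configuredValue) {
        return "auto".equalsIgnoreCase(configuredValue);
    }
}
```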
[jira] [Commented] (CASSANDRA-4080) Cut down on the comparisons needed during shouldPurge and needDeserialize
[ https://issues.apache.org/jira/browse/CASSANDRA-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238541#comment-13238541 ] Yuki Morishita commented on CASSANDRA-4080: --- +1 Cut down on the comparisons needed during shouldPurge and needDeserialize - Key: CASSANDRA-4080 URL: https://issues.apache.org/jira/browse/CASSANDRA-4080 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Minor Labels: compaction Fix For: 1.1.1 shouldPurge in particular is still a performance sore point with LCS.
[jira] [Commented] (CASSANDRA-4023) Improve BloomFilter deserialization performance
[ https://issues.apache.org/jira/browse/CASSANDRA-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238860#comment-13238860 ] Yuki Morishita commented on CASSANDRA-4023: --- The NPE only happens on trunk, and that is why I reopened this issue. The fix is attached as trunk-4023.txt. The v3 attached above is for the 1.0 branch and was already committed to 1.0 and 1.1 without any problem. Improve BloomFilter deserialization performance --- Key: CASSANDRA-4023 URL: https://issues.apache.org/jira/browse/CASSANDRA-4023 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.0.1 Reporter: Joaquin Casares Assignee: Yuki Morishita Priority: Minor Labels: datastax_qa Fix For: 1.0.9, 1.1.0 Attachments: 4023.txt, cassandra-1.0-4023-v2.txt, cassandra-1.0-4023-v3.txt, trunk-4023.txt With the same amount of data, startup time is 4x greater on a 1.0.7 cluster than on a 0.8.7 cluster. It seems as though 1.0.7 loads the BloomFilter by reading longs out one at a time in a multithreaded process, while 0.8.7 reads the entire object at once. Perhaps we should update the new BloomFilter to read in batch as well?
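The batching idea from the description, replacing many small per-long reads with one bulk read followed by in-memory decoding, can be illustrated with a hedged sketch. The names are invented for illustration; this is not Cassandra's BloomFilter serialization code.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

// Hedged sketch of the batching idea: instead of issuing one readLong()
// call per word, read the whole serialized region with a single readFully()
// and decode the longs from the in-memory buffer. Illustrative only.
public class BatchReadSketch {
    // One readLong() per word: many small reads against the stream.
    public static long[] readLongsOneByOne(DataInputStream in, int count) {
        try {
            long[] words = new long[count];
            for (int i = 0; i < count; i++)
                words[i] = in.readLong();
            return words;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Single bulk read, then decode from the buffer.
    public static long[] readLongsInBatch(DataInputStream in, int count) {
        try {
            byte[] raw = new byte[count * 8];
            in.readFully(raw);
            ByteBuffer buf = ByteBuffer.wrap(raw); // big-endian, matching DataInput
            long[] words = new long[count];
            for (int i = 0; i < count; i++)
                words[i] = buf.getLong();
            return words;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Both methods decode the same values; the batched version simply replaces per-word I/O calls with one bulk read, which is the performance difference the ticket describes.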
[jira] [Commented] (CASSANDRA-4023) Improve BloomFilter deserialization performance
[ https://issues.apache.org/jira/browse/CASSANDRA-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237231#comment-13237231 ] Yuki Morishita commented on CASSANDRA-4023: --- bq. Is this io-reducing something that could also apply to 1.0? It's for RowIndexEntry, which was introduced in v1.2 (CASSANDRA-2319), so it only relates to trunk. Improve BloomFilter deserialization performance --- Key: CASSANDRA-4023 URL: https://issues.apache.org/jira/browse/CASSANDRA-4023 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.0.1 Reporter: Joaquin Casares Assignee: Yuki Morishita Priority: Minor Labels: datastax_qa Fix For: 1.0.9, 1.1.0 Attachments: 0001-fix-loading-promoted-row-index.patch, 4023.txt, cassandra-1.0-4023-v2.txt, cassandra-1.0-4023-v3.txt With the same amount of data, startup time is 4x greater on a 1.0.7 cluster than on a 0.8.7 cluster. It seems as though 1.0.7 loads the BloomFilter by reading longs out one at a time in a multithreaded process, while 0.8.7 reads the entire object at once. Perhaps we should update the new BloomFilter to read in batch as well?
[jira] [Commented] (CASSANDRA-4022) Compaction of hints can get stuck in a loop
[ https://issues.apache.org/jira/browse/CASSANDRA-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230238#comment-13230238 ] Yuki Morishita commented on CASSANDRA-4022: --- I understand the situation, but isn't it covered by just checking key overlap? If there is no overlap, then tombstones in the target sstable are guaranteed to be the only and the newest ones? Compaction of hints can get stuck in a loop --- Key: CASSANDRA-4022 URL: https://issues.apache.org/jira/browse/CASSANDRA-4022 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.2 Reporter: Brandon Williams Assignee: Yuki Morishita Priority: Critical Fix For: 1.2 Attachments: 4022.txt Not exactly sure how I caused this as I was working on something else in trunk, but: {noformat} INFO 17:41:35,682 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-339-Data.db')] INFO 17:41:36,430 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-340-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 5.912220MB/s. Time: 748ms. INFO 17:41:36,431 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-340-Data.db')] INFO 17:41:37,238 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-341-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 5.479976MB/s. Time: 807ms. INFO 17:41:37,239 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-341-Data.db')] INFO 17:41:38,163 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-342-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 4.786083MB/s. Time: 924ms. 
INFO 17:41:38,164 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-342-Data.db')] INFO 17:41:39,014 GC for ParNew: 274 ms for 1 collections, 541261288 used; max is 1024458752 INFO 17:41:39,151 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-343-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 4.485132MB/s. Time: 986ms. INFO 17:41:39,151 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-343-Data.db')] INFO 17:41:40,016 GC for ParNew: 308 ms for 1 collections, 585582200 used; max is 1024458752 INFO 17:41:40,200 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-344-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 4.223821MB/s. Time: 1,047ms. INFO 17:41:40,201 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-344-Data.db')] INFO 17:41:41,017 GC for ParNew: 252 ms for 1 collections, 617877904 used; max is 1024458752 INFO 17:41:41,178 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-345-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 4.526449MB/s. Time: 977ms. INFO 17:41:41,179 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-345-Data.db')] INFO 17:41:41,885 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-346-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 6.263938MB/s. Time: 706ms. INFO 17:41:41,887 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-346-Data.db')] INFO 17:41:42,617 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-347-Data.db,]. 
4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 6.066311MB/s. Time: 729ms. INFO 17:41:42,618 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-347-Data.db')] INFO 17:41:43,376 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-348-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 5.834222MB/s. Time: 758ms. INFO 17:41:43,377 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-348-Data.db')] INFO 17:41:44,307 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-349-Data.db,]. 4,637,160 to 4,637,160 (~100% of original)
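The overlap condition discussed in the comments, that tombstones in a single-sstable compaction can only be dropped if no other sstable may hold the same keys, can be sketched as an interval-intersection check. The Range type and token bounds are illustrative, not Cassandra's actual interval structures.

```java
import java.util.List;

// Sketch of the key-overlap idea from the discussion: a single-sstable
// compaction can only drop tombstones when no other live sstable may
// contain the same keys; otherwise the tombstones must be kept and the
// rewrite produces an identical sstable, looping forever.
public class TombstonePurgeSketch {
    public static class Range {
        final long left, right; // inclusive token bounds, left <= right
        public Range(long left, long right) { this.left = left; this.right = right; }
        boolean intersects(Range other) {
            return left <= other.right && other.left <= right;
        }
    }

    // Tombstones in 'candidate' are safe to drop only when its key range
    // overlaps no other live sstable's range.
    public static boolean canDropTombstones(Range candidate, List<Range> others) {
        for (Range r : others)
            if (candidate.intersects(r))
                return false;
        return true;
    }
}
```

When `canDropTombstones` is false, recompacting the sstable cannot shrink it, which matches the "4,637,160 to 4,637,160 (~100% of original)" loop in the log above.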
[jira] [Commented] (CASSANDRA-4022) Compaction of hints can get stuck in a loop
[ https://issues.apache.org/jira/browse/CASSANDRA-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229548#comment-13229548 ] Yuki Morishita commented on CASSANDRA-4022: --- I tried using the timestamp to decide whether an sstable should be compacted, but that does not guarantee tombstone suppression. Tombstones are only dropped when their keys do not appear in any sstable outside the set being compacted. Currently, I think the only way to stop the compaction loop is to make sure the sstable of interest has no overlap with other sstables, so that its tombstones are actually dropped. Compaction of hints can get stuck in a loop --- Key: CASSANDRA-4022 URL: https://issues.apache.org/jira/browse/CASSANDRA-4022 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Yuki Morishita Priority: Critical Fix For: 1.2 Attachments: 4022.txt Not exactly sure how I caused this, as I was working on something else in trunk, but:
{noformat}
INFO 17:41:35,682 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-339-Data.db')]
INFO 17:41:36,430 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-340-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 5.912220MB/s. Time: 748ms.
INFO 17:41:36,431 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-340-Data.db')]
INFO 17:41:37,238 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-341-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 5.479976MB/s. Time: 807ms.
INFO 17:41:37,239 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-341-Data.db')]
INFO 17:41:38,163 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-342-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 4.786083MB/s. Time: 924ms.
INFO 17:41:38,164 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-342-Data.db')]
INFO 17:41:39,014 GC for ParNew: 274 ms for 1 collections, 541261288 used; max is 1024458752
INFO 17:41:39,151 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-343-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 4.485132MB/s. Time: 986ms.
INFO 17:41:39,151 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-343-Data.db')]
INFO 17:41:40,016 GC for ParNew: 308 ms for 1 collections, 585582200 used; max is 1024458752
INFO 17:41:40,200 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-344-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 4.223821MB/s. Time: 1,047ms.
INFO 17:41:40,201 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-344-Data.db')]
INFO 17:41:41,017 GC for ParNew: 252 ms for 1 collections, 617877904 used; max is 1024458752
INFO 17:41:41,178 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-345-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 4.526449MB/s. Time: 977ms.
INFO 17:41:41,179 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-345-Data.db')]
INFO 17:41:41,885 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-346-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 6.263938MB/s. Time: 706ms.
INFO 17:41:41,887 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-346-Data.db')]
INFO 17:41:42,617 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-347-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 6.066311MB/s. Time: 729ms.
INFO 17:41:42,618 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-347-Data.db')]
INFO 17:41:43,376 Compacted to [/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-348-Data.db,]. 4,637,160 to 4,637,160 (~100% of original) bytes for 1 keys at 5.834222MB/s. Time: 758ms.
INFO 17:41:43,377 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hd-348-Data.db')]
INFO
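The fix direction described in the comment above, purging tombstones from an sstable only when its key range overlaps no other sstable, can be sketched as follows. This is an illustrative model with hypothetical names (`canDropTombstones` is not Cassandra's API), and key ranges are simplified to plain token intervals:

```java
import java.util.List;

public class OverlapCheck {
    // Key ranges represented as inclusive [first, last] token intervals.
    static boolean overlaps(long aFirst, long aLast, long bFirst, long bLast) {
        return aFirst <= bLast && bFirst <= aLast;
    }

    // True when the candidate sstable overlaps none of the others, so a
    // single-sstable compaction can safely drop its expired tombstones:
    // no other sstable can still hold live data for the same keys.
    static boolean canDropTombstones(long first, long last, List<long[]> others) {
        for (long[] o : others)
            if (overlaps(first, last, o[0], o[1]))
                return false;
        return true;
    }
}
```

With an overlapping neighbor present, the tombstones must be kept and the same sstable gets rewritten unchanged, which is exactly the ~100%-of-original loop visible in the log above.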
[jira] [Commented] (CASSANDRA-4031) Exceptions during inserting empty string as column value on indexed column
[ https://issues.apache.org/jira/browse/CASSANDRA-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13228015#comment-13228015 ] Yuki Morishita commented on CASSANDRA-4031: --- The same error happens when searching for an empty value (='') on an indexed column. I agree with Sylvain that an empty row key should not be allowed. But for version 1.1, I think it is fine to use an empty row key on secondary indexes; otherwise we would have to perform a full data scan and filter out every row that has an empty value in the indexed column (or refuse any query that uses =''). I will fix this by allowing an empty-key DecoratedKey only for secondary indexes. Exceptions during inserting empty string as column value on indexed column -- Key: CASSANDRA-4031 URL: https://issues.apache.org/jira/browse/CASSANDRA-4031 Project: Cassandra Issue Type: Bug Reporter: Mariusz Assignee: Yuki Morishita Hi, I'm running a one-node cluster (the issue also occurs on another cluster, which has 2 nodes) on a snapshot from the cassandra-1.1 branch (I used commit 449e037195c3c504d7aca5088e8bc7bd5a50e7d0).
I have a simple CF; here is the definition of Test_CF:
{noformat}
[default@test_keyspace] describe Test_CF;
ColumnFamily: Test_CF
  Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
  Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
  Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  DC Local Read repair chance: 0.0
  Replicate on write: true
  Caching: KEYS_ONLY
  Bloom Filter FP chance: default
  Built indexes: [Test_CF.Test_CF_test_index_idx]
  Column Metadata:
    Column Name: test_index
      Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Index Name: Test_CF_test_index_idx
      Index Type: KEYS
  Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
  Compression Options:
    sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
{noformat}
I'm trying to add a new row (log from cassandra-cli; note that there is an index on test_index):
{noformat}
[default@test_keyspace] list Test_CF;
Using default limit of 100
0 Row Returned.
Elapsed time: 31 msec(s).
[default@test_keyspace] set Test_CF[absdsad3][test_index]='';
null
TimedOutException()
    at org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:15906)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:788)
    at org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:772)
    at org.apache.cassandra.cli.CliClient.executeSet(CliClient.java:894)
    at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:211)
    at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:219)
    at org.apache.cassandra.cli.CliMain.main(CliMain.java:346)
[default@test_keyspace] list Test_CF;
Using default limit of 100
---
RowKey: absdsad3
=> (column=test_index, value=, timestamp=1331298173009000)
1 Row Returned.
Elapsed time: 7 msec(s).
{noformat}
Exception from system.log:
{noformat}
INFO [FlushWriter:56] 2012-03-09 13:42:02,500 Memtable.java (line 291) Completed flushing /var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hc-3251-Data.db (2077 bytes)
ERROR [MutationStage:2291] 2012-03-09 13:42:22,232 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[MutationStage:2291,5,main]
java.lang.AssertionError
    at org.apache.cassandra.db.DecoratedKey.<init>(DecoratedKey.java:55)
    at org.apache.cassandra.db.index.SecondaryIndexManager.getIndexKeyFor(SecondaryIndexManager.java:294)
    at org.apache.cassandra.db.index.SecondaryIndexManager.applyIndexUpdates(SecondaryIndexManager.java:490)
    at org.apache.cassandra.db.Table.apply(Table.java:441)
    at org.apache.cassandra.db.Table.apply(Table.java:366)
    at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:275)
    at org.apache.cassandra.service.StorageProxy$6.runMayThrow(StorageProxy.java:446)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1228)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}
-- This message is automatically
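For reference, the AssertionError above comes from a non-empty-key invariant like the one sketched below; the fix proposed in the comment relaxes it only for keys built inside a secondary index. This is a hedged sketch with hypothetical names, not the actual DecoratedKey code:

```java
import java.nio.ByteBuffer;

public class IndexKeySketch {
    // Stand-in for the invariant the AssertionError enforces:
    // ordinary row keys must be non-empty.
    static boolean isValidRowKey(ByteBuffer key) {
        return key != null && key.remaining() > 0;
    }

    // Sketch of the proposed fix: a secondary index uses the indexed
    // column's *value* as the index row key, so an empty value must map
    // to a special-cased empty key instead of failing the assertion.
    static ByteBuffer indexKeyFor(ByteBuffer columnValue) {
        return columnValue.remaining() == 0
             ? ByteBuffer.allocate(0)      // permitted only for index CFs
             : columnValue.duplicate();
    }
}
```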
[jira] [Commented] (CASSANDRA-3442) TTL histogram for sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13218762#comment-13218762 ] Yuki Morishita commented on CASSANDRA-3442: --- {quote} I switched from checking instanceof ExpiringColumn to instanceof DeletedColumn, ... {quote} I don't see this switch in your patch, but as you described, tracking tombstones makes more sense. I think it's doable to build a histogram of drop times when writing an sstable and use that for single-sstable compaction. That way we can trigger compaction on a single sstable that contains, say, 20% ExpiringColumns and DeletedColumns, 50% of which can be dropped. {quote} Minor note: the new test seems fairly involved – what would we lose by just testing compaction of a single sstable w/ tombstones? {quote} Well, nothing :) A single-sstable compaction test is fine. TTL histogram for sstable metadata -- Key: CASSANDRA-3442 URL: https://issues.apache.org/jira/browse/CASSANDRA-3442 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Yuki Morishita Priority: Minor Labels: compaction Fix For: 1.2 Attachments: 3442-v3.txt, cassandra-1.1-3442.txt Under size-tiered compaction, you can generate large sstables that compact infrequently. With expiring columns mixed in, we could waste a lot of space in this situation. If we kept a TTL EstimatedHistogram in the sstable metadata, we could do a single-sstable compaction against sstables with over 20% (?) expired data. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
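The heuristic discussed in this ticket, estimating from per-sstable metadata how much of an sstable is droppable and compacting it alone past a threshold, might look like the sketch below. The bucket layout and the 20% threshold are illustrative, taken from the ticket's discussion, not from the actual patch:

```java
public class DroppableRatio {
    // bucketEndTimes[i] is the latest drop time covered by bucket i;
    // counts[i] is how many ExpiringColumns/DeletedColumns become
    // droppable by that time. Estimate the droppable fraction at `now`.
    static double droppableRatio(long now, long[] bucketEndTimes, long[] counts, long totalColumns) {
        long droppable = 0;
        for (int i = 0; i < counts.length; i++)
            if (bucketEndTimes[i] <= now)
                droppable += counts[i];
        return totalColumns == 0 ? 0 : (double) droppable / totalColumns;
    }

    static boolean shouldCompactSingleSSTable(double ratio) {
        return ratio >= 0.20;   // the 20% threshold floated in the ticket
    }
}
```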
[jira] [Commented] (CASSANDRA-2963) Add a convenient way to reset a node's schema
[ https://issues.apache.org/jira/browse/CASSANDRA-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13212716#comment-13212716 ] Yuki Morishita commented on CASSANDRA-2963: --- lgtm. Add a convenient way to reset a node's schema - Key: CASSANDRA-2963 URL: https://issues.apache.org/jira/browse/CASSANDRA-2963 Project: Cassandra Issue Type: New Feature Components: Tools Reporter: Brandon Williams Assignee: Yuki Morishita Priority: Minor Labels: lhf Fix For: 1.1.1 Attachments: CASSANDRA-2963-v3.patch, cassandra-1.1-2963-v2.txt, cassandra-1.1-2963.txt, system_reset_schema.txt People often encounter a schema disagreement where just one node is out of sync. To get it back in sync, they shutdown the node, move the Schema* and Migration* files out of the system ks, and then start it back up. Rather than go through this process, it would be nice if you could just tell the node to reset its schema. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3908) Bootstrapping node stalls. Bootstrapper thinks it is still streaming some sstables. The source nodes do not. Caused by IllegalStateException on source nodes.
[ https://issues.apache.org/jira/browse/CASSANDRA-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13212722#comment-13212722 ] Yuki Morishita commented on CASSANDRA-3908: --- Dominic, Can you reproduce this with DEBUG logging enabled? When this happened, did the bootstrapping node receive any data? (Are sstable files created inside your data directory?) Bootstrapping node stalls. Bootstrapper thinks it is still streaming some sstables. The source nodes do not. Caused by IllegalStateException on source nodes. - Key: CASSANDRA-3908 URL: https://issues.apache.org/jira/browse/CASSANDRA-3908 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.7 Environment: Ubuntu Reporter: Dominic Williams Labels: bootstrap, repair, streaming, streams Fix For: 1.0.8 Original Estimate: 24h Remaining Estimate: 24h This problem looks like CASSANDRA-2792. I am bootstrapping a new node into my cluster. There are two keyspaces, FightMyMonster and FMM_Studio. The first keyspace streams successfully, and the whole operation is probably at 99%+ when it stalls on some sstables in the much smaller FMM_Studio keyspace. Netstats on the bootstrapping node reports it is still streaming:
root:/var/lib/cassandra/data# nodetool -h localhost netstats
Mode: JOINING
Not sending any streams.
Streaming from: /192.168.1.9
   FMM_Studio: /var/lib/cassandra/data/FMM_Studio/AuthorClasses-hc-134-Data.db sections=1 progress=0/160 - 0%
   FMM_Studio: /var/lib/cassandra/data/FMM_Studio/AuthorClasses-hc-132-Data.db sections=1 progress=0/4422 - 0%
   FMM_Studio: /var/lib/cassandra/data/FMM_Studio/PartsData-hc-149-Data.db sections=1 progress=0/6158642 - 0%
Streaming from: /192.168.1.4
   FMM_Studio: /var/lib/cassandra/data/FMM_Studio/PartsData-hc-201-Data.db sections=1 progress=0/50172 - 0%
   FMM_Studio: /var/lib/cassandra/data/FMM_Studio/PartsData-hc-199-Data.db sections=1 progress=0/5140877 - 0%
   FMM_Studio: /var/lib/cassandra/data/FMM_Studio/PartsData-hc-202-Data.db sections=1 progress=0/147346 - 0%
   FMM_Studio: /var/lib/cassandra/data/FMM_Studio/Studio-hc-86-Data.db sections=1 progress=0/2014 - 0%
Pool Name     Active   Pending   Completed
Commands         n/a         0         478
Responses        n/a         0      496302
However, running netstats on the source nodes reports they are not streaming:
root:~# nodetool -h localhost netstats
Mode: NORMAL
Nothing streaming to /192.168.1.11
Not receiving any streams.
Pool Name     Active   Pending   Completed
Commands         n/a         0    13291116
Responses        n/a         0     8334754
Examination of the logs on the source nodes does NOT show an error for the specific sstables that are stalled. The start of streaming is duly logged:
pStage:1] 2012-02-14 01:40:58,746 Gossiper.java (line 804) InetAddress /192.168.1.11 is now UP
INFO [StreamStage:1] 2012-02-14 01:41:26,765 StreamOut.java (line 114) Beginning transfer to /192.168.1.11
INFO [StreamStage:1] 2012-02-14 01:41:26,765 StreamOut.java (line 95) Flushing memtables for [CFS(Keyspace='FMM_Studio', ColumnFamily='Classes'), CFS(Keyspace='FMM_Studio', ColumnFamily='PartsData'), CFS(Keyspace='FMM_Studio', ColumnFamily='Studio'), CFS(Keyspace='FMM_Studio', ColumnFamily='AuthorClasses')]...
INFO [StreamStage:1] 2012-02-14 01:41:26,825 StreamOut.java (line 160) Stream context metadata [/var/lib/cassandra/data/FMM_Studio/Classes-hc-144-Data.db sections=1 progress=0/2460670 - 0%, /var/lib/cassandra/data/FMM_Studio/PartsData-hc-149-Data.db sections=1 progress=0/6158642 - 0%, /var/lib/cassandra/data/FMM_Studio/AuthorClasses-hc-134-Data.db sections=1 progress=0/160 - 0%, /var/lib/cassandra/data/FMM_Studio/AuthorClasses-hc-132-Data.db sections=1 progress=0/4422 - 0%], 6 sstables.
INFO [StreamStage:1] 2012-02-14 01:41:26,825 StreamOutSession.java (line 203) Streaming to /192.168.1.11
INFO [StreamStage:1] 2012-02-14 01:41:26,835 StreamOut.java (line 114) Beginning transfer to /192.168.1.11
There does, however, appear to have been an IllegalStateException for another sstable in this keyspace (occurring a second or so after streaming began). Perhaps this broke the streaming...
ERROR [MiscStage:1] 2012-02-14 01:41:27,235 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MiscStage:1,5,main]
java.lang.IllegalStateException: target reports current file is /var/lib/cassandra/data/FMM_Studio/Classes-hc-144-Data.db but is null
[jira] [Commented] (CASSANDRA-3712) Can't cleanup after I moved a token.
[ https://issues.apache.org/jira/browse/CASSANDRA-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207021#comment-13207021 ] Yuki Morishita commented on CASSANDRA-3712: --- I ran my unit test enough times and see no errors. +1 Can't cleanup after I moved a token. Key: CASSANDRA-3712 URL: https://issues.apache.org/jira/browse/CASSANDRA-3712 Project: Cassandra Issue Type: Bug Components: Core Environment: java version 1.6.0_26 Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode) Ubuntu 10.04.2 LTS 64-Bit RAM: 2GB / 1GB free Data partition: 80% free on the most used server. Reporter: Herve Nicol Assignee: Yuki Morishita Fix For: 1.0.8 Attachments: 0001-Add-flush-and-cleanup-race-test.patch, 0002-Acquire-lock-when-updating-index.patch, 3712-v3.txt Before cleanup failed, I moved one node's token. My cluster had 10GB data on 2 nodes. Data repartition was bad, tokens were 165[...] and 155[...]. I moved 155 to 075[...], then adjusted to 076[...]. The moves were correctly processed, with no exception. But then, when I wanted to cleanup, it failed and keeps failing, on both nodes. Other maintenance procedures like repair, compact or scrub work. All the data is in the URLs CF.
Example session log: nodetool cleanup fails: $ ./nodetool --host cnode1 cleanup Error occured during cleanup java.util.concurrent.ExecutionException: java.lang.AssertionError at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:203) at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:237) at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:958) at org.apache.cassandra.service.StorageService.forceTableCleanup(StorageService.java:1604) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427) at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360) at 
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305) at sun.rmi.transport.Transport$1.run(Transport.java:159) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:155) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.AssertionError at org.apache.cassandra.db.Memtable.put(Memtable.java:136) at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:780) at org.apache.cassandra.db.index.keys.KeysIndex.deleteColumn(KeysIndex.java:82) at
[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207132#comment-13207132 ] Yuki Morishita commented on CASSANDRA-3772: --- Dave, The patch needs a rebase, but looking at it, I noticed the following:
{code}
private static byte[] hashMurmur3(ByteBuffer... data)
{
    HashFunction hashFunction = murmur3HF.get();
    Hasher hasher = hashFunction.newHasher();
    // snip
}
{code}
Isn't that slow if you instantiate a Hasher every time? I looked at the Guava source code but saw no way to reset it, so I guess the above is the only thing you can do... I also note that CASSANDRA-2975 will implement MurmurHash3, so I think it is better not to introduce an external library. What do you think? Evaluate Murmur3-based partitioner -- Key: CASSANDRA-3772 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Jonathan Ellis Assignee: Dave Brosius Fix For: 1.2 Attachments: try_murmur3.diff MD5 is a relatively heavyweight hash to use when we don't need cryptographic qualities, just a good output distribution. Let's see how much overhead we can save by using Murmur3 instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
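On the per-call allocation concern raised above: a common mitigation is caching the stateful hasher per thread. Guava's Hasher is single-use, so it cannot be cached that way, but the pattern is worth illustrating; the sketch below uses the JDK's resettable MessageDigest with MD5 (what RandomPartitioner hashes with today) purely as a stand-in, not as a suggestion to keep MD5:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class CachedDigest {
    // MessageDigest is stateful and not thread-safe, so cache one per
    // thread instead of allocating on every call in the hot hashing path.
    private static final ThreadLocal<MessageDigest> MD5 = ThreadLocal.withInitial(() -> {
        try { return MessageDigest.getInstance("MD5"); }
        catch (NoSuchAlgorithmException e) { throw new AssertionError(e); }
    });

    static byte[] hash(byte[] data) {
        MessageDigest md = MD5.get();
        md.reset();   // MessageDigest, unlike Guava's Hasher, is resettable
        return md.digest(data);
    }
}
```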
[jira] [Commented] (CASSANDRA-3712) Can't cleanup after I moved a token.
[ https://issues.apache.org/jira/browse/CASSANDRA-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205631#comment-13205631 ] Yuki Morishita commented on CASSANDRA-3712: --- bq. I can't get the new test to fail after a dozen tries. If there isn't a way to make it more robust (say, with explicit sleeps) maybe we should just leave that out. In my environment, it fails every third or fourth try. I cannot think of a better test program, so you can leave it out. bq. Currently the switch locking is done by the callers of the SIM methods, i.e., Table.apply and Table.indexRow. Locking at the column level is not sufficient there, but doing it in both places is redundant. So maybe the right place to lock here would be in the doCleanupCompaction method. You are right; the previous patch acquires the lock too often. I placed the lock/unlock inside doCleanupCompaction in the newer patch. In order to do that, I had to make Table.switchlock public, but I don't know if that is the right way. Can't cleanup after I moved a token. Key: CASSANDRA-3712 URL: https://issues.apache.org/jira/browse/CASSANDRA-3712 Project: Cassandra Issue Type: Bug Components: Core Environment: java version 1.6.0_26 Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode) Ubuntu 10.04.2 LTS 64-Bit RAM: 2GB / 1GB free Data partition: 80% free on the most used server. Reporter: Herve Nicol Assignee: Yuki Morishita Fix For: 1.0.8 Attachments: 0001-Add-flush-and-cleanup-race-test.patch, 0002-Acquire-lock-when-updating-index.patch Before cleanup failed, I moved one node's token. My cluster had 10GB data on 2 nodes. Data repartition was bad, tokens were 165[...] and 155[...]. I moved 155 to 075[...], then adjusted to 076[...]. The moves were correctly processed, with no exception. But then, when I wanted to cleanup, it failed and keeps failing, on both nodes. Other maintenance procedures like repair, compact or scrub work.
All the data is in the URLs CF. Example session log: nodetool cleanup fails: $ ./nodetool --host cnode1 cleanup Error occured during cleanup java.util.concurrent.ExecutionException: java.lang.AssertionError at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:203) at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:237) at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:958) at org.apache.cassandra.service.StorageService.forceTableCleanup(StorageService.java:1604) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427) at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360) at 
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305) at sun.rmi.transport.Transport$1.run(Transport.java:159) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:155) at
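The locking arrangement discussed in the comment above, many writers sharing one side of a read/write lock while cleanup's index update excludes them for its critical section, can be sketched roughly as below. This is a simplified illustration of the idea only, not the actual patch; in Cassandra the lock in question is Table.switchlock, and exactly which side each code path takes is what the patch negotiates:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SwitchLockSketch {
    // Many concurrent writers may proceed together (shared read side);
    // the cleanup-time index deletion takes the write side so it cannot
    // race with them (illustrative names, not Cassandra's).
    private final ReentrantReadWriteLock switchLock = new ReentrantReadWriteLock();
    private long applied = 0;

    void apply() {                        // normal write path
        switchLock.readLock().lock();
        try { applied++; }
        finally { switchLock.readLock().unlock(); }
    }

    void cleanupIndexEntry(Runnable deleteIndexEntry) {
        switchLock.writeLock().lock();    // excludes concurrent apply()
        try { deleteIndexEntry.run(); }
        finally { switchLock.writeLock().unlock(); }
    }

    long appliedCount() { return applied; }
}
```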
[jira] [Commented] (CASSANDRA-3442) TTL histogram for sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205640#comment-13205640 ] Yuki Morishita commented on CASSANDRA-3442: --- The patch also applies to trunk at the moment. TTL histogram for sstable metadata -- Key: CASSANDRA-3442 URL: https://issues.apache.org/jira/browse/CASSANDRA-3442 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Yuki Morishita Priority: Minor Labels: compaction Fix For: 1.2 Attachments: cassandra-1.1-3442.txt Under size-tiered compaction, you can generate large sstables that compact infrequently. With expiring columns mixed in, we could waste a lot of space in this situation. If we kept a TTL EstimatedHistogram in the sstable metadata, we could do a single-sstable compaction against sstables with over 20% (?) expired data. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3865) Cassandra-cli returns 'command not found' instead of syntax error
[ https://issues.apache.org/jira/browse/CASSANDRA-3865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13202428#comment-13202428 ] Yuki Morishita commented on CASSANDRA-3865: --- This is fixed in 1.0.7 (CASSANDRA-3714). Cassandra-cli returns 'command not found' instead of syntax error - Key: CASSANDRA-3865 URL: https://issues.apache.org/jira/browse/CASSANDRA-3865 Project: Cassandra Issue Type: Bug Components: Core Environment: DSE 1.0.5 Reporter: Eric Lubow Assignee: Yuki Morishita Priority: Trivial Labels: cassandra-cli When creating a column family from the output of 'show schema' with an index, there is a trailing comma after index_type: 0. The response to this is 'command not found', which is misleading because the command is found; there is just a syntax error: 'Command not found: `create column family $cfname ...`' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3821) Counters in super columns don't preserve correct values after cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13201768#comment-13201768 ] Yuki Morishita commented on CASSANDRA-3821: --- Here is my initial look at the issue (I might be wrong): concurrent counter mutation replay from the commitlog and AtomicSortedColumns inside Memtable seem to be the cause of the over-count. There is a race condition when adding a column to the memtable, and when it happens, AtomicSortedColumns calls {{IColumn#reconcile}} multiple times until the column is stored. That causes an over-count, since a counter column's {{reconcile}} is not an idempotent operation. Counters in super columns don't preserve correct values after cluster restart - Key: CASSANDRA-3821 URL: https://issues.apache.org/jira/browse/CASSANDRA-3821 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1 Environment: ubuntu, 'trunk' branch, used ccm to create a 3 node cluster with rf=3. A dtest was created to demonstrate. Reporter: Tyler Patterson Set up a 3-node cluster with rf=3. Create a counter super column family and increment a bunch of subcolumns 100 times each, with CL=QUORUM. Then wait a few seconds, restart the cluster, and read the values back. They almost all come back different (and higher) than they are supposed to be. Here are some extra things I've noticed: - Reading back the values before the restart always produces correct results. - Doing a nodetool flush before killing the cluster greatly improves the results, though sometimes a value will still be incorrect. You might have to run the test several times to see an incorrect value after a flush. - This problem doesn't happen on C* 1.0.7, unless you don't sleep between doing the increments and killing the cluster. Then it sometimes happens to a lesser degree. The dtest that demonstrates this issue is called super_counter_test.py.
Run it like this: nosetests --nocapture super_counter_test.py You'll need ccm from g...@github.com:tpatterson/ccm.git. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
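The over-count mechanism described in the comment above reduces to reconcile not being idempotent: if the memtable's atomic-update retry loop applies the same reconcile twice, a delta-summing merge drifts while a timestamp-wins merge does not. A minimal model of that contrast (not the real CounterColumn, which merges per-node shard contexts rather than plain longs):

```java
public class ReconcileSketch {
    // Counter-style merge: sums deltas, so applying it twice over-counts.
    static long reconcileCounter(long existing, long update) {
        return existing + update;            // not idempotent
    }

    // Regular-column-style merge: highest timestamp wins, so a retry
    // that applies it again is harmless: f(f(x, y), y) == f(x, y).
    static long reconcileTimestampWins(long existing, long update) {
        return Math.max(existing, update);   // idempotent
    }
}
```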
[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195229#comment-13195229 ] Yuki Morishita commented on CASSANDRA-3623: --- Vijay, Pavel, I ran a test similar to Pavel's on a physical machine (4-core 2.6GHz Xeon / 16GB RAM / Linux (Debian)) with trunk + 3623 (v3) + 3610 (v3). Cassandra runs on the following JVM:
{code}
$ java -version
java version 1.6.0_26
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
{code}
with JVM args:
{code}
-ea -javaagent:bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms6G -Xmx6G -Xmn2G -Xss128k -XX:+HeapDumpOnOutOfMemoryError -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true
{code}
I populated enough data with the stress tool, set crc_check_chance to 0.0, then flushed and compacted. Before each test run, I clean the page cache. The stress tool is run from another machine.
* data_access_mode: mmap
{code}
$ tools/stress/bin/stress -n 50 -S 1024 -I SnappyCompressor -o read -d node0
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
27487,2748,2748,0.01813206242951213,10
65226,3773,3773,0.013355361827287422,20
103145,3791,3791,0.01334416372171528,30
141092,3794,3794,0.013307842310530199,40
178981,3788,3788,0.013323840692549289,50
217062,3808,3808,0.013260129723484152,60
255020,3795,3795,0.01330330892038569,70
293075,3805,3805,0.013265825778478518,80
331046,3797,3797,0.013295910036606884,91
369059,3801,3801,0.01328353458027517,101
407030,3797,3797,0.01329540965473651,111
444920,3789,3789,0.013323251517550806,121
482894,3797,3797,0.013299231052825617,131
50,1710,1710,0.010978779375657664,136
END
{code}
* data_access_mode: standard
{code}
$ tools/stress/bin/stress -n 50 -S 1024 -I SnappyCompressor -o read -d node0
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
25474,2547,2547,0.019527989322446416,10
117046,9157,9157,0.005506617743415018,20
211863,9481,9481,0.005313298248204436,30
306773,9491,9491,0.005311305447265831,40
401107,9433,9433,0.005341160133143935,50
496051,9494,9494,0.005200739383215369,60
50,394,394,0.0019680931881488986,61
END
{code}
I ran the above several times (making sure each test was isolated) and observed about the same result for each iteration. Things I noticed when digging with VisualVM:
- Snappy decompression with direct byte buffers seems slightly faster, but its impact on overall read performance is negligible.
- CompressedMappedFileDataInput.reBuffer is called many times, especially from the path CMFDI.reset - CMFDI.seek - CMFDI.reBuffer.
- When using CMFDI, I observe higher CPU usage than with CRAR overall.
Right now I cannot find a reason to use mmapped byte buffers for compressed files.
use MMapedBuffer in CompressedSegmentedFile.getSegment -- Key: CASSANDRA-3623 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.1 Reporter: Vijay Assignee: Vijay Labels: compression Fix For: 1.1 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 0001-MMaped-Compression-segmented-file-v3.patch, 0001-MMaped-Compression-segmented-file.patch, 0002-tests-for-MMaped-Compression-segmented-file-v2.patch, 0002-tests-for-MMaped-Compression-segmented-file-v3.patch, CRC+MMapIO.xlsx, MMappedIO-Performance.docx CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem to use the MMap, hence higher CPU on the nodes and higher latencies on reads. This ticket is to implement the TODO mentioned in CompressedRandomAccessReader: // TODO refactor this to separate concept of buffer to avoid lots of read() syscalls and compression buffer but I think a separate class for the Buffer will be better. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
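For readers unfamiliar with the two data_access_mode settings benchmarked above, here is a minimal sketch of the difference between the two read paths. Class and method names are illustrative only, not the actual CompressedRandomAccessReader / CompressedMappedFileDataInput API:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Illustrative sketch: "standard" mode issues an explicit seek + read
// (a syscall per buffer refill), while "mmap" mode maps the file once and
// reads bytes straight out of the mapped region.
public class DataAccessModes
{
    // standard mode: RandomAccessFile seek + read
    public static byte readStandard(File file, long offset) throws IOException
    {
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        try
        {
            raf.seek(offset);
            return raf.readByte();
        }
        finally
        {
            raf.close();
        }
    }

    // mmap mode: read from a MappedByteBuffer, no per-read syscall
    public static byte readMapped(File file, long offset) throws IOException
    {
        FileChannel channel = new RandomAccessFile(file, "r").getChannel();
        try
        {
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return map.get((int) offset);
        }
        finally
        {
            channel.close();
        }
    }

    // tiny self-check: both paths must return the same byte
    public static boolean bothModesAgree()
    {
        try
        {
            File file = File.createTempFile("segment", ".db");
            file.deleteOnExit();
            FileOutputStream out = new FileOutputStream(file);
            out.write(new byte[]{ 10, 20, 30, 40 });
            out.close();
            return readStandard(file, 2) == 30 && readMapped(file, 2) == 30;
        }
        catch (IOException e)
        {
            return false;
        }
    }
}
```

Note that for compressed files the mapped bytes still have to be decompressed into a separate buffer, which may be why the syscall saving does not show up in the read throughput numbers above.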
[jira] [Commented] (CASSANDRA-3743) Lower memory consumption used by index sampling
[ https://issues.apache.org/jira/browse/CASSANDRA-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189912#comment-13189912 ] Yuki Morishita commented on CASSANDRA-3743: --- The new patch attached lgtm. Lower memory consumption used by index sampling --- Key: CASSANDRA-3743 URL: https://issues.apache.org/jira/browse/CASSANDRA-3743 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.0.0 Reporter: Radim Kolar Assignee: Radim Kolar Labels: optimization Fix For: 1.1 Attachments: cassandra-3743-codestyle.txt, cassandra-3743.txt Currently j.o.a.c.io.sstable.indexsummary is implemented as an ArrayList of KeyPosition (RowPosition key, long offset). I propose to change it to: RowPosition keys[] long offsets[] and use standard binary search on it. This will lower the number of Java objects used per entry from 2 (KeyPosition + RowPosition) to 1 (RowPosition). For building these arrays the convenient ArrayList class can be used, followed by a call to .toArray(). This is very important because index sampling uses a lot of memory on nodes with billions of rows -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
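The layout change proposed in the description (two parallel arrays plus a standard binary search) can be sketched like this, with String keys standing in for RowPosition; names are illustrative, not the actual IndexSummary code:

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the proposed layout: one keys[] array and one offsets[] array
// instead of an ArrayList of KeyPosition objects, halving the object count
// per sampled entry. Built from ArrayLists, then flattened with toArray().
public class IndexSummarySketch
{
    private final String[] keys;   // stand-in for RowPosition[]
    private final long[] offsets;

    public IndexSummarySketch(List<String> sampledKeys, List<Long> sampledOffsets)
    {
        keys = sampledKeys.toArray(new String[sampledKeys.size()]);
        offsets = new long[sampledOffsets.size()];
        for (int i = 0; i < offsets.length; i++)
            offsets[i] = sampledOffsets.get(i);
    }

    // offset of the greatest sampled key <= the search key, or -1 if none
    public long getOffset(String key)
    {
        int index = Arrays.binarySearch(keys, key);
        if (index < 0)
            index = -index - 2; // insertion point minus one
        return index < 0 ? -1 : offsets[index];
    }
}
```

Since a primitive long[] also avoids boxing the offsets, the saving per entry is a little better than the object count alone suggests.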
[jira] [Commented] (CASSANDRA-3743) Lower memory consumption used by index sampling
[ https://issues.apache.org/jira/browse/CASSANDRA-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189978#comment-13189978 ] Yuki Morishita commented on CASSANDRA-3743: --- I thought the patch was for the 1.0 branch. Is it going to trunk? Lower memory consumption used by index sampling --- Key: CASSANDRA-3743 URL: https://issues.apache.org/jira/browse/CASSANDRA-3743 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.0.0 Reporter: Radim Kolar Assignee: Radim Kolar Labels: optimization Fix For: 1.1 Attachments: cassandra-3743-codestyle.txt, cassandra-3743.txt Currently j.o.a.c.io.sstable.indexsummary is implemented as an ArrayList of KeyPosition (RowPosition key, long offset). I propose to change it to: RowPosition keys[] long offsets[] and use standard binary search on it. This will lower the number of Java objects used per entry from 2 (KeyPosition + RowPosition) to 1 (RowPosition). For building these arrays the convenient ArrayList class can be used, followed by a call to .toArray(). This is very important because index sampling uses a lot of memory on nodes with billions of rows -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3738) sstable2json doesn't work for secondary index sstable due to partitioner mismatch
[ https://issues.apache.org/jira/browse/CASSANDRA-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13190191#comment-13190191 ] Yuki Morishita commented on CASSANDRA-3738: --- v2 is much better. I tested with it and it worked as expected. +1. sstable2json doesn't work for secondary index sstable due to partitioner mismatch - Key: CASSANDRA-3738 URL: https://issues.apache.org/jira/browse/CASSANDRA-3738 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.6 Environment: linux Reporter: Shotaro Kamio Assignee: Yuki Morishita Labels: tools Fix For: 1.0.8 Attachments: 3738-v2.txt, cassandra-1.0-3738.txt sstable2json doesn't work for secondary index sstable in 1.0.6 while it worked in version 0.8.x. $ bin/sstable2json $DATA/data/Keyspace1/users-hc-1-Data.db { : [[birth_year,1973,1326450301786000], [full_name,Patrick Rothfuss,1326450301782000]], 1020: [[birth_year,1975,1326450301776000], [full_name,Brandon Sanderson,1326450301716000]] } $ bin/sstable2json $DATA/data/Keyspace1/users.users_birth_year_idx-hc-1-Data.db Exception in thread "main" java.lang.RuntimeException: Cannot open data/Keyspace1/users.users_birth_year_idx-hc-1 because partitioner does not match org.apache.cassandra.dht.RandomPartitioner at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:145) at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:123) at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:118) at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:360) at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:373) at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:431) I tested with the following sample data via cli: create keyspace Keyspace1; use Keyspace1; create column family users with comparator=UTF8Type and column_metadata=[{column_name: full_name, validation_class: UTF8Type}, {column_name: email, validation_class: UTF8Type}, {column_name: 
birth_year, validation_class: LongType, index_type: KEYS}, {column_name: state, validation_class: UTF8Type, index_type: KEYS}]; set users[1020][full_name] = 'Brandon Sanderson'; set users[1020][birth_year] = 1975; set users[][full_name] = 'Patrick Rothfuss'; set users[][birth_year] = 1973; get users where birth_year = 1973; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3743) Lower memory consumption used by index sampling
[ https://issues.apache.org/jira/browse/CASSANDRA-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189286#comment-13189286 ] Yuki Morishita commented on CASSANDRA-3743: --- +1 except coding style: - you should use 4 spaces instead of tabs - always place { and } on a new line Lower memory consumption used by index sampling --- Key: CASSANDRA-3743 URL: https://issues.apache.org/jira/browse/CASSANDRA-3743 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.0.0 Reporter: Radim Kolar Assignee: Radim Kolar Labels: optimization Fix For: 1.1 Attachments: cassandra-3743.txt Currently j.o.a.c.io.sstable.indexsummary is implemented as an ArrayList of KeyPosition (RowPosition key, long offset). I propose to change it to: RowPosition keys[] long offsets[] and use standard binary search on it. This will lower the number of Java objects used per entry from 2 (KeyPosition + RowPosition) to 1 (RowPosition). For building these arrays the convenient ArrayList class can be used, followed by a call to .toArray(). This is very important because index sampling uses a lot of memory on nodes with billions of rows -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3738) sstable2json doesn't work for secondary index sstable due to partitioner mismatch
[ https://issues.apache.org/jira/browse/CASSANDRA-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185689#comment-13185689 ] Yuki Morishita commented on CASSANDRA-3738: --- Which version did you use to create the sstable? If it's 1.0.4, the secondary index was created in the wrong way (CASSANDRA-3540). In that case, you have to drop and rebuild the index first. sstable2json doesn't work for secondary index sstable due to partitioner mismatch - Key: CASSANDRA-3738 URL: https://issues.apache.org/jira/browse/CASSANDRA-3738 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.6 Environment: linux Reporter: Shotaro Kamio sstable2json doesn't work for secondary index sstable in 1.0.6 while it worked in version 0.8.x. $ bin/sstable2json $DATA/data/Keyspace1/users-hc-1-Data.db { : [[birth_year,1973,1326450301786000], [full_name,Patrick Rothfuss,1326450301782000]], 1020: [[birth_year,1975,1326450301776000], [full_name,Brandon Sanderson,1326450301716000]] } $ bin/sstable2json $DATA/data/Keyspace1/users.users_birth_year_idx-hc-1-Data.db Exception in thread "main" java.lang.RuntimeException: Cannot open data/Keyspace1/users.users_birth_year_idx-hc-1 because partitioner does not match org.apache.cassandra.dht.RandomPartitioner at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:145) at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:123) at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:118) at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:360) at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:373) at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:431) I tested with the following sample data via cli: create keyspace Keyspace1; use Keyspace1; create column family users with comparator=UTF8Type and column_metadata=[{column_name: full_name, validation_class: UTF8Type}, {column_name: email, validation_class: UTF8Type}, 
{column_name: birth_year, validation_class: LongType, index_type: KEYS}, {column_name: state, validation_class: UTF8Type, index_type: KEYS}]; set users[1020][full_name] = 'Brandon Sanderson'; set users[1020][birth_year] = 1975; set users[][full_name] = 'Patrick Rothfuss'; set users[][birth_year] = 1973; get users where birth_year = 1973; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3715) Throw error when creating indexes with the same name as other CFs
[ https://issues.apache.org/jira/browse/CASSANDRA-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13184192#comment-13184192 ] Yuki Morishita commented on CASSANDRA-3715: --- I tried to reproduce this in 1.0.6 and the cassandra-1.0 branch, but both display the message correctly. Looks like this is fixed in 1.0.6 by the patch to CASSANDRA-3573. Throw error when creating indexes with the same name as other CFs - Key: CASSANDRA-3715 URL: https://issues.apache.org/jira/browse/CASSANDRA-3715 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.5 Reporter: Joaquin Casares Assignee: Yuki Morishita 0.8.8 throws: InvalidRequestException(why:Duplicate index name path) but 1.0.5 displays: null when running this: {noformat} create column family inode with column_type = 'Standard' and comparator = 'DynamicCompositeType(t=org.apache.cassandra.db.marshal.TimeUUIDType,s=org.apache.cassandra.db.marshal.UTF8Type,b=org.apache.cassandra.db.marshal.BytesType)' and default_validation_class = 'BytesType' and key_validation_class = 'BytesType' and rows_cached = 0.0 and row_cache_save_period = 0 and row_cache_keys_to_save = 2147483647 and keys_cached = 100.0 and key_cache_save_period = 14400 and read_repair_chance = 1.0 and gc_grace = 60 and min_compaction_threshold = 4 and max_compaction_threshold = 32 and replicate_on_write = true and row_cache_provider = 'ConcurrentLinkedHashCacheProvider' and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' and comment = 'Stores file meta data' and column_metadata = [ {column_name : 'b@70617468', validation_class : BytesType, index_name : 'path', index_type : 0 }, {column_name : 'b@73656e74696e656c', validation_class : BytesType, index_name : 'sentinel', index_type : 0 }, {column_name : 'b@706172656e745f70617468', validation_class : BytesType, index_name : 'parent_path', index_type : 0 }]; create column family inode_archive with column_type = 'Standard' and comparator = 
'DynamicCompositeType(t=org.apache.cassandra.db.marshal.TimeUUIDType,s=org.apache.cassandra.db.marshal.UTF8Type,b=org.apache.cassandra.db.marshal.BytesType)' and default_validation_class = 'BytesType' and key_validation_class = 'BytesType' and rows_cached = 0.0 and row_cache_save_period = 0 and row_cache_keys_to_save = 2147483647 and keys_cached = 100.0 and key_cache_save_period = 14400 and read_repair_chance = 1.0 and gc_grace = 60 and min_compaction_threshold = 4 and max_compaction_threshold = 32 and replicate_on_write = true and row_cache_provider = 'ConcurrentLinkedHashCacheProvider' and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' and comment = 'Stores file meta data' and column_metadata = [ {column_name : 'b@70617468', validation_class : BytesType, index_name : 'path', index_type : 0 }, {column_name : 'b@73656e74696e656c', validation_class : BytesType, index_name : 'sentinel', index_type : 0 }, {column_name : 'b@706172656e745f70617468', validation_class : BytesType, index_name : 'parent_path', index_type : 0 }]; {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179584#comment-13179584 ] Yuki Morishita commented on CASSANDRA-3668: --- Manish, So, does your test environment have just one cassandra node? How many sstable files are you loading? Do you observe any bottleneck while loading to the cassandra node? (cpu spikes, io wait...) Performance of sstableloader is affected in 1.0.x - Key: CASSANDRA-3668 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668 Project: Cassandra Issue Type: Bug Components: API Affects Versions: 1.0.7 Reporter: Manish Zope Assignee: Yuki Morishita Fix For: 1.0.7 Attachments: sstable-loader performance.txt Original Estimate: 48h Remaining Estimate: 48h One of my colleagues reported a bug regarding the degraded performance of the sstable generator and sstable loader. ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 As stated in the above issue, generator performance is rectified, but the performance of sstableloader is still an issue. 3589 is marked as a duplicate of 3618. Both issues show resolved status, but the problem with sstableloader still exists, so I am opening another issue so that the sstableloader problem does not go unnoticed. FYI: We have tested the generator part with the patch given in 3589. It's working fine. Please let us know if you require further inputs from our side. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179598#comment-13179598 ] Yuki Morishita commented on CASSANDRA-3668: --- CASSANDRA-3494 addresses that streaming is mono-threaded, and maybe that is causing the performance problem. It is planned for 1.1, but I can rebase it for the 1.0 branch for testing purposes. Performance of sstableloader is affected in 1.0.x - Key: CASSANDRA-3668 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668 Project: Cassandra Issue Type: Bug Components: API Affects Versions: 1.0.7 Reporter: Manish Zope Assignee: Yuki Morishita Fix For: 1.0.7 Attachments: sstable-loader performance.txt Original Estimate: 48h Remaining Estimate: 48h One of my colleagues reported a bug regarding the degraded performance of the sstable generator and sstable loader. ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 As stated in the above issue, generator performance is rectified, but the performance of sstableloader is still an issue. 3589 is marked as a duplicate of 3618. Both issues show resolved status, but the problem with sstableloader still exists, so I am opening another issue so that the sstableloader problem does not go unnoticed. FYI: We have tested the generator part with the patch given in 3589. It's working fine. Please let us know if you require further inputs from our side. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2805) Clean up mbeans that return Internal Cassandra types
[ https://issues.apache.org/jira/browse/CASSANDRA-2805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177744#comment-13177744 ] Yuki Morishita commented on CASSANDRA-2805: --- Nick, - I think double brace initialization should be avoided at CompactionInfo#asMap. (Yeah, I know, Java syntax sucks.) - I prefer {{Integer/Long.toString(val)}} over {{new Integer/Long(val).toString()}}. But otherwise +1. Clean up mbeans that return Internal Cassandra types Key: CASSANDRA-2805 URL: https://issues.apache.org/jira/browse/CASSANDRA-2805 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.1 Reporter: Nick Bailey Assignee: Nick Bailey Priority: Minor Labels: lhf Fix For: 1.1 Attachments: 0001-Don-t-return-internal-types-over-jmx.patch We need to clean up wherever we return internal cassandra objects over jmx. Namely CompactionInfo objects as well as Tokens. There may be a few other examples. This is bad for two reasons 1. You have to load the cassandra jar when querying these mbeans, which sucks. 2. Stuff breaks between versions when things are moved. For example, CASSANDRA-1610 moves the compaction related classes around. Any code querying those jmx mbeans in 0.8.0 is now broken in 0.8.2. (assuming those moves stay in the 0.8 branch) For things like CompactionInfo we should just expose more mbean methods or serialize to something standard like json. I'd like to target this for 0.8.2. Since we've already broken compatibility between 0.8.0 and 0.8.1, I'd say just fix this everywhere now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
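To illustrate the double-brace point in the comment above: double-brace initialization creates an anonymous HashMap subclass per use site (an extra class, plus a captured reference to the enclosing instance when used in instance context), while explicit put() calls do not. Method names and map contents here are made up for illustration, not the real CompactionInfo#asMap:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch contrasting double-brace initialization with plain construction.
public class AsMapSketch
{
    // double-brace style: the {{ }} defines an anonymous HashMap subclass
    // with an instance initializer -- the style the review asks to avoid
    public static Map<String, String> doubleBrace()
    {
        return new HashMap<String, String>() {{
            put("taskType", "COMPACTION");
            put("completed", Long.toString(42L));
        }};
    }

    // preferred: explicit construction; also uses Long.toString(val)
    // rather than new Long(val).toString() (no needless boxing)
    public static Map<String, String> explicit()
    {
        Map<String, String> map = new HashMap<String, String>();
        map.put("taskType", "COMPACTION");
        map.put("completed", Long.toString(42L));
        return map;
    }
}
```

Both methods produce equal maps, but only the double-brace version loads an extra anonymous class.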
[jira] [Commented] (CASSANDRA-2805) Clean up mbeans that return Internal Cassandra types
[ https://issues.apache.org/jira/browse/CASSANDRA-2805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177790#comment-13177790 ] Yuki Morishita commented on CASSANDRA-2805: --- +1 Clean up mbeans that return Internal Cassandra types Key: CASSANDRA-2805 URL: https://issues.apache.org/jira/browse/CASSANDRA-2805 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.1 Reporter: Nick Bailey Assignee: Nick Bailey Priority: Minor Labels: lhf Fix For: 1.1 Attachments: 0001-Don-t-return-internal-types-over-jmx.patch, 0001-Don-t-return-internal-types-over-jmx.patch We need to clean up wherever we return internal cassandra objects over jmx. Namely CompactionInfo objects as well as Tokens. There may be a few other examples. This is bad for two reasons 1. You have to load the cassandra jar when querying these mbeans, which sucks. 2. Stuff breaks between versions when things are moved. For example, CASSANDRA-1610 moves the compaction related classes around. Any code querying those jmx mbeans in 0.8.0 is now broken in 0.8.2. (assuming those moves stay in the 0.8 branch) For things like CompactionInfo we should just expose more mbean methods or serialize to something standard like json. I'd like to target this for 0.8.2. Since we've already broken compatibility between 0.8.0 and 0.8.1, I'd say just fix this everywhere now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3497) BloomFilter FP ratio should be configurable or size-restricted some other way
[ https://issues.apache.org/jira/browse/CASSANDRA-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175462#comment-13175462 ] Yuki Morishita commented on CASSANDRA-3497: --- +1 BloomFilter FP ratio should be configurable or size-restricted some other way - Key: CASSANDRA-3497 URL: https://issues.apache.org/jira/browse/CASSANDRA-3497 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Brandon Williams Assignee: Yuki Morishita Priority: Minor Fix For: 1.0.7 Attachments: 3497-v3.txt, 3497-v4.txt, CASSANDRA-1.0-3497.txt When you have a live dc and purely analytical dc, in many situations you can have less nodes on the analytical side, but end up getting restricted by having the BloomFilters in-memory, even though you have absolutely no use for them. It would be nice if you could reduce this memory requirement by tuning the desired FP ratio, or even just disabling them altogether. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3497) BloomFilter FP ratio should be configurable or size-restricted some other way
[ https://issues.apache.org/jira/browse/CASSANDRA-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175279#comment-13175279 ] Yuki Morishita commented on CASSANDRA-3497: --- Jonathan, Yours is what I first tried, but instead I did it in SSTR, and I think that is the best we can do for 1.0.x. One thing to point out is that it NPEs when fpChance is null and is converted to double at SSTableWriter.java#403. BloomFilter FP ratio should be configurable or size-restricted some other way - Key: CASSANDRA-3497 URL: https://issues.apache.org/jira/browse/CASSANDRA-3497 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Brandon Williams Assignee: Yuki Morishita Priority: Minor Fix For: 1.0.7 Attachments: 3497-v3.txt, CASSANDRA-1.0-3497.txt When you have a live dc and purely analytical dc, in many situations you can have less nodes on the analytical side, but end up getting restricted by having the BloomFilters in-memory, even though you have absolutely no use for them. It would be nice if you could reduce this memory requirement by tuning the desired FP ratio, or even just disabling them altogether. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3497) BloomFilter FP ratio should be configurable or size-restricted some other way
[ https://issues.apache.org/jira/browse/CASSANDRA-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174278#comment-13174278 ] Yuki Morishita commented on CASSANDRA-3497: --- The problem is that currently strategy_options for NTS is solely for replication settings, for example {DC1:2, DC2:2}. We could do something like strategy_options={DC1:2, DC2:1, DC2:fp(0.5)} or strategy_options={DC1:2, DC2:1,fp(0.5)} or something else preserving backward compatibility, but I think it's complicated. Maybe the easiest fix is to have a node-wide setting for the fp ratio in cassandra.yaml (with a JMX interface exposed), so each datacenter can set a different value? BloomFilter FP ratio should be configurable or size-restricted some other way - Key: CASSANDRA-3497 URL: https://issues.apache.org/jira/browse/CASSANDRA-3497 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Brandon Williams Assignee: Yuki Morishita Priority: Minor Fix For: 1.0.7 When you have a live dc and purely analytical dc, in many situations you can have less nodes on the analytical side, but end up getting restricted by having the BloomFilters in-memory, even though you have absolutely no use for them. It would be nice if you could reduce this memory requirement by tuning the desired FP ratio, or even just disabling them altogether. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3571) make stream throttling configurable at runtime with nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13173291#comment-13173291 ] Yuki Morishita commented on CASSANDRA-3571: --- +1 make stream throttling configurable at runtime with nodetool Key: CASSANDRA-3571 URL: https://issues.apache.org/jira/browse/CASSANDRA-3571 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3571-1.0-rebased-v2.txt, CASSANDRA-3571-1.0-rebased.txt, CASSANDRA-3571-1.0.txt Attaching patch that does this, against 1.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3571) make stream throttling configurable at runtime with nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172399#comment-13172399 ] Yuki Morishita commented on CASSANDRA-3571: --- Peter, Thanks for the update. I think it is better to have a getter because when I look up the StreamThroughputMbPerSec value using jconsole, I cannot see what the current value is. make stream throttling configurable at runtime with nodetool Key: CASSANDRA-3571 URL: https://issues.apache.org/jira/browse/CASSANDRA-3571 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3571-1.0-rebased.txt, CASSANDRA-3571-1.0.txt Attaching patch that does this, against 1.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3619) Use a separate writer thread for the SSTableSimpleUnsortedWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171094#comment-13171094 ] Yuki Morishita commented on CASSANDRA-3619: --- +1 Use a separate writer thread for the SSTableSimpleUnsortedWriter Key: CASSANDRA-3619 URL: https://issues.apache.org/jira/browse/CASSANDRA-3619 Project: Cassandra Issue Type: Improvement Components: Tools Affects Versions: 0.8.1 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Fix For: 1.1 Attachments: 0001-Add-separate-writer-thread.patch Currently SSTableSimpleUnsortedWriter doesn't use any threading. This means that the thread using it is blocked while the buffered data is written on disk and that nothing is written on disk while data is added. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3571) make stream throttling configurable at runtime with nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171124#comment-13171124 ] Yuki Morishita commented on CASSANDRA-3571: --- Peter, could you rebase this patch so I can review? make stream throttling configurable at runtime with nodetool Key: CASSANDRA-3571 URL: https://issues.apache.org/jira/browse/CASSANDRA-3571 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3571-1.0.txt Attaching patch that does this, against 1.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3494) Streaming is mono-threaded (the bulk loader too by extension)
[ https://issues.apache.org/jira/browse/CASSANDRA-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13162699#comment-13162699 ] Yuki Morishita commented on CASSANDRA-3494: --- Sure. Why not? Streaming is mono-threaded (the bulk loader too by extension) - Key: CASSANDRA-3494 URL: https://issues.apache.org/jira/browse/CASSANDRA-3494 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3494-0.8-prelim.txt, CASSANDRA-3494-1.0.txt The streamExecutor is defined as: {noformat} streamExecutor_ = new DebuggableThreadPoolExecutor("Streaming", Thread.MIN_PRIORITY); {noformat} In the meantime, in DebuggableThreadPoolExecutor.java: {noformat} public DebuggableThreadPoolExecutor(String threadPoolName, int priority) { this(1, Integer.MAX_VALUE, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), new NamedThreadFactory(threadPoolName, priority)); } {noformat} In other words, since the core pool size is 1 and the queue unbounded, tasks will always be queued and the executor is essentially mono-threaded. This is clearly not necessary since we already have stream throttling nowadays. And it could be a limiting factor in the case of the bulk loader. Besides, I would venture that this maybe was not the intention, because setting the max pool size to MAX_VALUE would suggest that the intention was to spawn threads on demand. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3494) Streaming is mono-threaded (the bulk loader too by extension)
[ https://issues.apache.org/jira/browse/CASSANDRA-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13162627#comment-13162627 ] Yuki Morishita commented on CASSANDRA-3494: --- I like Peter's idea about having one executor per destination. +1 on the patch against 1.0. (The patch for 0.8 needs a rebase but the change is essentially the same.) Streaming is mono-threaded (the bulk loader too by extension) - Key: CASSANDRA-3494 URL: https://issues.apache.org/jira/browse/CASSANDRA-3494 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3494-0.8-prelim.txt, CASSANDRA-3494-1.0.txt The streamExecutor is defined as: {noformat} streamExecutor_ = new DebuggableThreadPoolExecutor("Streaming", Thread.MIN_PRIORITY); {noformat} In the meantime, in DebuggableThreadPoolExecutor.java: {noformat} public DebuggableThreadPoolExecutor(String threadPoolName, int priority) { this(1, Integer.MAX_VALUE, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), new NamedThreadFactory(threadPoolName, priority)); } {noformat} In other words, since the core pool size is 1 and the queue is unbounded, tasks will always be queued and the executor is essentially mono-threaded. This is clearly not necessary since we already have stream throttling nowadays. And it could be a limiting factor in the case of the bulk loader. Besides, I would venture that this maybe was not the intention, because setting the max pool size to MAX_VALUE would suggest that the intention was to spawn threads on demand.
[jira] [Commented] (CASSANDRA-3045) Update ColumnFamilyOutputFormat to use new bulkload API
[ https://issues.apache.org/jira/browse/CASSANDRA-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13159992#comment-13159992 ] Yuki Morishita commented on CASSANDRA-3045: --- Brandon, the patches did apply; I was on the wrong branch. So the streaming part is good (I know the sender hangs when an IO error occurs, as Brandon mentioned above). For BulkRecordWriter, I don't think it works for SuperColumns. BulkRecordWriter#77: {code} this.isSuper = Boolean.valueOf(IS_SUPERCF); {code} should be {code} this.isSuper = Boolean.valueOf(conf.get(IS_SUPERCF)); {code} Update ColumnFamilyOutputFormat to use new bulkload API --- Key: CASSANDRA-3045 URL: https://issues.apache.org/jira/browse/CASSANDRA-3045 Project: Cassandra Issue Type: Improvement Components: Hadoop Reporter: Jonathan Ellis Assignee: Brandon Williams Priority: Minor Fix For: 1.1 Attachments: 0001-Remove-gossip-SS-requirement-from-BulkLoader.txt, 0002-Allow-DD-loading-without-yaml.txt, 0003-hadoop-output-support-for-bulk-loading.txt The bulk loading interface added in CASSANDRA-1278 is a great fit for Hadoop jobs.
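The distinction matters because `Boolean.valueOf` applied to the constant parses the *key name* itself, which is never the string "true", so the flag silently stays false. A minimal sketch of the failure mode (the key string below is a hypothetical stand-in, not the actual value of the constant):

```java
public class BooleanValueOfBug {
    // Hypothetical stand-in for BulkRecordWriter's config-key constant.
    static final String IS_SUPERCF = "mapreduce.output.columnfamily.issuper";

    public static void main(String[] args) {
        // Buggy form: parses the key name, which is not "true",
        // so this is always false regardless of the job config.
        boolean buggy = Boolean.valueOf(IS_SUPERCF);

        // Fixed form: parses the configured value looked up by that key
        // (here we inline what conf.get(IS_SUPERCF) would have returned).
        String configuredValue = "true";
        boolean fixed = Boolean.valueOf(configuredValue);

        System.out.println(buggy + " " + fixed); // prints "false true"
    }
}
```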
[jira] [Commented] (CASSANDRA-2967) Only bind JMX to the same IP address that is being used in Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13159183#comment-13159183 ] Yuki Morishita commented on CASSANDRA-2967: --- The patch works fine for the basic use case (binding the agent to a specified address), but we can no longer use SSL or password-based auth, which the out-of-the-box JMX agent supports via system properties. AFAIK you have to implement those in JmxRemoteListener like the one described below. http://docs.oracle.com/javase/6/docs/technotes/guides/management/agent.html#gdfvv I don't know how many people need those features (SSL/auth), but since the default JMX agent supports them, we should add them too. I think it would be better to provide a javaagent version of the module like the one Vijay implemented with SSL/auth support. Or maybe just give a pointer to his module somewhere? Only bind JMX to the same IP address that is being used in Cassandra Key: CASSANDRA-2967 URL: https://issues.apache.org/jira/browse/CASSANDRA-2967 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 0.8.2 Reporter: Joaquin Casares Assignee: Alex Araujo Priority: Minor Labels: lhf Attachments: cassandra-0.8-2967.txt, cassandra-1.0-2967-v2.txt, cassandra-1.0-2967-v3.txt, cassandra-1.0-2967-v4.txt The setup is 5 nodes in each data center, all running on one physical test machine, and even though the repair was run against the correct IP, the wrong JMX port was used. As a result, instead of repairing all 5 nodes I was repairing the same node 5 times. It would be nice if Cassandra's JMX would bind only to the IP address on which its thrift/RPC services are listening instead of binding to all IPs on the box.
[jira] [Commented] (CASSANDRA-3505) tombstone appears after truncate
[ https://issues.apache.org/jira/browse/CASSANDRA-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13155300#comment-13155300 ] Yuki Morishita commented on CASSANDRA-3505: --- This change in behavior comes from CASSANDRA-3424, which is fixed in 1.0.3. See Jonathan's comment in the above ticket (https://issues.apache.org/jira/browse/CASSANDRA-3424?focusedCommentId=13145954page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13145954) for the current behavior. tombstone appears after truncate Key: CASSANDRA-3505 URL: https://issues.apache.org/jira/browse/CASSANDRA-3505 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.3 Reporter: Cathy Daw Assignee: Yuki Morishita Fix For: 1.0.4 This bug is regarding the select after the 'truncate'. In 1.0.1 no rows would ever be returned, but now we are seeing a tombstone when querying for user1. Jake mentioned this may be related to CASSANDRA-2855. {code} cqlsh> CREATE KEYSPACE ks1 with ... strategy_class = ... 'org.apache.cassandra.locator.SimpleStrategy' ... and strategy_options:replication_factor=1; cqlsh> use ks1; cqlsh:ks1> cqlsh:ks1> CREATE COLUMNFAMILY users ( ... KEY varchar PRIMARY KEY, password varchar, gender varchar, ... session_token varchar, state varchar, birth_year bigint); cqlsh:ks1> INSERT INTO users (KEY, password) VALUES ('user1', 'ch@ngem3a'); cqlsh:ks1> UPDATE users SET gender = 'm', birth_year = '1980' WHERE KEY = 'user1'; cqlsh:ks1> SELECT * FROM users WHERE key='user1'; KEY | birth_year | gender | password | user1 | 1980 | m | ch@ngem3a | cqlsh:ks1> TRUNCATE users; // Expected, no rows returned cqlsh:ks1> SELECT * FROM users WHERE key='user1'; KEY | user1 | // Expected, no rows returned cqlsh:ks1> SELECT * FROM users; {code}
[jira] [Commented] (CASSANDRA-3514) CounterColumnFamily Compaction error (ArrayIndexOutOfBoundsException)
[ https://issues.apache.org/jira/browse/CASSANDRA-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13155603#comment-13155603 ] Yuki Morishita commented on CASSANDRA-3514: --- +1 CounterColumnFamily Compaction error (ArrayIndexOutOfBoundsException) -- Key: CASSANDRA-3514 URL: https://issues.apache.org/jira/browse/CASSANDRA-3514 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.3 Reporter: Eric Falcao Assignee: Sylvain Lebresne Labels: compaction Fix For: 0.8.8, 1.0.4 Attachments: 3514.patch On a single node, I'm seeing the following error when trying to compact a CounterColumnFamily. This appears to have started with version 1.0.3. nodetool -h localhost compact TRProd MetricsAllTime Error occured during compaction java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:250) at org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:1471) at org.apache.cassandra.service.StorageService.forceTableCompaction(StorageService.java:1523) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262) at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427) at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788) at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305) at sun.rmi.transport.Transport$1.run(Transport.java:159) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:155) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.cassandra.utils.ByteBufferUtil.arrayCopy(ByteBufferUtil.java:292) at org.apache.cassandra.db.context.CounterContext$ContextState.copyTo(CounterContext.java:792) at org.apache.cassandra.db.context.CounterContext.removeOldShards(CounterContext.java:709) at org.apache.cassandra.db.CounterColumn.removeOldShards(CounterColumn.java:260) at 
org.apache.cassandra.db.CounterColumn.mergeAndRemoveOldShards(CounterColumn.java:306) at org.apache.cassandra.db.CounterColumn.mergeAndRemoveOldShards(CounterColumn.java:271) at org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(PrecompactedRow.java:86) at org.apache.cassandra.db.compaction.PrecompactedRow.init(PrecompactedRow.java:102) at
[jira] [Commented] (CASSANDRA-3514) CounterColumnFamily Compaction error (ArrayIndexOutOfBoundsException)
[ https://issues.apache.org/jira/browse/CASSANDRA-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13155602#comment-13155602 ] Yuki Morishita commented on CASSANDRA-3514: --- +1 CounterColumnFamily Compaction error (ArrayIndexOutOfBoundsException) -- Key: CASSANDRA-3514 URL: https://issues.apache.org/jira/browse/CASSANDRA-3514 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.3 Reporter: Eric Falcao Assignee: Sylvain Lebresne Labels: compaction Fix For: 0.8.8, 1.0.4 Attachments: 3514.patch On a single node, I'm seeing the following error when trying to compact a CounterColumnFamily. This appears to have started with version 1.0.3. nodetool -h localhost compact TRProd MetricsAllTime Error occured during compaction java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:250) at org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:1471) at org.apache.cassandra.service.StorageService.forceTableCompaction(StorageService.java:1523) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262) at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427) at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788) at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305) at sun.rmi.transport.Transport$1.run(Transport.java:159) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:155) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.cassandra.utils.ByteBufferUtil.arrayCopy(ByteBufferUtil.java:292) at org.apache.cassandra.db.context.CounterContext$ContextState.copyTo(CounterContext.java:792) at org.apache.cassandra.db.context.CounterContext.removeOldShards(CounterContext.java:709) at org.apache.cassandra.db.CounterColumn.removeOldShards(CounterColumn.java:260) at 
org.apache.cassandra.db.CounterColumn.mergeAndRemoveOldShards(CounterColumn.java:306) at org.apache.cassandra.db.CounterColumn.mergeAndRemoveOldShards(CounterColumn.java:271) at org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(PrecompactedRow.java:86) at org.apache.cassandra.db.compaction.PrecompactedRow.init(PrecompactedRow.java:102) at
[jira] [Commented] (CASSANDRA-2967) Only bind JMX to the same IP address that is being used in Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13150184#comment-13150184 ] Yuki Morishita commented on CASSANDRA-2967: --- A few comments on the patch: * If you leave -Dcom.sun.management.jmxremote.port=$JMX_PORT inside cassandra-env.sh, the JVM still exposes JMX on all interfaces. * I applied the patch to 1.0, removed the above option from env.sh, and accessed JMX via nodetool, and I get the following error. Am I missing something? {code} # Inside patched cassandra.yaml, I set the following jmx_listen_address: 127.0.0.2 jmx_registry_port: 7200 jmx_server_port: 7100 $ bin/nodetool -h 127.0.0.2 -p 7100 ring Error connection to remote JMX agent! java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.CommunicationException [Root exception is java.rmi.NoSuchObjectException: no such object in table] at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:340) at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:248) at org.apache.cassandra.tools.NodeProbe.connect(NodeProbe.java:143) at org.apache.cassandra.tools.NodeProbe.init(NodeProbe.java:113) at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:585) Caused by: javax.naming.CommunicationException [Root exception is java.rmi.NoSuchObjectException: no such object in table] at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:101) at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:185) at javax.naming.InitialContext.lookup(InitialContext.java:392) at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1888) at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1858) at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:257) ... 4 more Caused by: java.rmi.NoSuchObjectException: no such object in table at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:255) at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:233) at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:359) at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source) at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:97) ... 9 more {code} Only bind JMX to the same IP address that is being used in Cassandra Key: CASSANDRA-2967 URL: https://issues.apache.org/jira/browse/CASSANDRA-2967 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 0.8.2 Reporter: Joaquin Casares Assignee: Alex Araujo Priority: Minor Labels: lhf Attachments: cassandra-0.8-2967.txt, cassandra-1.0-2967-v2.txt, cassandra-1.0-2967-v3.txt The setup is 5 nodes in each data center, all running on one physical test machine, and even though the repair was run against the correct IP, the wrong JMX port was used. As a result, instead of repairing all 5 nodes I was repairing the same node 5 times. It would be nice if Cassandra's JMX would bind only to the IP address on which its thrift/RPC services are listening instead of binding to all IPs on the box.
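For reference, the fixed-port custom agent that the Oracle guide above describes can be sketched roughly as below. This is an illustration only, not Cassandra's patch: the ports mirror the yaml snippet above, and the key point is that the JMXServiceURL pins both the RMI server port and the registry port so the stub stored in the registry points at a reachable endpoint.

```java
import java.lang.management.ManagementFactory;
import java.rmi.registry.LocateRegistry;
import java.util.HashMap;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

public class FixedPortJmxAgent {
    public static void main(String[] args) throws Exception {
        int registryPort = 7200; // jmx_registry_port in the yaml above
        int serverPort = 7100;   // jmx_server_port in the yaml above

        // The registry that clients look "jmxrmi" up in.
        LocateRegistry.createRegistry(registryPort);

        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        // Pin the exported RMI server object to serverPort; without this,
        // the stub would point at an ephemeral port.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi://127.0.0.1:" + serverPort
                + "/jndi/rmi://127.0.0.1:" + registryPort + "/jmxrmi");

        JMXConnectorServer cs = JMXConnectorServerFactory
                .newJMXConnectorServer(url, new HashMap<String, Object>(), mbs);
        cs.start();
        System.out.println("listening: " + cs.isActive());
        cs.stop();
    }
}
```

SSL and password auth would be layered on by populating the environment map (e.g. with an RMI socket factory and an authenticator), which is exactly the part the default agent's system properties otherwise provide.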
[jira] [Commented] (CASSANDRA-3478) Add support for sstable forwards-compatibility
[ https://issues.apache.org/jira/browse/CASSANDRA-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147395#comment-13147395 ] Yuki Morishita commented on CASSANDRA-3478: --- +1 on the change, but it breaks BootstrapTest. Add support for sstable forwards-compatibility -- Key: CASSANDRA-3478 URL: https://issues.apache.org/jira/browse/CASSANDRA-3478 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Minor Fix For: 1.0.3 Attachments: 3478.txt Following on to CASSANDRA-3470.
[jira] [Commented] (CASSANDRA-3178) Counter shard merging is not thread safe
[ https://issues.apache.org/jira/browse/CASSANDRA-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13143848#comment-13143848 ] Yuki Morishita commented on CASSANDRA-3178: --- +1. The 1.0 patch also works fine. Counter shard merging is not thread safe Key: CASSANDRA-3178 URL: https://issues.apache.org/jira/browse/CASSANDRA-3178 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.5 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: counters Fix For: 0.8.8 Attachments: 0001-Move-shard-merging-completely-to-compaction-1.0.patch, 0001-Move-shard-merging-completely-to-compaction-v2.patch, 0001-Move-shard-merging-completely-to-compaction.patch, 0002-Simplify-improve-shard-merging-code-1.0.patch, 0002-Simplify-improve-shard-merging-code-v2.patch, 0002-Simplify-improve-shard-merging-code.patch The first part of the counter shard merging process is done during counter replication. This was done there because it requires that all replicas are made aware of the merging (we could rely only on nodetool repair for that, but that seems much too fragile; it's better as just a safety net). However, this part isn't thread safe, as multiple threads can do the merging for the same shard at the same time (which shouldn't really corrupt the counter value per se, but results in an incorrect context). Synchronizing that part of the code would be very costly in terms of performance, so instead I propose to move the part of the shard merging done during replication to compaction. It's a better place anyway. The only downside is that it means compaction will sometimes send mutations to other nodes as a side effect, which doesn't feel very clean but is probably not a big deal either.
[jira] [Commented] (CASSANDRA-3178) Counter shard merging is not thread safe
[ https://issues.apache.org/jira/browse/CASSANDRA-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142240#comment-13142240 ] Yuki Morishita commented on CASSANDRA-3178: --- LGTM on the 0.8 branch. I think it's safe to apply this on 1.0, but before that I want to make sure it works. Could you modify the patch so that I can test on 1.0? Counter shard merging is not thread safe Key: CASSANDRA-3178 URL: https://issues.apache.org/jira/browse/CASSANDRA-3178 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.5 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: counters Fix For: 0.8.8 Attachments: 0001-Move-shard-merging-completely-to-compaction-v2.patch, 0001-Move-shard-merging-completely-to-compaction.patch, 0002-Simplify-improve-shard-merging-code-v2.patch, 0002-Simplify-improve-shard-merging-code.patch The first part of the counter shard merging process is done during counter replication. This was done there because it requires that all replicas are made aware of the merging (we could rely only on nodetool repair for that, but that seems much too fragile; it's better as just a safety net). However, this part isn't thread safe, as multiple threads can do the merging for the same shard at the same time (which shouldn't really corrupt the counter value per se, but results in an incorrect context). Synchronizing that part of the code would be very costly in terms of performance, so instead I propose to move the part of the shard merging done during replication to compaction. It's a better place anyway. The only downside is that it means compaction will sometimes send mutations to other nodes as a side effect, which doesn't feel very clean but is probably not a big deal either.
[jira] [Commented] (CASSANDRA-2997) Enhance human-readability of snapshot names created by drop column family
[ https://issues.apache.org/jira/browse/CASSANDRA-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13133541#comment-13133541 ] Yuki Morishita commented on CASSANDRA-2997: --- +1 on v2 Enhance human-readability of snapshot names created by drop column family - Key: CASSANDRA-2997 URL: https://issues.apache.org/jira/browse/CASSANDRA-2997 Project: Cassandra Issue Type: Improvement Components: Tools Affects Versions: 0.7.0 Reporter: Eric Gilmore Assignee: Yuki Morishita Priority: Minor Fix For: 1.0.1 Attachments: 0001-Add-CF-name-to-snapshot-dir-when-dropping.patch, 2997-v2.txt Currently when you drop a column family, a snapshot is automatically created in a directory named with the timestamp of the drop. Clever folk will find a way to understand the timestamps and locate particular snapshots, but it is not as effortless as it could be if part or all of the CF name was included in the snapshot name. Any strategy to make the snapshot name more user-friendly and easy to find would be helpful.