Re: jmxterm "#NullPointerException: No such PID "

2018-09-18 Thread Yuki Morishita
This is because Cassandra sets the -XX:+PerfDisableSharedMem JVM option by default.
This prevents tools such as jps from listing JVM processes.
See https://issues.apache.org/jira/browse/CASSANDRA-9242 for details.

You can work around it by doing what Riccardo said.
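If you really need jps (and jmxterm's jvms) to see the Cassandra process, the other option is to remove that flag and restart, at the cost of the problem described in CASSANDRA-9242. Roughly, assuming a 3.11 package install where the flag lives in /etc/cassandra/jvm.options (adjust the path for your layout):

$ sudo sed -i 's/^-XX:+PerfDisableSharedMem/#&/' /etc/cassandra/jvm.options
$ sudo systemctl restart cassandra
$ jps -l    # CassandraDaemon should now be listed, so jmxterm's "jvms" will see it too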
On Tue, Sep 18, 2018 at 9:41 PM Philip Ó Condúin
 wrote:
>
> Hi Riccardo,
>
> Yes that works for me:
>
> Welcome to JMX terminal. Type "help" for available commands.
> $> open localhost:7199
> #Connection to localhost:7199 is opened
> $>domains
> #following domains are available
> JMImplementation
> ch.qos.logback.classic
> com.sun.management
> java.lang
> java.nio
> java.util.logging
> org.apache.cassandra.db
> org.apache.cassandra.hints
> org.apache.cassandra.internal
> org.apache.cassandra.metrics
> org.apache.cassandra.net
> org.apache.cassandra.request
> org.apache.cassandra.service
> $>
>
> I can work with this :-)
>
> Not sure why the JVM is not listed when issuing the jvms command, maybe it's a
> server setting; our production servers find the Cass JVM.  I've spent half
> the day trying to figure it out so I think I'll just put it to bed now and
> work on something else.
>
> Regards,
> Phil
>
> On Tue, 18 Sep 2018 at 13:34, Riccardo Ferrari  wrote:
>>
>> Hi Philip,
>>
>> I've used jmxterm myself without any particular problems. On my
>> systems too, I don't get the cassandra daemon listed when issuing the `jvms`
>> command, but I never spent much time investigating it.
>> Assuming you have not changed anything relevant in the cassandra-env.sh you 
>> can connect using jmxterm by issuing 'open 127.0.0.1:7199'. Would that work 
>> for you?
>>
>> HTH,
>>
>>
>>
>> On Tue, Sep 18, 2018 at 2:00 PM, Philip Ó Condúin  
>> wrote:
>>>
>>> Further info:
>>>
>>> I would expect to see the following when I list the jvm's:
>>>
>>> Welcome to JMX terminal. Type "help" for available commands.
>>> $>jvms
>>> 25815(m) - org.apache.cassandra.service.CassandraDaemon
>>> 17628( ) - jmxterm-1.0-alpha-4-uber.jar
>>>
>>> But jmxterm is not picking up the JVM for Cassandra for some reason.
>>>
>>> Can someone point me in the right direction?  Are there settings in the
>>> cassandra-env.sh file I need to amend to get jmxterm to find the Cass JVM?
>>>
>>> I'm not finding much about it on Google.
>>>
>>> Thanks,
>>> Phil
>>>
>>>
>>> On Tue, 18 Sep 2018 at 12:09, Philip Ó Condúin  
>>> wrote:

 Hi All,

 I need a little advice.  I'm trying to access the JMX terminal using 
 jmxterm-1.0-alpha-4-uber.jar with a very simple default install of C* 
 3.11.3

 I keep getting the following:

 [cassandra@reaper-1 conf]$ java -jar jmxterm-1.0-alpha-4-uber.jar
 Welcome to JMX terminal. Type "help" for available commands.
 $>open 1666
 #NullPointerException: No such PID 1666
 $>

 C* is running with a PID of 1666.  I've tried setting JMX_LOCAL=no and 
 have even created a new VM to test it.

 Does anyone know what I might be doing wrong here?

 Kind Regards,
 Phil

>>>
>>>
>>> --
>>> Regards,
>>> Phil
>>
>>
>
>
> --
> Regards,
> Phil

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: why I got error "Could not retrieve endpoint rangs" when I run sstableloader?

2015-12-28 Thread Yuki Morishita
You only need the patch for sstableloader.
You don't have to upgrade your Cassandra servers at all.

So,

1. fetch the latest cassandra-2.1 source
$ git clone https://git-wip-us.apache.org/repos/asf/cassandra.git
$ cd cassandra
$ git checkout origin/cassandra-2.1
2. build it
$ ant
3. use sstableloader you just built
$ bin/sstableloader 
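For reference, the full invocation would look something like this (mirroring the options from your original command; adjust credentials, address and path to your setup):

$ bin/sstableloader -u user -pw password -v -d 172.21.0.131 ./currentdata/keyspace/table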



On Mon, Dec 28, 2015 at 6:03 PM, 土卜皿 <pengcz.n...@gmail.com> wrote:
> hi, Yuki
>Thank you very much!
> The issue's description almost fits my case!
> 1. My Cassandra version is 2.1.11
>  2.  My table has several columns with collection types
>  3.  Before it failed this time, I could use sstableloader to load the data
> into this table, but
>  I got this error after I dropped one column with a collection type and
> inserted a column with an int type
> Do you think my problem will be resolved if I update the version to
> 2.1.13?
>
> Also, my table already has 560 million records. So, to resolve this,
> do I only need to update to the new version of the C* jar
> and restart Cassandra?
>
> Dillon
>
> 2015-12-29 7:36 GMT+08:00 Yuki Morishita <mor.y...@gmail.com>:
>>
>> This is known issue.
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-10700
>>
>> It is fixed in the not-yet-released version 2.1.13.
>> So, you need to build from the latest cassandra-2.1 branch to try it.
>>
>>
>> On Mon, Dec 28, 2015 at 5:28 PM, 土卜皿 <pengcz.n...@gmail.com> wrote:
>> > hi, all
>> >I used the sstableloader many times successfully, but I got the
>> > following
>> > error:
>> >
>> > [root@localhost pengcz]# /usr/local/cassandra/bin/sstableloader -u user
>> > -pw
>> > password -v -d 172.21.0.131 ./currentdata/keyspace/table
>> >
>> > Could not retrieve endpoint ranges:
>> > java.lang.IllegalArgumentException
>> > java.lang.RuntimeException: Could not retrieve endpoint ranges:
>> > at
>> >
>> > org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:338)
>> > at
>> >
>> > org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:156)
>> > at
>> > org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:106)
>> > Caused by: java.lang.IllegalArgumentException
>> > at java.nio.Buffer.limit(Buffer.java:267)
>> > at
>> >
>> > org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543)
>> > at
>> >
>> > org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:124)
>> > at
>> >
>> > org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:101)
>> > at
>> >
>> > org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:30)
>> > at
>> >
>> > org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:50)
>> > at
>> >
>> > org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:68)
>> > at
>> >
>> > org.apache.cassandra.cql3.UntypedResultSet$Row.getMap(UntypedResultSet.java:287)
>> > at
>> >
>> > org.apache.cassandra.config.CFMetaData.fromSchemaNoTriggers(CFMetaData.java:1833)
>> > at
>> >
>> > org.apache.cassandra.config.CFMetaData.fromThriftCqlRow(CFMetaData.java:1126)
>> > at
>> >
>> > org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:330)
>> > ... 2 more
>> >
>> > I don't know whether this error is related to one of the cluster nodes'
>> > Linux
>> > crash?
>> >
>> > Any advice will be appreciated!
>> >
>> > Dillon Peng
>>
>>
>>
>> --
>> Yuki Morishita
>>  t:yukim (http://twitter.com/yukim)
>
>



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: why I got error "Could not retrieve endpoint rangs" when I run sstableloader?

2015-12-28 Thread Yuki Morishita
This is known issue.

https://issues.apache.org/jira/browse/CASSANDRA-10700

It is fixed in the not-yet-released version 2.1.13.
So, you need to build from the latest cassandra-2.1 branch to try it.


On Mon, Dec 28, 2015 at 5:28 PM, 土卜皿 <pengcz.n...@gmail.com> wrote:
> hi, all
>I used the sstableloader many times successfully, but I got the following
> error:
>
> [root@localhost pengcz]# /usr/local/cassandra/bin/sstableloader -u user -pw
> password -v -d 172.21.0.131 ./currentdata/keyspace/table
>
> Could not retrieve endpoint ranges:
> java.lang.IllegalArgumentException
> java.lang.RuntimeException: Could not retrieve endpoint ranges:
> at
> org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:338)
> at
> org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:156)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:106)
> Caused by: java.lang.IllegalArgumentException
> at java.nio.Buffer.limit(Buffer.java:267)
> at
> org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543)
> at
> org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:124)
> at
> org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:101)
> at
> org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:30)
> at
> org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:50)
> at
> org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:68)
> at
> org.apache.cassandra.cql3.UntypedResultSet$Row.getMap(UntypedResultSet.java:287)
> at
> org.apache.cassandra.config.CFMetaData.fromSchemaNoTriggers(CFMetaData.java:1833)
> at
> org.apache.cassandra.config.CFMetaData.fromThriftCqlRow(CFMetaData.java:1126)
> at
> org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:330)
> ... 2 more
>
> I don't know whether this error is related to one of the cluster nodes' Linux
> crash?
>
> Any advice will be appreciated!
>
> Dillon Peng



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: java.io.IOException: Failed during snapshot creation

2015-05-25 Thread Yuki Morishita
Hi, it may be timing out while waiting for the snapshot to complete.
See https://issues.apache.org/jira/browse/CASSANDRA-8696 for a workaround.

On Mon, May 25, 2015 at 10:44 AM, Mark Reddy mark.l.re...@gmail.com wrote:
 Can you check your logs for any other error messages around the time of
 the repair? Something to look for would be "Error occurred during snapshot
 phase".


 Regards,
 Mark

 On 25 May 2015 at 14:56, Sachin PK sachinpray...@gmail.com wrote:

 Hey, I'm new to Cassandra. I have a 4-node cluster with each node a 16GB VPS.
 Initially I had one seed node; I added one of the existing nodes as a seed
 node and restarted the nodes one by one. After that, one of my nodes went down. I
 ran nodetool repair on it, and when I checked the log I could find some errors


 ERROR [AntiEntropySessions:3] 2015-05-25 09:26:12,905
 RepairSession.java:303 - [repair #968bce30-02e1-11e5-9310-45191f4c93ae]
 session completed with the following error
 java.io.IOException: Failed during snapshot creation.
 at
 org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344)
 ~[apache-cassandra-2.1.5.jar:2.1.5]
 at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146)
 ~[apache-cassandra-2.1.5.jar:2.1.5]
 at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
 ~[guava-16.0.jar:na]
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 [na:1.7.0_80]
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 [na:1.7.0_80]
 at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
 ERROR [AntiEntropySessions:3] 2015-05-25 09:26:12,907
 CassandraDaemon.java:223 - Exception in thread
 Thread[AntiEntropySessions:3,5,RMI Runtime]
 java.lang.RuntimeException: java.io.IOException: Failed during snapshot
 creation.
 at com.google.common.base.Throwables.propagate(Throwables.java:160)
 ~[guava-16.0.jar:na]
 at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
 ~[apache-cassandra-2.1.5.jar:2.1.5]
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 ~[na:1.7.0_80]
 at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_80]
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 ~[na:1.7.0_80]
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 [na:1.7.0_80]
 at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
 Caused by: java.io.IOException: Failed during snapshot creation.
 at
 org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344)
 ~[apache-cassandra-2.1.5.jar:2.1.5]
 at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146)
 ~[apache-cassandra-2.1.5.jar:2.1.5]
 at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
 ~[guava-16.0.jar:na]
 ... 3 common frames omitted






-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Nodetool on 2.1.5

2015-05-21 Thread Yuki Morishita
For security reasons, Cassandra changed JMX to listen on localhost only
since version 2.0.14/2.1.4.
From NEWS.txt:

The default JMX config now listens to localhost only. You must enable
the other JMX flags in cassandra-env.sh manually. 
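As a rough sketch of what that means (double-check the variable names against your own cassandra-env.sh, since they can differ slightly between versions):

# cassandra-env.sh (illustrative)
LOCAL_JMX=no
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=true"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password"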

On Thu, May 21, 2015 at 11:05 AM, Walsh, Stephen
stephen.wa...@aspect.com wrote:
 Just wondering if anyone else is seeing this issue on the nodetool after
 installing 2.1.5





 This works

 nodetool -h 127.0.0.1 cfstats keyspace.table



 This works

 nodetool -h localhost cfstats keyspace.table



 This works

 nodetool cfstats keyspace.table



 This doesn’t work

 nodetool -h 192.168.1.10 cfstats keyspace.table

 nodetool: Failed to connect to '192.168.1.10:7199' - ConnectException:
 'Connection refused'.



 Where 192.168.1.10 is the machine IP,

 All firewalls are disabled and it worked fine on version 2.0.13



 This has happened on both of our upgraded clusters.

 Also, we are no longer able to view the “CF: Total MemTable Size” and “flushes
 pending” metrics in Ops Center 5.1.1; is this a related issue?






-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Tables showing up as our_table-147a2090ed4211e480153bc81e542ebd/ in data dir

2015-04-29 Thread Yuki Morishita
The directory structure changed in 2.1 to prevent various problems
caused by DROPping and re-CREATEing the same table
(https://issues.apache.org/jira/browse/CASSANDRA-5202).
From NEWS.txt:

2.1
===
New features

   ...
   - SSTable data directory name is slightly changed. Each directory will
  have hex string appended after CF name, e.g.
  ks/cf-5be396077b811e3a3ab9dc4b9ac088d/
  This hex string part represents unique ColumnFamily ID.
  Note that existing directories are used as is, so only newly created
  directories after upgrade have new directory name format.
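So a data directory listing on 2.1 looks roughly like this (paths are illustrative; the hex suffix is the table's unique ID):

$ ls /var/lib/cassandra/data/my_keyspace/
our_table-147a2090ed4211e480153bc81e542ebd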

On Wed, Apr 29, 2015 at 2:04 PM, Donald Smith
donald.sm...@audiencescience.com wrote:
 Using 2.1.4, tables in our data/ directory are showing up as


 our_table-147a2090ed4211e480153bc81e542ebd/


 instead of as


  our_table/


 Why would that happen? We're also seeing lagging compactions and high cpu
 usage.


  Thanks, Don



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Confirming Repairs

2015-04-24 Thread Yuki Morishita
In 3.0, we have a system table that stores repair history.
https://issues.apache.org/jira/browse/CASSANDRA-5839
So you can just use CQL to check when a given ks/cf was repaired.
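A rough sketch of such a check, assuming the repair_history table the ticket adds under the system_distributed keyspace (column names may differ slightly by version):

$ cqlsh -e "SELECT keyspace_name, columnfamily_name, started_at, finished_at, status
            FROM system_distributed.repair_history
            WHERE keyspace_name = 'my_ks' AND columnfamily_name = 'my_cf' LIMIT 10;"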

On Sat, Apr 25, 2015 at 5:23 AM, Jeff Ferland j...@tubularlabs.com wrote:
 The short answer is I used a logstash query to get a list of all repair
 ranges started and all ranges completed. I then matched the UUID of the
 start message to the end message and printed out all the ranges that didn't
 succeed. Then one needs to go a step further than I've coded and match the
 remaining ranges to at least one node in the ring that would hold a replica
 for the keyspace.

 Does anybody have a better way to handle this yet? Will the 3.0 series
 logging of repairs to the system keyspace be able to give me this same kind
 of confirmation that everything in a given keyspace was last repaired as of
 $DATE, or that to repair everything as of $DATE I must repair ranges $X?

 https://gist.github.com/autocracy/9467eaaff581ff24334c



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: nodetool cleanup error

2015-03-30 Thread Yuki Morishita
Looks like the issue is https://issues.apache.org/jira/browse/CASSANDRA-9070.

On Mon, Mar 30, 2015 at 6:25 PM, Robert Coli rc...@eventbrite.com wrote:
 On Mon, Mar 30, 2015 at 4:21 PM, Amlan Roy amlan@cleartrip.com wrote:

 Thanks for the reply. I have upgraded to 2.0.13. Now I get the following
 error.


 If cleanup is still excepting for you on 2.0.13 with some sstables you have,
 I would strongly consider :

 1) file a JIRA (http://issues.apache.org) and attach / offer the sstables
 for debugging
 2) let the list know the JIRA id of the ticket

 =Rob




-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Streaming failures during bulkloading data using CqlBulkOutputFormat

2015-03-05 Thread Yuki Morishita
Thanks!

On Thu, Mar 5, 2015 at 11:10 AM, Aby Kuruvilla
aby.kuruvi...@envisagesystems.com wrote:
 Thanks Yuki, have created a JIRA ticket

 https://issues.apache.org/jira/browse/CASSANDRA-8924

 On Thu, Mar 5, 2015 at 10:34 AM, Yuki Morishita mor.y...@gmail.com wrote:

 Thanks.
 It looks like a bug. Can you create a ticket on JIRA?

 https://issues.apache.org/jira/browse/CASSANDRA

 On Thu, Mar 5, 2015 at 7:56 AM, Aby Kuruvilla
 aby.kuruvi...@envisagesystems.com wrote:
  Hi Yuki
 
  Thanks for the reply!
 
  Here is the log from Cassandra server for the stream failure
 
  INFO  [STREAM-INIT-/192.168.56.1:58578] 2015-03-04 09:20:23,816
  StreamResultFuture.java:109 - [Stream
  #98ba8730-c279-11e4-b8e9-55374d280508
  ID#0] Creating new streaming plan for Bulk Load
  INFO  [STREAM-INIT-/192.168.56.1:58578] 2015-03-04 09:20:23,816
  StreamResultFuture.java:116 - [Stream
  #98ba8730-c279-11e4-b8e9-55374d280508,
  ID#0] Received streaming plan for Bulk Load
  INFO  [STREAM-INIT-/192.168.56.1:58579] 2015-03-04 09:20:23,819
  StreamResultFuture.java:116 - [Stream
  #98ba8730-c279-11e4-b8e9-55374d280508,
  ID#0] Received streaming plan for Bulk Load
  INFO  [STREAM-IN-/127.0.0.1] 2015-03-04 09:20:23,822
  StreamResultFuture.java:166 - [Stream
  #98ba8730-c279-11e4-b8e9-55374d280508
  ID#0] Prepare completed. Receiving 1 files(617874 bytes), sending 0
  files(0
  bytes)
  WARN  [STREAM-IN-/127.0.0.1] 2015-03-04 09:20:23,823
  StreamSession.java:597
  - [Stream #98ba8730-c279-11e4-b8e9-55374d280508] Retrying for following
  error
  java.io.IOException: CF d6d35793-729c-3cab-bee0-84e971e48675 was dropped
  during streaming
  at
 
  org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:71)
  ~[main/:na]
  at
 
  org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:48)
  [main/:na]
  at
 
  org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:38)
  [main/:na]
  at
 
  org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55)
  [main/:na]
  at
 
  org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
  [main/:na]
  at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
  ERROR [STREAM-IN-/127.0.0.1] 2015-03-04 09:20:23,828
  StreamSession.java:477
  - [Stream #98ba8730-c279-11e4-b8e9-55374d280508] Streaming error
  occurred
  java.lang.IllegalArgumentException: Unknown type 0
  at
 
  org.apache.cassandra.streaming.messages.StreamMessage$Type.get(StreamMessage.java:89)
  ~[main/:na]
  at
 
  org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54)
  ~[main/:na]
  at
 
  org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
  ~[main/:na]
  at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
  INFO  [STREAM-IN-/127.0.0.1] 2015-03-04 09:20:23,829
  StreamResultFuture.java:180 - [Stream
  #98ba8730-c279-11e4-b8e9-55374d280508]
  Session with /127.0.0.1 is complete
  WARN  [STREAM-IN-/127.0.0.1] 2015-03-04 09:20:23,829
  StreamResultFuture.java:207 - [Stream
  #98ba8730-c279-11e4-b8e9-55374d280508]
  Stream failed
 
 
 
  On Wed, Mar 4, 2015 at 1:18 PM, Yuki Morishita mor.y...@gmail.com
  wrote:
 
  Do you have corresponding error in the other side of the stream
  (/192.168.56.11)?
 
 
  On Wed, Mar 4, 2015 at 9:11 AM, Aby Kuruvilla
  aby.kuruvi...@envisagesystems.com wrote:
   I am trying to use the CqlBulkOutputFormat in a Hadoop job to bulk
   load
   data
   into Cassandra.  Was not able to find any documentation of this new
   output
   format , but from looking through the code this uses CQLSSTableWriter
   to
   write SSTable files to disk , which are then streamed to Cassandra
   using
   SSTableLoader. On running the Hadoop job, I can see that the SSTable
   files
   do get generated but fails to stream the data out. I get the same
   exception
   when I try with Cassndra node on localhost as well as a remote
   Cassandra
   cluster. Also I get this exception on C* versions 2.1.1,  2.1.2 and
   2.1.3.
  
   Relevant portion of logs and stack trace
  
   09:20:23.207 [Thread-6] WARN  org.apache.cassandra.utils.CLibrary -
   JNA
   link
   failure, one or more native method will be unavailable.
   09:20:23.208 [Thread-6] DEBUG org.apache.cassandra.utils.CLibrary -
   JNA
   link
   failure details: Error looking up function 'posix_fadvise':
   dlsym(0x7fff6ab8a5e0, posix_fadvise): symbol not found
   09:20:23.504 [Thread-6] DEBUG o.apache.cassandra.io.util.FileUtils -
   Renaming
  
  
   /var/folders/bb/c4416mx95xsbb11jx5g5zq15ddhhh4/T/dev/participant-262ce044-0a2d-48f4-9baa-ad4d626e743a/dev-participant-tmp-ka-1-Filter.db
   to
  
  
   /var/folders/bb/c4416mx95xsbb11jx5g5zq15ddhhh4/T/dev/participant-262ce044-0a2d-48f4-9baa-ad4d626e743a/dev

Re: Streaming failures during bulkloading data using CqlBulkOutputFormat

2015-03-05 Thread Yuki Morishita
Thanks.
It looks like a bug. Can you create a ticket on JIRA?

https://issues.apache.org/jira/browse/CASSANDRA

On Thu, Mar 5, 2015 at 7:56 AM, Aby Kuruvilla
aby.kuruvi...@envisagesystems.com wrote:
 Hi Yuki

 Thanks for the reply!

 Here is the log from Cassandra server for the stream failure

 INFO  [STREAM-INIT-/192.168.56.1:58578] 2015-03-04 09:20:23,816
 StreamResultFuture.java:109 - [Stream #98ba8730-c279-11e4-b8e9-55374d280508
 ID#0] Creating new streaming plan for Bulk Load
 INFO  [STREAM-INIT-/192.168.56.1:58578] 2015-03-04 09:20:23,816
 StreamResultFuture.java:116 - [Stream #98ba8730-c279-11e4-b8e9-55374d280508,
 ID#0] Received streaming plan for Bulk Load
 INFO  [STREAM-INIT-/192.168.56.1:58579] 2015-03-04 09:20:23,819
 StreamResultFuture.java:116 - [Stream #98ba8730-c279-11e4-b8e9-55374d280508,
 ID#0] Received streaming plan for Bulk Load
 INFO  [STREAM-IN-/127.0.0.1] 2015-03-04 09:20:23,822
 StreamResultFuture.java:166 - [Stream #98ba8730-c279-11e4-b8e9-55374d280508
 ID#0] Prepare completed. Receiving 1 files(617874 bytes), sending 0 files(0
 bytes)
 WARN  [STREAM-IN-/127.0.0.1] 2015-03-04 09:20:23,823 StreamSession.java:597
 - [Stream #98ba8730-c279-11e4-b8e9-55374d280508] Retrying for following
 error
 java.io.IOException: CF d6d35793-729c-3cab-bee0-84e971e48675 was dropped
 during streaming
 at
 org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:71)
 ~[main/:na]
 at
 org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:48)
 [main/:na]
 at
 org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:38)
 [main/:na]
 at
 org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55)
 [main/:na]
 at
 org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
 [main/:na]
 at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
 ERROR [STREAM-IN-/127.0.0.1] 2015-03-04 09:20:23,828 StreamSession.java:477
 - [Stream #98ba8730-c279-11e4-b8e9-55374d280508] Streaming error occurred
 java.lang.IllegalArgumentException: Unknown type 0
 at
 org.apache.cassandra.streaming.messages.StreamMessage$Type.get(StreamMessage.java:89)
 ~[main/:na]
 at
 org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54)
 ~[main/:na]
 at
 org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
 ~[main/:na]
 at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
 INFO  [STREAM-IN-/127.0.0.1] 2015-03-04 09:20:23,829
 StreamResultFuture.java:180 - [Stream #98ba8730-c279-11e4-b8e9-55374d280508]
 Session with /127.0.0.1 is complete
 WARN  [STREAM-IN-/127.0.0.1] 2015-03-04 09:20:23,829
 StreamResultFuture.java:207 - [Stream #98ba8730-c279-11e4-b8e9-55374d280508]
 Stream failed



 On Wed, Mar 4, 2015 at 1:18 PM, Yuki Morishita mor.y...@gmail.com wrote:

 Do you have a corresponding error on the other side of the stream
 (/192.168.56.11)?


 On Wed, Mar 4, 2015 at 9:11 AM, Aby Kuruvilla
 aby.kuruvi...@envisagesystems.com wrote:
  I am trying to use the CqlBulkOutputFormat in a Hadoop job to bulk load
  data
  into Cassandra.  Was not able to find any documentation of this new
  output
  format , but from looking through the code this uses CQLSSTableWriter to
  write SSTable files to disk , which are then streamed to Cassandra using
  SSTableLoader. On running the Hadoop job, I can see that the SSTable
  files
  do get generated but fails to stream the data out. I get the same
  exception
  when I try with Cassndra node on localhost as well as a remote
  Cassandra
  cluster. Also I get this exception on C* versions 2.1.1,  2.1.2 and
  2.1.3.
 
  Relevant portion of logs and stack trace
 
  09:20:23.207 [Thread-6] WARN  org.apache.cassandra.utils.CLibrary - JNA
  link
  failure, one or more native method will be unavailable.
  09:20:23.208 [Thread-6] DEBUG org.apache.cassandra.utils.CLibrary - JNA
  link
  failure details: Error looking up function 'posix_fadvise':
  dlsym(0x7fff6ab8a5e0, posix_fadvise): symbol not found
  09:20:23.504 [Thread-6] DEBUG o.apache.cassandra.io.util.FileUtils -
  Renaming
 
  /var/folders/bb/c4416mx95xsbb11jx5g5zq15ddhhh4/T/dev/participant-262ce044-0a2d-48f4-9baa-ad4d626e743a/dev-participant-tmp-ka-1-Filter.db
  to
 
  /var/folders/bb/c4416mx95xsbb11jx5g5zq15ddhhh4/T/dev/participant-262ce044-0a2d-48f4-9baa-ad4d626e743a/dev-participant-ka-1-Filter.db
  09:20:23.505 [Thread-6] DEBUG o.apache.cassandra.io.util.FileUtils -
  Renaming
 
  /var/folders/bb/c4416mx95xsbb11jx5g5zq15ddhhh4/T/dev/participant-262ce044-0a2d-48f4-9baa-ad4d626e743a/dev-participant-tmp-ka-1-Digest.sha1
  to
 
  /var/folders/bb/c4416mx95xsbb11jx5g5zq15ddhhh4/T/dev/participant-262ce044-0a2d-48f4-9baa-ad4d626e743a/dev-participant-ka-1-Digest.sha1
  09:20:23.505 [Thread-6] DEBUG

Re: Streaming failures during bulkloading data using CqlBulkOutputFormat

2015-03-04 Thread Yuki Morishita
(String[] arg0) throws Exception {
    ...
    Job job = new Job(conf);
    ..
    job.setOutputFormatClass(CqlBulkOutputFormat.class);

    ConfigHelper.setOutputInitialAddress(job.getConfiguration(), "192.168.56.11");
    ConfigHelper.setOutputPartitioner(job.getConfiguration(), "Murmur3Partitioner");
    ConfigHelper.setOutputRpcPort(job.getConfiguration(), "9160");

    ConfigHelper.setOutputKeyspace(job.getConfiguration(), CASSANDRA_KEYSPACE_NAME);
    ConfigHelper.setOutputColumnFamily(
        job.getConfiguration(),
        CASSANDRA_KEYSPACE_NAME,
        CASSANDRA_TABLE_NAME
    );

    // Set the properties for CqlBulkOutputFormat
    MultipleOutputs.addNamedOutput(job,
        CASSANDRA_TABLE_NAME, CqlBulkOutputFormat.class, Object.class, List.class);

    CqlBulkOutputFormat.setColumnFamilySchema(job.getConfiguration(),
        CASSANDRA_TABLE_NAME, "CREATE TABLE dev.participant()");

    CqlBulkOutputFormat.setColumnFamilyInsertStatement(job.getConfiguration(),
        CASSANDRA_TABLE_NAME, "INSERT into dev.participant() values (?,?,?,?,?)");

    .
}

}

 Reducer Code

 public class ReducerToCassandra extends Reducer<Text, Text, Object, List<ByteBuffer>> {

    private MultipleOutputs multipleOutputs;

    @SuppressWarnings("unchecked")
    protected void setup(Context context) throws IOException, InterruptedException {
        multipleOutputs = new MultipleOutputs(context);
    }

    @Override
    public void reduce(Text id, Iterable<Text> pInfo, Context context) throws
            IOException, InterruptedException {

        List<ByteBuffer> bVariables = new ArrayList<ByteBuffer>();

        .
        multipleOutputs.write(CASSANDRA_TABLE1, null, bVariables);

 }






-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Nodetool clearsnapshot

2015-01-13 Thread Yuki Morishita
The snapshot taken during repair is automatically cleared if the repair succeeds.
Unfortunately, you have to delete it manually if the repair failed or stalled.
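Clearing them by hand is just, roughly (the keyspace name is a placeholder; the -t form applies if you know the snapshot's tag):

$ nodetool clearsnapshot my_keyspace            # clear all snapshots for the keyspace
$ nodetool clearsnapshot -t <tag> my_keyspace   # or clear just one named snapshot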

On Tue, Jan 13, 2015 at 8:30 AM, Batranut Bogdan batra...@yahoo.com wrote:
 OK Thanks,

 But I also read that repair will take a snapshot. Because I
 have a replication factor of 3 for my keyspace, I run nodetool clearsnapshot to
 keep disk space usage to a minimum. Will this impact my repair?


 On Tuesday, January 13, 2015 4:19 PM, Jan Kesten j.kes...@enercast.de
 wrote:


 Hi,

 I have read that snapshots are basically symlinks and they do not take up that
 much space.
 Why, if I run nodetool clearsnapshot, does it free a lot of space? I am seeing GBs
 freed...


 Both together make sense. Creating a snapshot just creates links for all
 files under the snapshot directory. This is very fast and takes no space.
 But those links are hard links, not symbolic ones.

 After a while your running cluster will compact some of its sstables,
 writing them to a new one and deleting the old ones. For example, if you had
 SSTable1..4 and a snapshot with links to those four, then after compaction you
 will have one active SSTable5, which is newly written and consumes space. The
 snapshot-linked ones are still there, still consuming their space. Only when
 this snapshot is cleared do you get your disk space back.

 HTH,
 Jan







-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: [Import csv to Cassandra] Taking too much time

2014-12-04 Thread Yuki Morishita
Here's a blog post about writing SSTables from CSV and using
SSTableLoader to load them.

http://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated
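The flow in that post boils down to two steps; very roughly (the CSV-to-SSTable step is a small Java program you write yourself around CQLSSTableWriter, shown here only as a hypothetical helper called csv2sstable):

# 1. generate SSTables from the CSV offline (hypothetical helper built on CQLSSTableWriter)
$ java -cp myapp.jar csv2sstable data.csv ./output/my_ks/my_table
# 2. stream the generated SSTables into the cluster
$ bin/sstableloader -d <node_address> ./output/my_ks/my_table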

On Thu, Dec 4, 2014 at 5:57 AM, 严超 yanchao...@gmail.com wrote:
 Thank you very much for your advice.
 Can you give me more advice on using SSTableLoader to import CSV?
 What is the best practice for using SSTableLoader to import CSV into Cassandra?

 Best Regards!
 Chao Yan
 --
 My twitter:Andy Yan @yanchao727
 My Weibo:http://weibo.com/herewearenow
 --

 2014-12-04 18:58 GMT+08:00 Akshay Ballarpure akshay.ballarp...@tcs.com:

 Hello Chao Yan,
 CSV data import using the COPY command in Cassandra is always painful for
 large files (say, over 1 GB).
 The CQL tool is not designed for performing such heavy operations; instead, try
 using SSTableLoader to import.


 Best Regards
 Akshay



 From:严超 yanchao...@gmail.com
 To:user@cassandra.apache.org
 Date:12/04/2014 02:11 PM
 Subject:[Import csv to Cassandra] Taking too much time
 



 Hi, Everyone:
 I'm importing a CSV file into Cassandra, and it always gets the
 error "Request did not complete within rpc_timeout", and then I have to
 resume my cql COPY command again. The CSV file is 2.2 GB, so it is
 taking a long time.
 How can I speed up CSV file importing? Is there another way
 besides the cql COPY command?
 Thank you for any help .

 Best Regards!
 Chao Yan
 --
 My twitter:Andy Yan @yanchao727
 My Weibo:http://weibo.com/herewearenow
 --






-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: nodetool repair exception

2014-12-03 Thread Yuki Morishita
As the exception indicates, nodetool just lost communication with the
Cassandra node and cannot print progress any further.
Check your system.log on the node, and see if your repair was
completed. If there is no error, then it should be fine.
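For example, something along these lines on the node (the log location depends on your install):

$ grep -i repair /var/log/cassandra/system.log | tail -n 20
# look for "session completed" lines (or any ERROR) for the repair sessions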

On Wed, Dec 3, 2014 at 5:08 AM, Rafał Furmański rfurman...@opera.com wrote:
 Hi All!

 We have an 8-node cluster in 2 DCs (4 per DC, RF=3) running Cassandra 2.1.2 on
 Linux Debian Wheezy.
 I executed “nodetool repair” on one of the nodes, and this command returned the
 following error:

 Exception occurred during clean-up. 
 java.lang.reflect.UndeclaredThrowableException
 error: JMX connection closed. You should check server log for repair status 
 of keyspace sync(Subsequent keyspaces are not going to be repaired).
 -- StackTrace --
 java.io.IOException: JMX connection closed. You should check server log for 
 repair status of keyspace sync(Subsequent keyspaces are not going to be 
 repaired).
at 
 org.apache.cassandra.tools.RepairRunner.handleNotification(NodeProbe.java:1351)
at 
 javax.management.NotificationBroadcasterSupport.handleNotification(NotificationBroadcasterSupport.java:274)
at 
 javax.management.NotificationBroadcasterSupport$SendNotifJob.run(NotificationBroadcasterSupport.java:339)
at 
 javax.management.NotificationBroadcasterSupport$1.execute(NotificationBroadcasterSupport.java:324)
at 
 javax.management.NotificationBroadcasterSupport.sendNotification(NotificationBroadcasterSupport.java:247)
at 
 javax.management.remote.rmi.RMIConnector.sendNotification(RMIConnector.java:441)
at 
 javax.management.remote.rmi.RMIConnector.access$1100(RMIConnector.java:121)
at 
 javax.management.remote.rmi.RMIConnector$RMIClientCommunicatorAdmin.gotIOException(RMIConnector.java:1505)
at 
 javax.management.remote.rmi.RMIConnector$RMINotifClient.fetchNotifs(RMIConnector.java:1350)
at 
 com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.fetchNotifs(ClientNotifForwarder.java:587)
at 
 com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:470)
at 
 com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:451)
at 
 com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:107)

 This error was followed by lots of “Lost Notification” messages.
 The node became unusable and I had to restart it. Is this an issue?

 Rafal



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Experiences with repairs using vnodes

2014-10-24 Thread Yuki Morishita
If anyone has used the incremental repair feature in a 2.1 environment with
vnodes, I'd like to hear how it is doing.
Validation is the main time-consuming part of repair, and it should be
much better after you switch to incremental.

I did some experiments regarding CASSANDRA-5220, like repairing
some ranges together, but it just hammers the node hard with little gain.
So if incremental repair is working as expected, I'll be happy to mark #5220 as won't-fix.
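For anyone trying it: on 2.1, incremental repair is just a flag on nodetool repair, roughly like this (check nodetool help repair on your version for the exact flags):

$ nodetool repair -par -inc my_keyspace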

On Fri, Oct 24, 2014 at 6:30 PM, Robert Coli rc...@eventbrite.com wrote:
 On Fri, Oct 24, 2014 at 3:54 PM, Jack Krupansky j...@basetechnology.com
 wrote:

 I was wondering if anybody had any specific experiences with repair and
 bootstrapping new nodes after switching to vnodes that they could share here
 (or email me privately.) I mean, how was the performance of repair and
 bootstrap impacted, cluster reliability, cluster load, ease of maintaining
 the cluster, more confidence in maintaining the cluster, or... whatever else
 may have been impacted. IOW, what actual benefit/change did you experience
 firsthand. Thanks!


 While I don't personally yet use vnodes, my understanding is...

 Repair gets much slower, bootstrapping gets faster and better distributed.

 https://issues.apache.org/jira/browse/CASSANDRA-5220

 Is a good starting point for the web of related JIRA tickets.

 =Rob
 http://twitter.com/rcolidba



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: sstableloader and ttls

2014-08-18 Thread Yuki Morishita
sstableloader just loads the given SSTables as they are.
TTLed columns are sent and will eventually be compacted away at the destination node.

On Sat, Aug 16, 2014 at 4:28 AM, Erik Forsberg forsb...@opera.com wrote:
 Hi!

 If I use sstableloader to load data to a cluster, and the source
 sstables contain some columns where the TTL has expired, i.e. the
 sstable has not yet been compacted - will those entries be properly
 removed on the destination side?

 Thanks,
 \EF



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: nodetool repair -snapshot option?

2014-06-30 Thread Yuki Morishita
Repair uses the snapshot option by default since 2.0.2 (see NEWS.txt).
So you don't have to specify it in your version.

Do you have a stack trace from when it OOMed?

On Mon, Jun 30, 2014 at 4:54 PM, Phil Burress philburress...@gmail.com wrote:
 We are running into an issue with nodetool repair. One or more of our nodes
 will die with OOM errors when running nodetool repair on a single node. Was
 reading this http://www.datastax.com/dev/blog/advanced-repair-techniques and
 it mentioned using the -snapshot option, however, that doesn't appear to be
 an option in the version we have. We are running 2.0.7 with vnodes. Any
 insight into what might be causing these OOMs and/or what version this
 -snapshot option is available in?

 Thanks!

 Phil



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: difference between AntiEntropySessions and AntiEntropyStage ?

2014-06-09 Thread Yuki Morishita
AntiEntropySessions is where all repair sessions are executed. You can
use this to count how many repair sessions are ongoing.
AntiEntropyStage is used to handle repair messages.

Usually you use AntiEntropySessions to check whether a repair is running.
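For example (output trimmed and purely illustrative; a non-zero Active or Pending count on AntiEntropySessions means a repair session is in flight):

$ nodetool tpstats | grep -i AntiEntropy
AntiEntropyStage         0         0       1234         0                 0
AntiEntropySessions      1         0          7         0                 0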

On Mon, Jun 9, 2014 at 2:59 AM, DE VITO Dominique
dominique.dev...@thalesgroup.com wrote:
 Hi,



 Nodetool tpstats gives 2 lines for anti-entropy: one for AntiEntropySessions
 and one for AntiEntropyStage.



 What is the difference ?



 a)  Is “AntiEntropySessions” for counting repairs on a node acting as a
 primary node (the target node for repair) ?

 And is “AntiEntropyStage” for counting repair tasks on a node participating
 as a secondary node (not the target node for repair) ?



 b)  Or is it something different ? And then, what is the meaning of
 these two counter families ?



 Thanks.



 Regards,

 Dominique







-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Why repair -pr doesn't work when RF=0 for 1 DC

2014-02-27 Thread Yuki Morishita
Yes, it is expected behavior since
1.2.5 (https://issues.apache.org/jira/browse/CASSANDRA-5424).
Since you set foobar not to replicate to the stats DC, the primary range of
the foobar keyspace for nodes in stats is empty.


On Thu, Feb 27, 2014 at 10:16 AM, Fabrice Facorat
fabrice.faco...@gmail.com wrote:
 Hi,

 we have a cluster with 3 DCs, and for one DC (stats), RF=0 for a
 keyspace using NetworkTopologyStrategy.

 cqlsh> SELECT * FROM system.schema_keyspaces WHERE keyspace_name='foobar';

  keyspace_name | durable_writes | strategy_class                                        | strategy_options
 ---------------+----------------+-------------------------------------------------------+---------------------
  foobar        |           True | org.apache.cassandra.locator.NetworkTopologyStrategy  | {s1:3,stats:0,b1:3}


 When doing a nodetool repair -pr foobar on a node in DC stats, we
 notice that the repair doesn't do anything: it just skips the
 keyspace.

 Is this normal behavior? I guess that some keys belonging to DC
 stats's primary range tokens should have been repaired in the two
 other DCs? Am I wrong?

 We are using cassandra 1.2.13, with 256 vnodes and Murmur3Partitioner


 --
 Close the World, Open the Net
 http://www.linux-wizard.net



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Why repair -pr doesn't work when RF=0 for 1 DC

2014-02-27 Thread Yuki Morishita
Yes.

On Thu, Feb 27, 2014 at 12:49 PM, Fabrice Facorat
fabrice.faco...@gmail.com wrote:
 So if I understand correctly from CASSANDRA-5424 and CASSANDRA-5608, since the
 stats DC doesn't own data, repair -pr will not repair the data. Only a
 full repair will do it.

 Once we add an RF to the stats DC, repair -pr will work again. Is that correct?

 2014-02-27 19:15 GMT+01:00 Yuki Morishita mor.y...@gmail.com:
 Yes, it is expected behavior since
 1.2.5(https://issues.apache.org/jira/browse/CASSANDRA-5424).
 Since you set foobar not to replicate to stats dc, primary range of
 foobar keyspace for nodes in stats is empty.


 On Thu, Feb 27, 2014 at 10:16 AM, Fabrice Facorat
 fabrice.faco...@gmail.com wrote:
 Hi,

 we have a cluster with 3 DC, and for one DC ( stats ), RF=0 for a
 keyspace using NetworkTopologyStrategy.

 cqlsh> SELECT * FROM system.schema_keyspaces WHERE keyspace_name='foobar';

  keyspace_name | durable_writes | strategy_class                                        | strategy_options
 ---------------+----------------+-------------------------------------------------------+---------------------
  foobar        |           True | org.apache.cassandra.locator.NetworkTopologyStrategy  | {s1:3,stats:0,b1:3}


 When doing a nodetool repair -pr foobar on a node in DC stats, we
 notice that the repair doesn't do anything : it just skips the
 keyspace.

 Is this normal behavior ? I guess that some keys belonging to DC
 stats's primary range token should have been repaired in the two
 others DC ? Am I wrong ?

 We are using cassandra 1.2.13, with 256 vnodes and Murmur3Partitioner


 --
 Close the World, Open the Net
 http://www.linux-wizard.net



 --
 Yuki Morishita
  t:yukim (http://twitter.com/yukim)



 --
 Close the World, Open the Net
 http://www.linux-wizard.net



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Data tombstoned during bulk loading 1.2.10 - 2.0.3

2014-02-03 Thread Yuki Morishita
if you are using  2.0.4, then you are hitting
https://issues.apache.org/jira/browse/CASSANDRA-6527


On Mon, Feb 3, 2014 at 2:51 AM, olek.stas...@gmail.com
olek.stas...@gmail.com wrote:
 Hi All,
 We've faced a very similar effect after upgrading from 1.1.7 to 2.0 (via
 1.2.10). Probably after upgradesstables (but it's only a guess,
 because we noticed the problem a few weeks later), some rows became
 tombstoned. They just disappeared from query results. After
 investigation I noticed that they are still reachable via sstable2json.
 Example output for a non-existent row:

 {key: 6e6e37716c6d665f6f61695f6463,metadata: {deletionInfo:
 {markedForDeleteAt:2201170739199,localDeletionTime:0}},columns:
 [[DATA,3c6f61695f64633a64(...),1357677928108]]}
 ]

 If I understand correctly, the row is marked as deleted with a timestamp in
 the far future, but it's still on disk. Also, localDeletionTime is
 set to 0, which may mean that it's a kind of internal bug, not the effect
 of a client error. So my question is: is it true that upgradesstables
 may do something like that? How can I find the reasons for such strange
 Cassandra behaviour? Is there any way to recover such strangely
 marked rows?
 This problem touches about 500K rows out of 14M in our database, so
 the percentage is quite big.
 best regards
 Aleksander

 2013-12-12 Robert Coli rc...@eventbrite.com:
 On Wed, Dec 11, 2013 at 6:27 AM, Mathijs Vogelzang math...@apptornado.com
 wrote:

 When I use sstable2json on the sstable on the destination cluster, it has
 metadata: {deletionInfo:
 {markedForDeleteAt:1796952039620607,localDeletionTime:0}}, whereas
 it doesn't have that in the source sstable.
 (Yes, this is a timestamp far into the future. All our hosts are
 properly synced through ntp).


 This seems like a bug in sstableloader, I would report it on JIRA.


 Naturally, copying the data again doesn't work to fix it, as the
 tombstone is far in the future. Apart from not having this happen at
 all, how can it be fixed?


 Briefly, you'll want to purge that tombstone and then reload the data with a
 reasonable timestamp.

 Dealing with rows with data (and tombstones) in the far future is described
 in detail here :

 http://thelastpickle.com/blog/2011/12/15/Anatomy-of-a-Cassandra-Partition.html

 =Rob




-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Frustration with repair process in 1.1.11

2013-11-02 Thread Yuki Morishita
Hi, Oleg,

As you have already encountered, many people have been complaining about repair,
so we have been working actively to mitigate the problem.

 I run repair (with -pr) on DC2. First time I run it it gets *stuck* (i.e. 
 frozen) within the first 30 seconds, with no error or any sort of message

You said you are on 1.1.11, so
https://issues.apache.org/jira/browse/CASSANDRA-5393 comes to my
mind first.
The issue was that some messages, including repair ones, got lost
silently, so we added a retry mechanism.

 I then run it again -- and it completes in seconds on each node, with about 
 50 gigs of data on each.

Cassandra calculates replica differences using a Merkle tree built from
each row's hash value.
Cassandra used to detect many differences if you do a lot of deletes or
have a lot of TTLed columns.
In 1.2, we fixed this issue (e.g. in
https://issues.apache.org/jira/browse/CASSANDRA-4905), so if this is
the case, upgrading would save space.

 Is there any improvement and clarity in 1.2 ? How about 2.0 ?

Yes. The reason that repair hangs prior to 2.0 is either 1) Merkle tree
creation failure (validation failure), or 2) streaming failure, without
the failing node reporting back to the coordinator.
To fix this, we redesigned the message flow around the repair process in
https://issues.apache.org/jira/browse/CASSANDRA-5426.
At the same time, we improved data streaming among nodes by, again,
redesigning the streaming
protocol (http://www.datastax.com/dev/blog/streaming-in-cassandra-2-0).

With that said, Cassandra 2.0.x has improved much compared to 1.1.11.

Hope this helps,

Yuki

On Fri, Nov 1, 2013 at 2:15 PM, Oleg Dulin oleg.du...@gmail.com wrote:
 First I need to vent.


 rant

 One of my cassandra cluster is a dual data center setup, with DC1 acting as
 primary, and DC2 acting as a hot backup.


 Well, guess what ? I am pretty sure that it falls behind on replication. So
 I am told I need to run repair.


 I run repair (with -pr) on DC2. First time I run it it gets *stuck* (i.e.
 frozen) within the first 30 seconds, with no error or any sort of message. I
 then run it again -- and it completes in seconds on each node, with about 50
 gigs of data on each.


 That seems suspicious, so I do some research.


 I am told on IRC that running repair -pr will only do the repair on 100
 tokens (the offset from DC1 to DC2)… Seriously ???


 Repair process is, indeed, a joke:
 https://issues.apache.org/jira/browse/CASSANDRA-5396 . Repair is the worst
 thing you can do to your cluster, it consumes enormous resources, and can
 leave your cluster in an inconsistent state. Oh and by the way you must run
 it every week…. Whoever invented that process must not live in a real world,
 with real applications.

 /rant


 No… lets have a constructive conversation.


 How do I know, with certainty, that my DC2 cluster is up to date on
 replication ? I have a few options:


 1) I set read repair chance to 100% on critical column families and I write
 a tool to scan every CF, every column of every row. This strikes me as very
 silly.

 Q1: Do I need to scan every column or is looking at one column enough to
 trigger a read repair ?


 2) Can someone explain to me how the repair works such that I don't totally
 trash my cluster or spill into work week ?


 Is there any improvement and clarity in 1.2 ? How about 2.0 ?




 --

 Regards,

 Oleg Dulin

 http://www.olegdulin.com



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: [Cas 2.0.2] Looping Repair since activating PasswordAuthenticator

2013-11-02 Thread Yuki Morishita



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: AssertionError: sstableloader

2013-09-19 Thread Yuki Morishita
Sounds like a bug.
Would you mind filing JIRA at https://issues.apache.org/jira/browse/CASSANDRA?

Thanks,

On Thu, Sep 19, 2013 at 2:12 PM, Vivek Mishra mishra.v...@gmail.com wrote:
 Hi,
 I am trying to use sstableloader to load some external data and am getting
 the error below:
 Established connection to initial hosts
 Opening sstables and calculating sections to stream
 Streaming relevant part of
 /home/impadmin/source/Examples/data/Demo/Users/Demo-Users-ja-1-Data.db to
 [/127.0.0.1]
 progress: [/127.0.0.1 1/1 (100%)] [total: 100% - 0MB/s (avg:
 0MB/s)]Exception in thread STREAM-OUT-/127.0.0.1 java.lang.AssertionError:
 Reference counter -1 for
 /home/impadmin/source/Examples/data/Demo/Users/Demo-Users-ja-1-Data.db
 at
 org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:1017)
 at org.apache.cassandra.streaming.StreamWriter.write(StreamWriter.java:120)
 at
 org.apache.cassandra.streaming.messages.FileMessage$1.serialize(FileMessage.java:73)
 at
 org.apache.cassandra.streaming.messages.FileMessage$1.serialize(FileMessage.java:45)
 at
 org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
 at
 org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:384)
 at
 org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:357)
 at java.lang.Thread.run(Thread.java:722)


 Any pointers?

 -Vivek



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Automatic tombstone compaction

2013-08-21 Thread Yuki Morishita
Tamas,

If there are rows with the same key in other SSTables, those rows won't
be deleted.
Tombstone compaction makes a guess about whether it can actually drop tombstones safely by
checking the overlap with other SSTables.
Do you have many rows in your large SSTable?
If you don't, then the chance of a tombstone compaction running may be low.


On Wed, Aug 21, 2013 at 9:33 AM,  tamas.fold...@thomsonreuters.com wrote:
 Hi,



 I ran upgradesstables as part of the Cassandra upgrade, before issuing the
 CQL alter command.

 According to the docs, SizeTieredCompactionStrategy is fine (that is what I
 used, and plan to continue using), and automatic tombstone compaction is
 available for it:

 http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/operations/ops_about_config_compact_c.html

 I just had to include the ‘class’ in the alter statement, otherwise it would
 not accept my command.

 Is that not right?



 Thanks,

 Tamas



 From: Haithem Jarraya [mailto:a-hjarr...@expedia.com]
 Sent: 21. august 2013 16:24
 To: user@cassandra.apache.org
 Subject: Re: Automatic tombstone compaction



 Hi,



 do you mean LeveledCompactionStrategy?



 Also you will need to run nodetool upgradesstables  [keyspace][cf_name]
 after changing the compaction strategy.



 Thanks,



 Haithem Jarraya

 On 21 Aug 2013, at 15:15, tamas.fold...@thomsonreuters.com wrote:



 Hi,



 After upgrading from 1.0 to 1.2, I wanted to make use of the automatic
 tombstone compaction feature, so using CQL3 I issued:



 ALTER TABLE versions WITH compaction = {'class' :
 'SizeTieredCompactionStrategy', 'min_threshold' : 4, 'max_threshold' : 32,
 'tombstone_compaction_interval' : 1, 'tombstone_threshold' : '0.1'};



 But I still see no trace that would suggest this works – we had 60G of data
 with TTL=1week pushed a while ago to the test cluster, and the majority of it
 should be expired and compacted away by now. Not sure if it is relevant, but
 this old data is in one ~60G file + I have a few smaller files with the latest
 data in them.

 Looking at JMX: DroppableTombstoneRatio = 0.892076544, which seems to back
 my theory.

 Am I doing something wrong, or am I expecting the wrong thing?



 Thanks,

 Tamas







-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Compression ratio

2013-07-12 Thread Yuki Morishita
it's compressed/original.

https://github.com/apache/cassandra/blob/cassandra-1.1.11/src/java/org/apache/cassandra/io/sstable/SSTableMetadata.java#L124
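So, for example, if 100 MB of raw data ends up as 40 MB on disk after compression, the reported ratio is 40/100 = 0.4 (smaller means better compression).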

On Fri, Jul 12, 2013 at 10:02 AM, cem cayiro...@gmail.com wrote:
 Hi All,

 Can anyone explain the compression ratio?

 Is it the compressed data / original or original/ compressed ? Or
 something else.

 thanks a lot.

 Best Regards,
 Cem



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: exception causes streaming to hang forever

2013-05-24 Thread Yuki Morishita
Hmm, I can only say it may be caused by a corrupt SSTable...
Streams hanging on unexpected errors were fixed in 1.2.5
(https://issues.apache.org/jira/browse/CASSANDRA-5229).

On Fri, May 24, 2013 at 6:56 AM, Hiller, Dean dean.hil...@nrel.gov wrote:
 The exception on that node was just this

 ERROR [Thread-6056] 2013-05-22 14:47:59,416 CassandraDaemon.java (line
 132) Exception in thread Thread[Thread-6056,5,main]
 java.lang.IndexOutOfBoundsException
 at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:75)
 at
 org.apache.cassandra.streaming.compress.CompressedInputStream$Reader.runMay
 Throw(CompressedInputStream.java:151)
 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at java.lang.Thread.run(Thread.java:662)




 On 5/23/13 9:51 AM, Yuki Morishita mor.y...@gmail.com wrote:

What kind of error does the other end of streaming(/10.10.42.36) say?

On Wed, May 22, 2013 at 5:19 PM, Hiller, Dean dean.hil...@nrel.gov
wrote:
 We had 3 nodes roll on good and the next 2, we see a remote node with
this exception every time we start over and bootstrap the node

 ERROR [Streaming to /10.10.42.36:2] 2013-05-22 14:47:59,404
CassandraDaemon.java (line 132) Exception in thread Thread[Streaming to
/10.10.42.36:2,5,main]
 java.lang.RuntimeException: java.io.IOException: Input/output error
 at
com.google.common.base.Throwables.propagate(Throwables.java:160)
 at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
 at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor
.java:895)
 at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.jav
a:918)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.IOException: Input/output error
 at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
 at
sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:405)
 at
sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:506)
 at
org.apache.cassandra.streaming.compress.CompressedFileStreamTask.stream(C
ompressedFileStreamTask.java:90)
 at
org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.
java:91)
 at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 ... 3 more

 Are there any ideas what this is?  Google doesn't really show any useful
advice on this, and our node has not joined the ring yet, so I don't think
we can run a repair just yet to avoid it and try syncing via another
means.  It seems that on a streaming failure, it never recovers from this.
Any ideas?

 We are on cassandra 1.2.2

 Thanks,
 Dean




--
Yuki Morishita
 t:yukim (http://twitter.com/yukim)




-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Cassandra 1.2 TTL histogram problem

2013-05-23 Thread Yuki Morishita
 Are you sure that it is a good idea to estimate remainingKeys like that?

Since we don't want to scan every row to check overlap and cause heavy
IO automatically, the method can only do a best-effort type of
calculation.
In your case, try running a user defined compaction on that sstable
file. It goes through every row and removes tombstones when droppable.
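If it helps, a user defined compaction can be triggered over JMX on the CompactionManager MBean; here is a rough sketch with jmxterm (the operation's exact signature varies between versions, so check it with 'info' first; the keyspace and file names below are placeholders):

$ java -jar jmxterm-1.0-alpha-4-uber.jar
$> open localhost:7199
$> bean org.apache.cassandra.db:type=CompactionManager
$> run forceUserDefinedCompaction myks myks-mycf-ic-1234-Data.db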


On Wed, May 22, 2013 at 11:48 AM, cem cayiro...@gmail.com wrote:
 Thanks for the answer.

 It means that if we use RandomPartitioner it will be very difficult to find an
 sstable without any overlap.

 Let me give you an example from my test.

 I have ~50 sstables in total and an sstable with a droppable ratio of 0.9. I use
 a GUID for each key and only insert (no update/delete), so I don't expect a key in
 different sstables.

 I put extra logging in AbstractCompactionStrategy to see
 overlaps.size(), keys and remainingKeys:

 overlaps.size() is around 30, the number of keys for that sstable is around 5M,
 and remainingKeys is always 0.

 Are you sure that it is a good idea to estimate remainingKeys like that?

 Best Regards,
 Cem



 On Wed, May 22, 2013 at 5:58 PM, Yuki Morishita mor.y...@gmail.com wrote:

  Can method calculate non-overlapping keys as overlapping?

 Yes.
 And randomized keys don't matter here since sstables are sorted by the
 token calculated from the key by your partitioner, and the method uses the
 sstable's min/max token to estimate overlap.

 On Tue, May 21, 2013 at 4:43 PM, cem cayiro...@gmail.com wrote:
  Thank you very much for the swift answer.
 
   I have one more question about the second part. Can the method count
   non-overlapping keys as overlapping? I mean, it uses the max and min tokens
   and
   column count. They can be very close to each other if random keys are
   used.
 
  In my use case I generate a GUID for each key and send a single write
  request.
 
  Cem
 
  On Tue, May 21, 2013 at 11:13 PM, Yuki Morishita mor.y...@gmail.com
  wrote:
 
   Why does Cassandra single table compaction skips the keys that are in
   the other sstables?
 
  because we don't want to resurrect deleted columns. Say, sstable A has
  the column with timestamp 1, and sstable B has the same column which
  deleted at timestamp 2. Then if we purge that column only from sstable
  B, we would see the column with timestamp 1 again.
 
   I also dont understand why we have this line in
   worthDroppingTombstones
   method
 
  What the method is trying to do is to guess how many columns that
  are not in the rows that don't overlap, without actually going through
  every rows in the sstable. We have statistics like column count
  histogram, min and max row token for every sstables, we use those in
  the method to estimate how many columns the two sstables overlap.
  You may have remainingColumnsRatio of 0 when the two sstables overlap
  almost entirely.
 
 
  On Tue, May 21, 2013 at 3:43 PM, cem cayiro...@gmail.com wrote:
   Hi all,
  
   I have a question about ticket
   https://issues.apache.org/jira/browse/CASSANDRA-3442
  
   Why does Cassandra single table compaction skips the keys that are in
   the
   other sstables? Please correct if I am wrong.
  
   I also dont understand why we have this line in
   worthDroppingTombstones
   method:
  
   double remainingColumnsRatio = ((double) columns) /
   (sstable.getEstimatedColumnCount().count() *
   sstable.getEstimatedColumnCount().mean());
  
   remainingColumnsRatio  is always 0 in my case and the droppableRatio
   is
   0.9. Cassandra skips all sstables which are already expired.
  
   This line was introduced by
   https://issues.apache.org/jira/browse/CASSANDRA-4022.
  
   Best Regards,
   Cem
 
 
 
  --
  Yuki Morishita
   t:yukim (http://twitter.com/yukim)
 
 



 --
 Yuki Morishita
  t:yukim (http://twitter.com/yukim)





-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: exception causes streaming to hang forever

2013-05-23 Thread Yuki Morishita
What kind of error does the other end of streaming (/10.10.42.36) say?

On Wed, May 22, 2013 at 5:19 PM, Hiller, Dean dean.hil...@nrel.gov wrote:
 We had 3 nodes roll on good and the next 2, we see a remote node with this 
 exception every time we start over and bootstrap the node

 ERROR [Streaming to /10.10.42.36:2] 2013-05-22 14:47:59,404 
 CassandraDaemon.java (line 132) Exception in thread Thread[Streaming to 
 /10.10.42.36:2,5,main]
 java.lang.RuntimeException: java.io.IOException: Input/output error
 at com.google.common.base.Throwables.propagate(Throwables.java:160)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.IOException: Input/output error
 at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
 at 
 sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:405)
 at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:506)
 at 
 org.apache.cassandra.streaming.compress.CompressedFileStreamTask.stream(CompressedFileStreamTask.java:90)
 at 
 org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 ... 3 more

 Are there any ideas what this is?  Google doesn't really show any useful advice 
 on this and our node has not joined the ring yet, so I don't think we can run 
 a repair just yet to avoid it and try syncing via another means.  It seems 
 on a streaming failure, it never recovers from this.  Any ideas?

 We are on cassandra 1.2.2

 Thanks,
 Dean




-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Cassandra 1.2 TTL histogram problem

2013-05-22 Thread Yuki Morishita
 Can method calculate non-overlapping keys as overlapping?

Yes.
And randomized keys don't matter here since sstables are sorted by the
token calculated from the key by your partitioner, and the method uses the
sstable's min/max token to estimate overlap.

On Tue, May 21, 2013 at 4:43 PM, cem cayiro...@gmail.com wrote:
 Thank you very much for the swift answer.

 I have one more question about the second part. Can method calculate
 non-overlapping keys as overlapping? I mean it uses max and min tokens and
 column count. They can be very close to each other if random keys are used.

 In my use case I generate a GUID for each key and send a single write
 request.

 Cem

 On Tue, May 21, 2013 at 11:13 PM, Yuki Morishita mor.y...@gmail.com wrote:

  Why does Cassandra single table compaction skips the keys that are in
  the other sstables?

 because we don't want to resurrect deleted columns. Say, sstable A has
 the column with timestamp 1, and sstable B has the same column which
 deleted at timestamp 2. Then if we purge that column only from sstable
 B, we would see the column with timestamp 1 again.

  I also dont understand why we have this line in worthDroppingTombstones
  method

 What the method is trying to do is to guess how many columns that
 are not in the rows that don't overlap, without actually going through
 every rows in the sstable. We have statistics like column count
 histogram, min and max row token for every sstables, we use those in
 the method to estimate how many columns the two sstables overlap.
 You may have remainingColumnsRatio of 0 when the two sstables overlap
 almost entirely.


 On Tue, May 21, 2013 at 3:43 PM, cem cayiro...@gmail.com wrote:
  Hi all,
 
  I have a question about ticket
  https://issues.apache.org/jira/browse/CASSANDRA-3442
 
  Why does Cassandra single table compaction skips the keys that are in
  the
  other sstables? Please correct if I am wrong.
 
  I also dont understand why we have this line in worthDroppingTombstones
  method:
 
  double remainingColumnsRatio = ((double) columns) /
  (sstable.getEstimatedColumnCount().count() *
  sstable.getEstimatedColumnCount().mean());
 
  remainingColumnsRatio  is always 0 in my case and the droppableRatio  is
  0.9. Cassandra skips all sstables which are already expired.
 
  This line was introduced by
  https://issues.apache.org/jira/browse/CASSANDRA-4022.
 
  Best Regards,
  Cem



 --
 Yuki Morishita
  t:yukim (http://twitter.com/yukim)





-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Cassandra 1.2 TTL histogram problem

2013-05-21 Thread Yuki Morishita
 Why does Cassandra single table compaction skip the keys that are in the 
 other sstables?

because we don't want to resurrect deleted columns. Say, sstable A has
the column with timestamp 1, and sstable B has the same column which was
deleted at timestamp 2. Then if we purge that column only from sstable
B, we would see the column with timestamp 1 again.

 I also don't understand why we have this line in the worthDroppingTombstones method

What the method is trying to do is guess how many columns are in the rows
that don't overlap with other sstables, without actually going through
every row in the sstable. We have statistics such as the column count
histogram and the min and max row token for every sstable; we use those in
the method to estimate how many columns the two sstables overlap.
You may get a remainingColumnsRatio of 0 when the two sstables overlap
almost entirely.
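
As a rough illustration of how the numbers combine (my hedged reading of the
1.2-era code, so treat the exact comparison as an assumption): if the estimate
says 1M of an sstable's 5M columns sit in rows that don't overlap other
sstables, remainingColumnsRatio is 0.2; multiplied by a droppableRatio of 0.9
that gives an estimated droppable fraction of about 0.18, which is then checked
against the tombstone_threshold compaction option (0.2 by default), so the
sstable would not be picked. With remainingColumnsRatio estimated as 0, the
product is 0 and the sstable is always skipped, which matches what you see.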


On Tue, May 21, 2013 at 3:43 PM, cem cayiro...@gmail.com wrote:
 Hi all,

 I have a question about ticket
 https://issues.apache.org/jira/browse/CASSANDRA-3442

 Why does Cassandra single table compaction skip the keys that are in the
 other sstables? Please correct me if I am wrong.

 I also don't understand why we have this line in the worthDroppingTombstones
 method:

 double remainingColumnsRatio = ((double) columns) /
 (sstable.getEstimatedColumnCount().count() *
 sstable.getEstimatedColumnCount().mean());

 remainingColumnsRatio  is always 0 in my case and the droppableRatio  is
 0.9. Cassandra skips all sstables which are already expired.

 This line was introduced by
 https://issues.apache.org/jira/browse/CASSANDRA-4022.

 Best Regards,
 Cem



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: index_interval file size is the same after modifying 128 to 512?

2013-03-22 Thread Yuki Morishita
The Index.db file always contains *all* positions of the keys in the data file.
index_interval is the rate at which key positions from the index file are stored
in memory, so that C* can begin scanning the index file from the closest position.
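
As a sketch (assuming a 1.x-style cassandra.yaml, where index_interval is still a
global setting rather than a per-table one):

  # cassandra.yaml: keep every 512th index entry in memory; the on-disk
  # Index.db file keeps every entry regardless of this value
  index_interval: 512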



On Friday, March 22, 2013 at 11:17 AM, Hiller, Dean wrote:

 I was just curious. Our RAM usage has significantly reduced but the *Index.db files 
 are the same size as before.
  
 Any ideas why this would be the case?
  
 Basically, why is our disk size not reduced since RAM is way lower? We are 
 running strong now with 512 index_interval for the past 2-3 days and RAM never 
 looked better. We were pushing 10G before and now we are at 2G, slowly 
 increasing to 8G before gc compacts the long lived stuff, which goes back down 
 to 2G again…..very pleased with LCS in our system!
  
 Thanks,
 Dean
  
  




Re: Cassandra 1.2.1 adding new node

2013-03-16 Thread Yuki Morishita
Try upgrading to 1.2.2. There was a streaming bug that could cause bootstrap
to fail.

On Friday, March 15, 2013, Daning Wang wrote:

 I tried to add a new node to the ring; it is supposed to be fast in 1.2 (256
 tokens on each node), but it has been 8+ hours now. After showing bootstrapping,
 cpu usage is now very low. I turned on debug; it shows applying mutation.
 Is that normal?

  INFO [main] 2013-03-15 08:36:44,530 StorageService.java (line 853)
 JOINING: Starting to bootstrap...
  ...

 DEBUG [MutationStage:600] 2013-03-15 14:32:33,286
 RowMutationVerbHandler.java (line 40) Applying mutation
 DEBUG [MutationStage:601] 2013-03-15 14:32:33,523
 RowMutationVerbHandler.java (line 40) Applying mutation
 DEBUG [MutationStage:601] 2013-03-15 14:32:33,525
 AbstractSimplePerColumnSecondaryIndex.java (line 118) applying index row
 monago-martires.blogspot.com in
 ColumnFamily(dsatcache.dsatcache_top_node_idx
 [97f154dd18d1ab2e:false:0@1363383153612004,])
 DEBUG [MutationStage:603] 2013-03-15 14:32:33,525
 RowMutationVerbHandler.java (line 40) Applying mutation
 DEBUG [MutationStage:602] 2013-03-15 14:32:33,643
 RowMutationVerbHandler.java (line 40) Applying mutation
 DEBUG [MutationStage:602] 2013-03-15 14:32:33,643
 AbstractSimplePerColumnSecondaryIndex.java (line 118) applying index row
 us.countryproducts.safestchina.com

 Thanks,

 Daning



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: leveled compaction

2013-03-08 Thread Yuki Morishita
It is SSTable counts in each level.  
 SSTables in each level: [40/4, 442/10, 97, 967, 7691, 0, 0, 0]

So you have 40 SSTables in L0, 442 in L1, 97 in L2 and so forth.
'40/4' and '442/10' have numbers after the slash; those are the expected maximum
number of SSTables in that level, and they are only displayed when you have more
than that threshold.



On Friday, March 8, 2013 at 3:24 PM, Kanwar Sangha wrote:

 Hi –  
   
 Can someone explain the meaning for the levelled compaction in cfstats –
   
 SSTables in each level: [40/4, 442/10, 97, 967, 7691, 0, 0, 0]
   
 SSTables in each level: [61/4, 9, 92, 945, 8146, 0, 0, 0]
   
 SSTables in each level: [34/4, 1000/10, 100, 953, 8184, 0, 0, 0
   
   
 Thanks,
 Kanwar
   
  
  
  




Re: leveled compaction

2013-03-08 Thread Yuki Morishita
No, sstables are eventually compacted and moved to the next level.

On Friday, March 8, 2013, Kanwar Sangha wrote:

  Cool! So if we exceed the threshold, is that an issue… ?

 *From:* Yuki Morishita [mailto:mor.y...@gmail.com]
 *Sent:* 08 March 2013 15:57
 *To:* user@cassandra.apache.org
 *Subject:* Re: leveled compaction

 It is SSTable counts in each level.

 SSTables in each level: [40/4, 442/10, 97, 967, 7691, 0, 0, 0]

 So you have 40 SSTables in L0, 442 in L1, 97 in L2 and so forth.
 '40/4' and '442/10' have numbers after slash, those are expected maximum number of
 SSTables in that level and only displayed when you have more than that threshold.

 On Friday, March 8, 2013 at 3:24 PM, Kanwar Sangha wrote:

  Hi –

  Can someone explain the meaning for the levelled compaction in cfstats –

  SSTables in each level: [40/4, 442/10, 97, 967, 7691, 0, 0, 0]

  SSTables in each level: [61/4, 9, 92, 945, 8146, 0, 0, 0]

  SSTables in each level: [34/4, 1000/10, 100, 953, 8184, 0, 0, 0

  Thanks,
  Kanwar



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Startup Exception During Upgrade 1.1.6 to 1.2.2 during LCS Migration and Corrupt Tables

2013-03-08 Thread Yuki Morishita
Are you sure you are using 1.2.2?
Because LegacyLeveledManifest is from an unreleased development version.


On Friday, March 8, 2013 at 11:02 PM, Arya Goudarzi wrote:

 Hi,
 
 I am exercising the rolling upgrade from 1.1.6 to 1.2.2. When I upgraded to 
 1.2.2 on the first node, during startup I got this exception:
 
 ERROR [main] 2013-03-09 04:24:30,771 CassandraDaemon.java (line 213) Could 
 not migrate old leveled manifest. Move away the .json file in the data 
 directory
 java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at 
 org.apache.cassandra.utils.EstimatedHistogram$EstimatedHistogramSerializer.deserialize(EstimatedHistogram.java:265)
 at 
 org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:365)
 at 
 org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:351)
 at 
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:100)
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:209)
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:391)
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:434)  
 
 This is when it is trying to migrate LCS I believe. I removed the Json files 
 from data directories:
 
 data/ $ for i in `find . | grep 'json' | awk '{print $1}'`; do rm -rf $i; 
 done 
 
 Then during the second attempt at restart, I got the following exception:
 
 ERROR [main] 2013-03-09 04:24:30,771 CassandraDaemon.java (line 213) Could 
 not migrate old leveled manifest. Move away the .json file in the data 
 directory
 java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at 
 org.apache.cassandra.utils.EstimatedHistogram$EstimatedHistogramSerializer.deserialize(EstimatedHistogram.java:265)
 at 
 org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:365)
 at 
 org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:351)
 at 
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:100)
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:209)
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:391)
 at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:434)
 
OK. It seems it created snapshots prior to the migration step. So it is safe to 
remove those, right? 
 
 data/ $ for i in `find . | grep 'pre-sstablemetamigration' | awk '{print 
 $1}'`; do rm -rf $i; done  
 
 Now startup again, but I see bunch of corrupt sstable logs messages: 
 
 ERROR [SSTableBatchOpen:1] 2013-03-09 04:55:39,826 SSTableReader.java (line 
 242) Corrupt sstable 
 /var/lib/cassandra/data/keyspace_production/UniqueIndexes/keyspace_production-UniqueIndexes-hf-98318=[Filter.db,
  Data.db, CompressionInfo.db, Statistics.db, Index.db]; skipped 
 java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at 
 org.apache.cassandra.utils.BloomFilterSerializer.deserialize(BloomFilterSerializer.java:45)
 at 
 org.apache.cassandra.utils.Murmur2BloomFilter$Murmur2BloomFilterSerializer.deserialize(Murmur2BloomFilter.java:40)
 at org.apache.cassandra.utils.FilterFactory.deserialize(FilterFactory.java:71)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.loadBloomFilter(SSTableReader.java:334)
 at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:199)
 at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:149)
 at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:238)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
 at java.lang.Thread.run(Thread.java:662)
 
 This is worrisome. How should I deal with this situation? scrub maybe? Should 
 I open a bug?
 
 Cheers,
 Arya
 
 
 




Re: multiple reducers with BulkOutputFormat on the same host

2013-01-24 Thread Yuki Morishita
Alexei,

You were right.
It was already fixed to use a UUID for the streaming session and released in 1.2.0.
See https://issues.apache.org/jira/browse/CASSANDRA-4813.


On Thursday, January 24, 2013 at 6:49 AM, Alexei Bakanov wrote:

 Hello,
 
 We see that BulkOutputFormat fails to stream data from multiple reduce
 instances that run on the same host.
 We get the same error messages that issue
 https://issues.apache.org/jira/browse/CASSANDRA-4223 tries to address.
 Looks like (ip-address + in_out_flag + atomic integer) is not unique
 enough for a sessionId when we have multiple JVMs streaming from one
 physical host.
 
 We get the problem fixed by setting one reducer per machine in hadoop
 config, but it's not an option we want to deploy.
 
 Thanks,
 Alexei Bakanov
 
 




Re: bulk load problem

2012-07-09 Thread Yuki Morishita
Due to the change in directory structure as of version 1.1, you have to create a 
directory like 

/path/to/sstables/Keyspace name/ColumnFamily name

and put your sstables there.

In your case, I think it would be /data/ssTable/tpch/tpch/cf0. 
And you have to specify that directory as the parameter for sstableloader:

bin/sstableloader -d 192.168.100.1 /data/ssTable/tpch/tpch/cf0
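
Putting it together, a minimal sketch assuming your tpch-cf0-* sstable files
currently sit directly under /data/ssTable/tpch/tpch/:

  mkdir -p /data/ssTable/tpch/tpch/cf0
  mv /data/ssTable/tpch/tpch/tpch-cf0-* /data/ssTable/tpch/tpch/cf0/
  bin/sstableloader -d 192.168.100.1 /data/ssTable/tpch/tpch/cf0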

Yuki


On Tuesday, June 26, 2012 at 7:07 PM, James Pirz wrote:

 Dear all,
 
 I am trying to use sstableloader in cassandra 1.1.1, to bulk load some data 
 into a single node cluster.
 I am running the following command:
 
 bin/sstableloader -d 192.168.100.1 /data/ssTable/tpch/tpch/
 
 from another node (other than the node on which cassandra is running), 
 while the data should be loaded into a keyspace named tpch. I made sure 
 that the 2nd node, from which I run sstableloader, has the same copy of 
 cassandra.yaml as the destination node.
 I have put 
 
 tpch-cf0-hd-1-Data.db
 tpch-cf0-hd-1-Index.db
 
 under the path I have passed to sstableloader.
 
 But I am getting the following error:
 
 Could not retrieve endpoint ranges:
 
 Any hint ?
 
 Thanks in advance,
 
 James
 
 
 



Re: ClassCastException during Cassandra server startup

2012-07-02 Thread Yuki Morishita
Thierry,

Key cache files are stored inside your saved_caches_directory defined in 
cassandra.yaml, which has a default value of /var/lib/cassandra/saved_caches. 
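
For example (a sketch assuming the default location; the index key cache files
carry the index name in their file name, like the
apispark-CellMessage.cellmessage_version-KeyCache file in this thread, so adjust
the pattern to your own keyspace and index column families, with the node stopped):

  rm /var/lib/cassandra/saved_caches/apispark-CellMessage.cellmessage_version-KeyCache*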


Yuki


On Monday, July 2, 2012 at 4:00 AM, Thierry Templier wrote:

 Hello Yuki,
 
Could you give me a hint about where to find these files? Should I have a look in the 
installation folder of Cassandra or in the /var/lib/cassandra folder?
 
 Thanks very much for your help.
 Thierry
  That was bug in 1.1.1 and fixed in 
  https://issues.apache.org/jira/browse/CASSANDRA-4331. 
  Workaround is deleting the key cache files for your index CFs should fix 
  this.
  
  
  
  Yuki



Re: ClassCastException during Cassandra server startup

2012-06-29 Thread Yuki Morishita
That was a bug in 1.1.1 and was fixed in 
https://issues.apache.org/jira/browse/CASSANDRA-4331.
As a workaround, deleting the key cache files for your index CFs should fix this.



Yuki


On Friday, June 29, 2012 at 10:02 AM, Thierry Templier wrote:

 Hello,
 
 My problem seems to occur after a server restart. As a matter of fact, 
 if I clean the data, create a new keyspace and its structure with 
 cqlsh, I can use the database correctly (both with cqlsh and a Java 
 application with Astyanax). If I stop the server and restart it, I have 
 my problem and then my requests don't work anymore (for example, 
 requests with a where clause).
 
 Thanks for your help!
 Thierry
 
  Hello,
  
  When I start the Cassandra server, some exceptions occur:
  
  INFO 10:22:16,014 reading saved cache 
  /var/lib/cassandra/saved_caches/apispark-CellMessage-KeyCache
  INFO 10:22:16,016 Opening 
  /var/lib/cassandra/data/apispark/CellMessage/apispark-CellMessage-hd-2 
  (498 bytes)
  INFO 10:22:16,016 Opening 
  /var/lib/cassandra/data/apispark/CellMessage/apispark-CellMessage-hd-1 
  (635 bytes)
  INFO 10:22:16,041 Creating new index : 
  ColumnDefinition{name=76657273696f6e, 
  validator=org.apache.cassandra.db.marshal.UTF8Type, index_type=KEYS, 
  index_name='cellmessage_version'}
  INFO 10:22:16,045 reading saved cache 
  /var/lib/cassandra/saved_caches/apispark-CellMessage.cellmessage_version-KeyCache
  INFO 10:22:16,066 Opening 
  /var/lib/cassandra/data/apispark/CellMessage/apispark-CellMessage.cellmessage_version-hd-2
   
  (349 bytes)
  INFO 10:22:16,066 Opening 
  /var/lib/cassandra/data/apispark/CellMessage/apispark-CellMessage.cellmessage_version-hd-1
   
  (401 bytes)
  ERROR 10:22:16,068 Exception in thread Thread[SSTableBatchOpen:1,5,main]
  java.lang.ClassCastException: java.math.BigInteger cannot be cast to 
  java.nio.ByteBuffer
  at org.apache.cassandra.db.marshal.UTF8Type.compare(UTF8Type.java:27)
  at org.apache.cassandra.dht.LocalToken.compareTo(LocalToken.java:45)
  at 
  org.apache.cassandra.db.DecoratedKey.compareTo(DecoratedKey.java:89)
  at 
  org.apache.cassandra.db.DecoratedKey.compareTo(DecoratedKey.java:38)
  at java.util.TreeMap.getEntry(TreeMap.java:345)
  at java.util.TreeMap.containsKey(TreeMap.java:226)
  at java.util.TreeSet.contains(TreeSet.java:234)
  at 
  org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:396)
  at 
  org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:187)
  at 
  org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:225)
  at 
  java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at 
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at 
  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:636)
  ERROR 10:22:16,071 Exception in thread Thread[SSTableBatchOpen:2,5,main]
  java.lang.ClassCastException: java.math.BigInteger cannot be cast to 
  java.nio.ByteBuffer
  at org.apache.cassandra.db.marshal.UTF8Type.compare(UTF8Type.java:27)
  at org.apache.cassandra.dht.LocalToken.compareTo(LocalToken.java:45)
  at 
  org.apache.cassandra.db.DecoratedKey.compareTo(DecoratedKey.java:89)
  at 
  org.apache.cassandra.db.DecoratedKey.compareTo(DecoratedKey.java:38)
  at java.util.TreeMap.getEntry(TreeMap.java:345)
  at java.util.TreeMap.containsKey(TreeMap.java:226)
  at java.util.TreeSet.contains(TreeSet.java:234)
  at 
  org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:396)
  at 
  org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:187)
  at 
  org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:225)
  at 
  java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at 
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at 
  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:636)
  
  Here is the definition of the related table CellMessage:
  
  CREATE TABLE CellMessage (
  id text PRIMARY KEY,
  type text,
  version text,
  content text,
  title text,
  generated text,
  date text
  ) WITH
  comment='' AND
  comparator=text AND
  read_repair_chance=0.10 AND
  gc_grace_seconds=864000 AND
  default_validation=text AND
  min_compaction_threshold=4 AND
  max_compaction_threshold=32 AND
  replicate_on_write='true' AND
  compaction_strategy_class='SizeTieredCompactionStrategy' AND
  compression_parameters:sstable_compression='SnappyCompressor';
  
  CREATE INDEX cellmessage_version ON CellMessage (version);
  
  Such errors occur for most tables I defined...
  
  Thanks very 

Re: Distinct Counter Proposal for Cassandra

2012-06-13 Thread Yuki Morishita
You can open a JIRA ticket at https://issues.apache.org/jira/browse/CASSANDRA 
with your proposal.

Just for the input:

I had once implemented a HyperLogLog counter to use internally in Cassandra, but 
it turned out I didn't need it, so I just put it in a gist. You can find it here: 
https://gist.github.com/2597943

The above implementation and most of the other ones (including stream-lib) 
implement the optimized version of the algorithm which counts up to 10^9, so it 
may need some work.

Another alternative is the self-learning bitmap 
(http://ect.bell-labs.com/who/aychen/sbitmap4p.pdf) which, in my understanding, 
is more memory-efficient when counting small values.
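
As a client-side illustration only (not something Cassandra does for you today), a
rough sketch using stream-lib's HyperLogLog; the constructor argument and exact
method names are from memory, so double-check them against the library:

  import com.clearspring.analytics.stream.cardinality.HyperLogLog;

  // Estimate the number of distinct IPs seen for one URL without storing them all.
  HyperLogLog ips = new HyperLogLog(14);   // 2^14 registers, roughly 1% standard error
  ips.offer("10.0.0.1");
  ips.offer("10.0.0.2");
  ips.offer("10.0.0.1");                   // duplicates barely move the estimate
  long approxDistinct = ips.cardinality(); // ~2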

Yuki


On Wednesday, June 13, 2012 at 11:28 AM, Utku Can Topçu wrote:

 Hi All,
  
 Let's assume we have a use case where we need to count the number of columns 
 for a given key. Let's say the key is the URL and the column-name is the IP 
 address or any cardinality identifier.
  
 The straightforward implementation seems simple: just insert the IP 
 addresses as columns under the key defined by the URL and use get_count to 
 count them back. However, the problem here is with large rows (where too 
 many IP addresses are present); the get_count method has to de-serialize the whole 
 row and calculate the count. As also noted in the user guides, it's not an 
 O(1) operation and it's quite costly.
  
 However, this problem seems to have better solutions if you don't have a 
 strict requirement for the count to be exact. There are streaming algorithms 
 that will provide good cardinality estimations within a predefined failure 
 rate. I think the most popular one is the (Hyper)LogLog algorithm; there's 
 also an optimal one developed recently, please check 
 http://dl.acm.org/citation.cfm?doid=1807085.1807094
  
 If you want to take a look at the Java implementation for LogLog, Clearspring 
 has both LogLog and space optimized HyperLogLog available at 
 https://github.com/clearspring/stream-lib
  
 I don't see a reason why this can't be implemented in Cassandra. The 
 distributed nature of all these algorithms can easily be adapted to 
 Cassandra's model. I think most of us would love to see some cardinality 
 estimating columns in Cassandra.
  
 Regards,
 Utku



Re: supercolumns with TTL columns not being compacted correctly

2012-05-22 Thread Yuki Morishita
Data will not be deleted during compaction when those keys appear in other sstables 
outside of that compaction. This is to prevent obsolete data from appearing again.

yuki


On Tuesday, May 22, 2012 at 7:37 AM, Pieter Callewaert wrote:

  
 Hi Samal,

 Thanks for your time looking into this.

 I force the compaction by using forceUserDefinedCompaction on only that 
 particular sstable. This guarantees me that the new sstable being written only 
 contains the data from the old sstable.

 The data in the sstable is more than 31 days old and gc_grace is 0, but still 
 the data from the sstable is being written to the new one, while I am 100% 
 sure all the data is invalid.

 Kind regards,
 Pieter Callewaert

 From: samal [mailto:samalgo...@gmail.com]
 Sent: dinsdag 22 mei 2012 14:33
 To: user@cassandra.apache.org
 Subject: Re: supercolumns with TTL columns not being compacted correctly

 Data will remain till the next compaction but won't be available. Compaction will 
 delete the old sstable and create a new one.

 On 22-May-2012 5:47 PM, Pieter Callewaert pieter.callewa...@be-mobile.be wrote:

  Hi,

  I've had my suspicions for some months, but I think I am sure about it.

  Data is being written by the SSTableSimpleUnsortedWriter and loaded by the 
  sstableloader.

  The data should be alive for 31 days, so I use the following logic:

  int ttl = 2678400;
  long timestamp = System.currentTimeMillis() * 1000;
  long expirationTimestampMS = (long) ((timestamp / 1000) + ((long) ttl * 1000));

  And using this to write it:

  sstableWriter.newRow(bytes(entry.id));
  sstableWriter.newSuperColumn(bytes(superColumn));
  sstableWriter.addExpiringColumn(nameTT, bytes(entry.aggregatedTTMs), timestamp, ttl, expirationTimestampMS);
  sstableWriter.addExpiringColumn(nameCov, bytes(entry.observationCoverage), timestamp, ttl, expirationTimestampMS);
  sstableWriter.addExpiringColumn(nameSpd, bytes(entry.speed), timestamp, ttl, expirationTimestampMS);

  This works perfectly, data can be queried until 31 days have passed; then no 
  results are given, as expected.

  But the data is still on disk until the sstables are recompacted:

  One of our nodes (we have 6 in total) has the following sstables:

  [cassandra@bemobile-cass3 ~]$ ls -hal /data/MapData007/HOS-* | grep G
  -rw-rw-r--. 1 cassandra cassandra 103G May  3 03:19 /data/MapData007/HOS-hc-125620-Data.db
  -rw-rw-r--. 1 cassandra cassandra 103G May 12 21:17 /data/MapData007/HOS-hc-163141-Data.db
  -rw-rw-r--. 1 cassandra cassandra  25G May 15 06:17 /data/MapData007/HOS-hc-172106-Data.db
  -rw-rw-r--. 1 cassandra cassandra  25G May 17 19:50 /data/MapData007/HOS-hc-181902-Data.db
  -rw-rw-r--. 1 cassandra cassandra  21G May 21 07:37 /data/MapData007/HOS-hc-191448-Data.db
  -rw-rw-r--. 1 cassandra cassandra 6.5G May 21 17:41 /data/MapData007/HOS-hc-193842-Data.db
  -rw-rw-r--. 1 cassandra cassandra 5.8G May 22 11:03 /data/MapData007/HOS-hc-196210-Data.db
  -rw-rw-r--. 1 cassandra cassandra 1.4G May 22 13:20 /data/MapData007/HOS-hc-196779-Data.db
  -rw-rw-r--. 1 cassandra cassandra 401G Apr 16 08:33 /data/MapData007/HOS-hc-58572-Data.db
  -rw-rw-r--. 1 cassandra cassandra 169G Apr 16 17:59 /data/MapData007/HOS-hc-61630-Data.db
  -rw-rw-r--. 1 cassandra cassandra 173G Apr 17 03:46 /data/MapData007/HOS-hc-63857-Data.db
  -rw-rw-r--. 1 cassandra cassandra 105G Apr 23 06:41 /data/MapData007/HOS-hc-87900-Data.db

  As you can see, the following files should be invalid:

  /data/MapData007/HOS-hc-58572-Data.db
  /data/MapData007/HOS-hc-61630-Data.db
  /data/MapData007/HOS-hc-63857-Data.db

  Because they were all written more than a month ago. gc_grace is 0 so this 
  should also not be a problem.

  As a test, I use forceUserDefinedCompaction on the HOS-hc-61630-Data.db.
  Expected behavior should be that an empty file is written, because all data 
  in the sstable should be invalid:

  Compactionstats is giving:

  compaction type   keyspace     column family   bytes compacted   bytes total    progress
  Compaction        MapData007   HOS             11518215662       532355279724   2.16%

  And when I ls the directory I find this:

  -rw-rw-r--. 1 cassandra cassandra 3.9G May 22 14:12 /data/MapData007/HOS-tmp-hc-196898-Data.db

  The sstable is being copied 1-on-1 to a new one. What am I missing here?

  TTL works perfectly, but is it giving a problem because it is in a super 
  column, and so is never deleted from disk?

  Kind regards
  Pieter Callewaert

Re: Streaming sessions from BulkOutputFormat job being listed long after they were killed

2012-02-17 Thread Yuki Morishita
Erik,

Currently, streaming failure handling functions poorly. 
There are several discussions and bug reports regarding streaming failure on 
JIRA.

A hung streaming session will be left in memory unless you restart C*, but I 
believe it does not cause problems. 

-- 
Yuki Morishita


On Friday, February 17, 2012 at 6:18 AM, Erik Forsberg wrote:

 Hi!
 
 If I run a hadoop job that uses BulkOutputFormat to write data to 
 Cassandra, and that hadoop job is aborted, i.e. streaming sessions are 
 not completed, it seems like the streaming sessions hang around for a 
 very long time, I've observed at least 12-15h, in output from 'nodetool 
 netstats'.
 
 To me it seems like they go away only after a restart of Cassandra.
 
 Is this a known behaviour? Does it cause any problems, f. ex. consuming 
 memory, or should I just ignore it?
 
 Regards,
 \EF
 
 




Re: Integration Error between Cassandra and Eclipse

2012-01-05 Thread Yuki Morishita
Also note that the Cassandra project switched from svn to git.
See the Source control section of http://cassandra.apache.org/download/ .

Regards,

Yuki 

-- 
Yuki Morishita


On Thursday, January 5, 2012 at 7:59 PM, Maki Watanabe wrote:

 Sorry, ignore my reply.
 I had the same result with import. ( 1 error in unit test code & many warnings )
 
 2012/1/6 Maki Watanabe watanabe.m...@gmail.com:
  How about using File->Import... rather than File->New Java Project?
  
  After extracting the source, ant build, and ant generate-eclipse-files:
  1. File->Import...
  2. Choose Existing Projects into Workspace
  3. Choose your source directory as the root directory and then push Finish
  
  
   2012/1/6 bobby saputra zaibat...@gmail.com:
   Hi There,
   
    I am a beginner Cassandra user. I hear from many people that Cassandra is
    powerful database software which is used by Facebook, Twitter, Digg, etc.,
    so I am interested in studying Cassandra further.
    
    When I performed the integration between Cassandra and the Eclipse IDE (in
    this case I use Java as the programming language), I ran into trouble and
    had many problems.
    I have already followed all the instructions from
    http://wiki.apache.org/cassandra/RunningCassandraInEclipse, but this
    tutorial was not working properly. I got a lot of errors and warnings while
    creating the Java project in Eclipse.
   
   These are the errors and warnings:
   
   Error(X) (1 item):
   Description Resource  Location
    The method rangeSet(Range<T>...) in the type Range is not applicable for the
    arguments (Range[])  RangeTest.java  line 178
   
   Warnings(!) (100 of 2916 items):
   Description Resource Location
    AbstractType is a raw type. References to generic type AbstractType<T>
    should be parameterized  AbstractColumnContainer.java  line 72
    (and many similar warnings)
   
    This is what I've done:
    1. I checked out cassandra-trunk from the given link using SlikSvn as the svn
    client.
    2. I moved to the cassandra-trunk folder and built with ant using the ant build
    command.
    3. I generated the eclipse files with ant using the ant generate-eclipse-files
    command.
    4. I created a new Java project in Eclipse, set the project name to
    cassandra-trunk, and browsed the location to the cassandra-trunk folder.
    
    Did I make any mistakes? Or is there something wrong with the tutorial in
    http://wiki.apache.org/cassandra/RunningCassandraInEclipse ??
   
    I have already googled for a solution to this problem, but unfortunately
    I found no results. Would you be willing to help me by giving me a guide on how
    to solve this problem? Please
   
   Thank you very much for your help.
   
   Best Regards,
   Wira Saputra
   
  
  
  
  
  --
  w3m
  
 
 
 
 
 -- 
 w3m
 
 




Cassandra Conference in Tokyo, Oct 5

2011-09-20 Thread Yuki Morishita
Greetings,

I'd like to announce Cassandra Conference in Tokyo on October 5th.
The conference mainly focuses on real world usage of Apache Cassandra.

The talks include:
- How Cassandra is used behind a location-sharing mobile application
- Amazon S3-like cloud storage backed by Cassandra
etc.

Jonathan Ellis, Apache Cassandra project chair and DataStax CTO, will
be giving keynote speech.

For more detail, please follow the link below (the original site is
Japanese only):

  http://ec-cube.ec-orange.jp/lp/cassandra-conference-in-tokyo/

or google translated version:

  
http://translate.google.com/translate?sl=auto&tl=en&js=n&prev=_t&hl=ja&ie=UTF-8&layout=2&eotf=1&u=http%3A%2F%2Fec-cube.ec-orange.jp%2Flp%2Fcassandra-conference-in-tokyo%2F&act=url

If you're around Tokyo and willing to attend, but not fluent in
Japanese, please let me know.
I'm willing to offer help.

-- 
Yuki Morishita


Re: Limit what nodes are writeable

2011-07-11 Thread Yuki Morishita
I have never used the feature, but there is a way to control access based
on user name.
Configure both conf/passwd.properties and conf/access.properties, then
modify cassandra.yaml as follows.

# authentication backend, implementing IAuthenticator; used to identify users
authenticator: org.apache.cassandra.auth.SimpleAuthenticator

# authorization backend, implementing IAuthority; used to limit
access/provide permissions
authority: org.apache.cassandra.auth.SimpleAuthority

2011/7/11 Maki Watanabe watanabe.m...@gmail.com:
 Cassandra has authentication interface, but doesn't have authorization.
 So you need to implement authorization in your application layer.

 maki


 2011/7/11 David McNelis dmcne...@agentisenergy.com:
 I've been looking in the documentation and haven't found anything about
 this...  but is there support for making a node  read-only?
 For example, you have a cluster set up in two different data centers / racks
 / whatever, with your replication strategy set up so that the data is
 redundant between the two places.  In one of the places all of the incoming
 data will be  processed and inserted into your cluster.  In the other data
 center you plan to allow people to run analytics, but you want to restrict
 the permissions so that the people running analytics can connect to
 Cassandra in whatever way makes the most sense for them, but you don't want
 those people to be able to edit/update data.
 Is it currently possible to configure your cluster in this manner?  Or would
 it only be possible through a third-party solution like wrapping one of the
 access libraries in a way that does not support write operations.

 --
 David McNelis
 Lead Software Engineer
 Agentis Energy
 www.agentisenergy.com
 o: 630.359.6395
 c: 219.384.5143
 A Smart Grid technology company focused on helping consumers of energy
 control an often under-managed resource.





 --
 w3m




-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Support for IN clause

2011-05-19 Thread Yuki Morishita
Hi,

I think the IN clause for SELECT and UPDATE will be supported in v0.8.1.
See https://issues.apache.org/jira/browse/CASSANDRA-2553
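
A hedged sketch of what that should look like once it lands (CQL as of the 0.8
branch; the 'users' column family and keys here are made up, and the same WHERE
KEY IN (...) form should apply to UPDATE):

  SELECT * FROM users WHERE KEY IN ('user1', 'user2', 'user3');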

2011/5/19 Vivek Mishra vivek.mis...@impetus.co.in:
 Does CQL support IN clause?







 




-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Read operation with CL.ALL, not yet supported?

2010-06-02 Thread Yuki Morishita
Gary,

Thanks for the reply. I've opened an issue at

https://issues.apache.org/jira/browse/CASSANDRA-1152

Yuki

2010/6/3 Gary Dusbabek gdusba...@gmail.com:
 Yuki,

 Can you file a jira ticket for this
 (https://issues.apache.org/jira/browse/CASSANDRA)?  The wiki indicates
 that this should be allowed:  http://wiki.apache.org/cassandra/API

 Regards,

 Gary.


 On Tue, Jun 1, 2010 at 21:50, Yuki Morishita mor.y...@gmail.com wrote:
 Hi,

 I'm testing several read operations(get, get_slice, get_count, etc.) with
 various ConsistencyLevel and noticed that ConsistencyLevel.ALL is
 not yet supported in most of read ops (other than get_range_slice).

 I've looked up code in StorageProxy#readProtocol and it seems
 to be able to handle CL.ALL, but in thrift.CassandraServer#readColumnFamily,
 there is code that just throws exception when consistency_level == ALL.
 Is there any reason that CL.ALL is not yet supported?

 
 Yuki Morishita
  t:yukim (http://twitter.com/yukim)





-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Read operation with CL.ALL, not yet supported?

2010-06-01 Thread Yuki Morishita
Hi,

I'm testing several read operations (get, get_slice, get_count, etc.) with
various ConsistencyLevels and noticed that ConsistencyLevel.ALL is
not yet supported in most read ops (other than get_range_slice).

I've looked up the code in StorageProxy#readProtocol and it seems
to be able to handle CL.ALL, but in thrift.CassandraServer#readColumnFamily,
there is code that just throws an exception when consistency_level == ALL.
Is there any reason that CL.ALL is not yet supported?


Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Problem accessing Cassandra wiki top page with browser locale other than english

2010-05-25 Thread Yuki Morishita
Jonathan,

Thanks for reporting an issue.
I will wait and see.

2010/5/25 23:29 Jonathan Ellis jbel...@gmail.com:
 Turns out this is a bug in the version of MoinMoin the ASF has
 installed.  There's nothing we can do until the infrastructure team
 upgrades: https://issues.apache.org/jira/browse/INFRA-2741

 On Sun, May 23, 2010 at 10:09 PM, Yuki Morishita mor.y...@gmail.com wrote:
 Hi all,

 I'm currently working on translating cassandra wiki to Japanese.
 Cassandra is gaining attention in Japan, too. :)

 I noticed that for those who have a browser locale of 'ja', accessing the
 top page of the cassandra wiki (http://wiki.apache.org/cassandra) displays
 the Japanese default front page
 (http://wiki.apache.org/cassandra/フロントページ), not the one wanted
 (http://wiki.apache.org/cassandra/FrontPage).

 Since the front page for Japanese locale is not editable, I cannot
 make any change to it.
 (FrontPage is translated into Japanese, but with the name FrontPage_JP.)

 Can I get privilege to edit Japanese front page above?
 Or, can someone from dev team edit above front page so that everyone
 with browser locale 'ja' get redirected to 'FrontPage_JP'?
 (Just put '#redirect FrontPage_JP' in first line of
 http://wiki.apache.org/cassandra/フロントページ)

 Thanks in advance,

 
 Yuki Morishita
 t:yukim (http://twitter.com/yukim)




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of Riptano, the source for professional Cassandra support
 http://riptano.com




-- 

Yuki Morishita