Re: Released an ACID-compliant transaction library on top of Cassandra

2018-10-16 Thread Hiroyuki Yamada
Hi all,

Thank you for the comments and feedback.

As Jonathan pointed out, it relies on LWT and uses the protocol
proposed in the paper.
Please read the design document for more detail.
https://github.com/scalar-labs/scalardb/blob/master/docs/design.md

Regarding licensing, we are thinking of releasing it under the Apache
License 2.0 if many developers are interested in it.

Best regards,
Hiroyuki
On Wed, Oct 17, 2018 at 3:13 AM Jonathan Ellis  wrote:
>
> Which was followed up by 
> https://www.researchgate.net/profile/Akon_Dey/publication/282156834_Scalable_Distributed_Transactions_across_Heterogeneous_Stores/links/56058b9608ae5e8e3f32b98d.pdf
>
> On Tue, Oct 16, 2018 at 1:02 PM Jonathan Ellis  wrote:
>>
>> It looks like it's based on this: 
>> http://www.vldb.org/pvldb/vol6/p1434-dey.pdf
>>
>> On Tue, Oct 16, 2018 at 11:37 AM Ariel Weisberg  wrote:
>>>
>>> Hi,
>>>
>>> Yes this does sound great. Does this rely on Cassandra's internal SERIAL 
>>> consistency and CAS functionality or is that implemented at a higher level?
>>>
>>> Regards,
>>> Ariel
>>>
>>> On Tue, Oct 16, 2018, at 12:31 PM, Jeff Jirsa wrote:
>>> > This is great!
>>> >
>>> > --
>>> > Jeff Jirsa
>>> >
>>> >
>>> > > On Oct 16, 2018, at 5:47 PM, Hiroyuki Yamada  wrote:
>>> > >
>>> > > Hi all,
>>> > >
>>> > > # Sorry, I accidentally emailed the following to dev@, so re-sending to 
>>> > > here.
>>> > >
>>> > > We have been working on ACID-compliant transaction library on top of
>>> > > Cassandra called Scalar DB,
>>> > > and are pleased to announce the release of v.1.0 RC version in open 
>>> > > source.
>>> > >
>>> > > https://github.com/scalar-labs/scalardb/
>>> > >
>>> > > Scalar DB is a library that provides a distributed storage abstraction
>>> > > and client-coordinated distributed transaction on the storage,
>>> > > and makes non-ACID distributed database/storage ACID-compliant.
>>> > > And Cassandra is the first supported database implementation.
>>> > >
>>> > > It's been internally tested intensively and is jepsen-passed.
>>> > > (see jepsen directory for more detail)
>>> > > If you are looking for ACID transaction capability on top of cassandra,
>>> > > Please take a look and give us a feedback or contribution.
>>> > >
>>> > > Best regards,
>>> > > Hiroyuki Yamada
>>> > >
>>> > > -
>>> > > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> > > For additional commands, e-mail: user-h...@cassandra.apache.org
>>> > >
>>> >
>>>
>>
>>
>> --
>> Jonathan Ellis
>> co-founder, http://www.datastax.com
>> @spyced
>
>
>
> --
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced




Re: Metrics matrix: migrate 2.1.x metrics to 2.2.x+

2018-10-16 Thread Carl Mueller
Your dashboards are great. The only challenge is getting all the data to
feed them.


On Tue, Oct 16, 2018 at 1:45 PM Carl Mueller 
wrote:

> metadata.csv: that helps a lot, thank you!

Re: Metrics matrix: migrate 2.1.x metrics to 2.2.x+

2018-10-16 Thread Carl Mueller
metadata.csv: that helps a lot, thank you!

On Fri, Oct 5, 2018 at 5:42 AM Alain RODRIGUEZ  wrote:

> I feel you for most of the troubles you faced, I've been facing most of
> them too. Again, Datadog support can probably help you with most of those.
> You should really consider sharing this feedback with them.
>
>> there is re-namespacing of the metric names in lots of cases, and these
>> don't appear to be centrally documented, but maybe i haven't found the
>> magic page.
>
> I don't know if that would be the 'magic' page, but that's something:
> https://github.com/DataDog/integrations-core/blob/master/cassandra/metadata.csv
>
>> There are so many good stats.
>
> Yes, and it's still improving. I love this about Cassandra. It's our work
> to pick the relevant ones for each situation. I would not like Cassandra to
> reduce the number of metrics exposed, we need to learn to handle them
> properly. Also, this is the reason we designed 4 dashboards out of the box,
> the goal was to have everything we need for distinct scenarios:
> - Overview - global health-check / anomaly detection
> - Read Path - troubleshooting / optimizing read ops
> - Write Path - troubleshooting / optimizing write ops
> - SSTable Management - troubleshooting / optimizing
>   compaction/flushes/... anything related to sstables.
>
> instead of the single overview dashboard that was present before. We are
> also perfectly aware that it's far from perfect, but aiming at perfect
> would only have had us never releasing anything. Anyone interested could
> now build missing dashboards or improve existing ones for themselves and/or
> suggest improvements to Datadog :). I hope I'll do some more of this work
> at some point in the future.
>
> Good luck,
> C*heers,
> ---
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> Le jeu. 4 oct. 2018 à 21:21, Carl Mueller
>  a écrit :
>
>> for 2.1.x we had a custom reporter that delivered metrics to datadog's
>> endpoint via https, bypassing the agent-imposed 350-metric limit. But
>> integrating that required targeting the other shared libs in the cassandra
>> path, so the build is a bit of a pain when we update major versions.
>>
>> We are migrating our 2.1.x specific dashboards, and we will use
>> agent-delivered metrics for non-table, and adapt the custom library to
>> deliver the table-based ones, at a slower rate than the "core" ones.
>>
>> Datadog is also super annoying because there doesn't appear to be
>> anything that reports what metrics the agent is sending (the metric count
>> can indicate if a configured new metric increased the count and is being
>> reported, but it's still... a guess), and there is re-namespacing of the
>> metric names in lots of cases, and these don't appear to be centrally
>> documented, but maybe i haven't found the magic page.
>>
>> There are so many good stats. We might also implement some facility
>> to dynamically turn on the delivery of detailed metrics on the nodes.
>>
>> On Tue, Oct 2, 2018 at 5:21 AM Alain RODRIGUEZ 
>> wrote:
>>
>>> Hello Carl,
>>>
>>>> I guess we can use bean_regex to do specific targeted metrics for the
>>>> important tables anyway.
>>>
>>> Yes, this would work, but 350 is very limited for Cassandra dashboards.
>>> We have a LOT of metrics available.
>>>
>>>> Datadog 350 metric limit is a PITA for tables once you get over 10 tables
>>>
>>> I noticed this while I was working on providing default dashboards for
>>> Cassandra-Datadog integration. I was told by Datadog team it would not be
>>> an issue for users, that I should not care about it. As you pointed out,
>>> per table metrics quickly increase the total number of metrics we need to
>>> collect.
>>>
>>> I believe you can set the following option: *"max_returned_metrics:
>>> 1000"* - it can be used if metrics are missing to increase the limit of
>>> the number of collected metrics. Be aware of CPU utilization that this
>>> might imply (greatly improved in dd-agent version 6+ I believe -thanks
>>> Datadog teams for that- making this fully usable for Cassandra). This
>>> option should go in the *cassandra.yaml* file for Cassandra
>>> integrations, off the top of my head.
>>>
>>> Also, do not hesitate to reach out to Datadog directly with this kind of
>>> question, I have always been very happy with their support so far, I am
>>> sure they would guide you through this as well, probably better than we can
>>> do :). It also provides them with feedback on what people are struggling
>>> with I imagine.
>>>
>>> I am interested to know if you still have issues getting more metrics
>>> (option above not working / CPU under too much load) as this would make the
>>> dashboards we built mostly unusable for clusters with more tables. We might
>>> then need to review the design.
>>>
>>> As a side note, I believe metrics are handled the same way cross
>>> version, they got the same 
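
To make the 350-metric limit discussed above more concrete, here is a rough
sketch of the arithmetic. The limits 350 and 1000 come from the thread; the
per-table and node-level metric counts are made-up placeholders, so the
results only illustrate the shape of the problem, not real numbers for any
cluster.

```python
# Rough metric budget for one node monitored by the Datadog agent.
# 350 and 1000 are the limits mentioned in the thread; the per-table and
# node-level counts used in the example are hypothetical.
DEFAULT_LIMIT = 350
RAISED_LIMIT = 1000


def metrics_needed(tables: int, per_table: int, node_level: int) -> int:
    """Total metrics the agent would have to collect for one node."""
    return node_level + tables * per_table


def max_tables(limit: int, per_table: int, node_level: int) -> int:
    """Largest number of tables whose metrics still fit under the limit."""
    return max(0, (limit - node_level) // per_table)


if __name__ == "__main__":
    per_table, node_level = 25, 100  # hypothetical counts
    for limit in (DEFAULT_LIMIT, RAISED_LIMIT):
        print(limit, max_tables(limit, per_table, node_level))
```

With these placeholder counts the default limit is exhausted at about ten
tables, which is in line with the experience reported in the thread; the
real break-even point depends on which beans are actually configured.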

Re: Released an ACID-compliant transaction library on top of Cassandra

2018-10-16 Thread Jonathan Ellis
Which was followed up by
https://www.researchgate.net/profile/Akon_Dey/publication/282156834_Scalable_Distributed_Transactions_across_Heterogeneous_Stores/links/56058b9608ae5e8e3f32b98d.pdf



-- 
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced


Re: Released an ACID-compliant transaction library on top of Cassandra

2018-10-16 Thread Jonathan Ellis
It looks like it's based on this:
http://www.vldb.org/pvldb/vol6/p1434-dey.pdf


-- 
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced


Re: Released an ACID-compliant transaction library on top of Cassandra

2018-10-16 Thread DuyHai Doan
I think it does use LWT under the hood:

https://github.com/scalar-labs/scalardb/blob/master/src/main/java/com/scalar/database/transaction/consensuscommit/CommitMutationComposer.java#L74-L79

return new Put(base.getPartitionKey(), getClusteringKey(base, result).orElse(null))
    .forNamespace(base.forNamespace().get())
    .forTable(base.forTable().get())
    .withConsistency(Consistency.LINEARIZABLE)
    .withCondition(
        new PutIf(
            new ConditionalExpression(ID, toIdValue(id), Operator.EQ),
            new ConditionalExpression(
                STATE, toStateValue(TransactionState.PREPARED), Operator.EQ)))
    .withValue(Attribute.toCommittedAtValue(current))
    .withValue(Attribute.toStateValue(TransactionState.COMMITTED));
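
As an illustration of what the conditional Put above appears to express,
here is a minimal in-memory sketch of the same compare-and-set idea: a
record is moved to COMMITTED only if it still carries the expected
transaction id and is still PREPARED. The Record class, the function name
and the dict-based store are hypothetical and are not part of Scalar DB or
Cassandra; a real store has to perform the check and the write atomically
(which is what LWT provides).

```python
from dataclasses import dataclass
from typing import Dict, Optional

PREPARED, COMMITTED = "PREPARED", "COMMITTED"


@dataclass
class Record:
    tx_id: str
    state: str
    committed_at: Optional[int] = None


def commit_if_prepared(store: Dict[str, Record], key: str, tx_id: str, now: int) -> bool:
    """Flip a record to COMMITTED only if it is still PREPARED by tx_id.

    Stands in for a conditional write; a plain dict gives no atomicity
    across threads or processes, so this only shows the condition logic.
    """
    record = store.get(key)
    if record is None or record.tx_id != tx_id or record.state != PREPARED:
        return False  # condition failed, nothing is written
    record.state = COMMITTED
    record.committed_at = now
    return True


store = {"row1": Record(tx_id="tx-42", state=PREPARED)}
first = commit_if_prepared(store, "row1", "tx-42", now=1000)
second = commit_if_prepared(store, "row1", "tx-42", now=2000)  # already committed
other = commit_if_prepared(store, "row1", "tx-99", now=3000)   # wrong transaction
```

Only the first call succeeds; the later calls fail the condition and leave
the record untouched.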



On Tue, Oct 16, 2018 at 6:40 PM sankalp kohli 
wrote:

> What License did you use? Can we please use Apache 2.0?
>
> On Tue, Oct 16, 2018 at 9:39 AM sankalp kohli 
> wrote:
>
>> This is awesome and thanks for working on it.


Re: Released an ACID-compliant transaction library on top of Cassandra

2018-10-16 Thread sankalp kohli
What License did you use? Can we please use Apache 2.0?

On Tue, Oct 16, 2018 at 9:39 AM sankalp kohli 
wrote:

> This is awesome and thanks for working on it.


Re: Released an ACID-compliant transaction library on top of Cassandra

2018-10-16 Thread sankalp kohli
This is awesome and thanks for working on it.



Re: Released an ACID-compliant transaction library on top of Cassandra

2018-10-16 Thread Ariel Weisberg
Hi,

Yes this does sound great. Does this rely on Cassandra's internal SERIAL 
consistency and CAS functionality or is that implemented at a higher level? 

Regards,
Ariel




Re: Released an ACID-compliant transaction library on top of Cassandra

2018-10-16 Thread Jeff Jirsa
This is great!

-- 
Jeff Jirsa





Released an ACID-compliant transaction library on top of Cassandra

2018-10-16 Thread Hiroyuki Yamada
Hi all,

# Sorry, I accidentally emailed the following to dev@, so re-sending to here.

We have been working on an ACID-compliant transaction library on top of
Cassandra called Scalar DB,
and are pleased to announce the open-source release of version 1.0 RC.

https://github.com/scalar-labs/scalardb/

Scalar DB is a library that provides a distributed storage abstraction
and client-coordinated distributed transaction on the storage,
and makes non-ACID distributed database/storage ACID-compliant.
And Cassandra is the first supported database implementation.

It has been tested intensively internally and has passed Jepsen tests
(see the jepsen directory for more detail).
If you are looking for ACID transaction capability on top of Cassandra,
please take a look and give us feedback or contributions.

Best regards,
Hiroyuki Yamada




TWCS: Repair create new buckets with old data

2018-10-16 Thread Caesar, Maik
Hello,
we run Cassandra version 3.0.9 and have a problem with a table that uses TWCS.
The command "nodetool repair" always creates new files that contain old data.
This prevents the old data from being deleted.
The layout of the table is as follows:
cqlsh> desc stat.spa

CREATE TABLE stat.spa (
region int,
id int,
date text,
hour int,
zippedjsonstring blob,
PRIMARY KEY ((region, id), date, hour)
) WITH CLUSTERING ORDER BY (date ASC, hour ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 
'compaction_window_size': '1', 'compaction_window_unit': 'DAYS', 
'max_threshold': '100', 'min_threshold': '4', 'tombstone_compaction_interval': 
'86460'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';

Currently the oldest data is from 2017/04/15 and is not being removed:

$ for f in *Data.db; do
    meta=$(sudo sstablemetadata "$f")
    # keep the first 10 digits of each timestamp (epoch seconds) for date
    max=$(echo "$meta" | grep 'Maximum time' | cut -d' ' -f3 | cut -c1-10)
    min=$(echo "$meta" | grep 'Minimum time' | cut -d' ' -f3 | cut -c1-10)
    echo -e "Max:" $(date --date=@"$max" '+%Y/%m/%d %H:%M') \
            "Min:" $(date --date=@"$min" '+%Y/%m/%d %H:%M') \
            $(echo "$meta" | grep droppable) $(echo "$meta" | grep "Repaired at") ' \t ' \
            $(ls -lh "$f" | awk '{print $5" "$6" "$7" "$8" "$9}')
  done | sort
Max: 2017/04/15 12:08 Min: 2017/03/31 13:09 Estimated droppable tombstones: 
1.7731048805815162 Repaired at: 1525685601400 42K May 7 19:56 
mc-22922-big-Data.db
Max: 2017/04/17 13:49 Min: 2017/03/31 13:09 Estimated droppable tombstones: 
1.9600207684319835 Repaired at: 1525685601400 116M May 7 13:31 
mc-15096-big-Data.db
Max: 2017/04/21 13:43 Min: 2017/04/15 13:34 Estimated droppable tombstones: 
1.9090909090909092 Repaired at: 1525685601400 11K May 7 19:56 
mc-22921-big-Data.db
Max: 2017/05/23 21:45 Min: 2017/04/21 14:00 Estimated droppable tombstones: 
1.8360655737704918 Repaired at: 1525685601400 21M May 7 19:56 
mc-22919-big-Data.db
Max: 2017/06/12 15:19 Min: 2017/04/25 14:45 Estimated droppable tombstones: 
1.8091397849462365 Repaired at: 1525685601400 19M May 7 14:36 
mc-17095-big-Data.db
Max: 2017/06/15 15:26 Min: 2017/05/10 14:37 Estimated droppable tombstones: 
1.76536312849162 Repaired at: 1529612605539   9.3M Jun 21 22:31 
mc-25372-big-Data.db
...

After a "nodetool repair" run, a new large data file is created that includes
old data from 2017/07/31.

Max: 2018/07/27 18:10 Min: 2017/03/31 13:13 Estimated droppable tombstones: 
0.08392555471691247 Repaired at: 0 11G Sep 11 22:02 
mc-39281-big-Data.db
...
Max: 2018/08/16 18:18 Min: 2018/08/06 12:19 Estimated droppable tombstones: 0.0 
Repaired at: 1534525730510 123M Aug 17 23:46 mc-36847-big-Data.db
Max: 2018/08/17 19:20 Min: 2017/07/31 12:04 Estimated droppable tombstones: 
0.03385963490004347 Repaired at: 0 11G Sep 11 21:43 
mc-39265-big-Data.db
Max: 2018/08/17 19:20 Min: 2018/07/24 12:33 Estimated droppable tombstones: 0.0 
Repaired at: 1534525730510 135M Sep 11 21:44 mc-39270-big-Data.db
...
Max: 2018/09/06 17:30 Min: 2018/08/28 12:17 Estimated droppable tombstones: 0.0 
Repaired at: 1536690786879 129M Sep 11 21:10 mc-39238-big-Data.db
Max: 2018/09/07 18:22 Min: 2017/04/23 12:48 Estimated droppable tombstones: 
0.1548442441468401 Repaired at: 0 8.0G Sep 11 21:33 mc-39258-big-Data.db
Max: 2018/09/07 18:22 Min: 2018/09/07 12:15 Estimated droppable tombstones: 0.0 
Repaired at: 1536690786879 72M Sep 11 21:34 mc-39262-big-Data.db
Max: 2018/09/08 18:20 Min: 2018/08/22 12:17 Estimated droppable tombstones: 0.0 
Repaired at: 0 2.8G Sep 11 21:47 mc-39272-big-Data.db
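
To quantify how much old data ends up mixed with new data in these files,
the following sketch counts how many one-day windows the Min..Max range of a
file covers. It assumes the 1-day compaction window from the table
definition and treats the listed dates as plain calendar days (time of day
and time zone are ignored); the function name is only for illustration.

```python
from datetime import date


def windows_spanned(min_day: date, max_day: date) -> int:
    """Number of one-day windows touched by the range [min_day, max_day]."""
    if max_day < min_day:
        raise ValueError("max_day must not be earlier than min_day")
    return (max_day - min_day).days + 1


# Min/Max dates copied from the listing above.
unrepaired = windows_spanned(date(2017, 3, 31), date(2018, 7, 27))  # mc-39281
repaired = windows_spanned(date(2018, 8, 6), date(2018, 8, 16))     # mc-36847
print(unrepaired, repaired)
```

The unrepaired file mc-39281 covers well over a year of daily windows,
while the repaired file mc-36847 covers only a few days.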

The tool sstableexpiredblockers shows that the file mc-39281-big-Data.db blocks 
95 expired files from getting dropped, for example the oldest file 
mc-22922-big-Data.db

[BigTableReader(path='.../stat/spa-.../mc-39281-big-Data.db') (minTS = 
149095878253, maxTS = 1532707837676719, maxLDT = 1557154990)
  blocks 95 expired sstables from getting dropped:
 [BigTableReader(path='.../stat/spa-.../mc-36936-big-Data.db') (minTS = 
1500027128958000, maxTS = 1503666765807229, maxLDT = 1535202765)
[BigTableReader(path='.../stat/spa-.../mc-22921-big-Data.db') (minTS = 
1492256093314000, maxTS = 1492775013454001, maxLDT = 1524311013)
[BigTableReader(path='.../stat/spa-.../mc-36947-big-Data.db') (minTS = 
1492255708403000, maxTS = 1501937182477001, maxLDT = 1533473182)
[BigTableReader(path='.../stat/spa-.../mc-32582-big-Data.db') (minTS = 
1493028031639000,