[GitHub] cassandra pull request #88: added encoding param to non-debug printmsg funct...

2017-03-01 Thread sparkida
Github user sparkida closed the pull request at:

https://github.com/apache/cassandra/pull/88


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Lots of dtest errors on local machine

2017-03-01 Thread benjamin roth
Hi again,

I wanted to run some dtests (e.g. from materialized_views_test.py) to check
my changes. A while ago, everything worked fine but today I ran into a lot
of errors like this:
https://gist.github.com/brstgt/114d76769d97dc72059f9252330c4142

This happened on 2 different machines (macos + linux).
CS Version: 4.0 (trunk of today)
CCM: 2.6.0

I deleted and reinstalled all Python deps.
I also checked the ccm logs (as long as I was able to access them, because
dtests delete them after the test).
In the attached log, 127.0.0.2 caused the failure. In the ccm logs I saw
that the node was stopped by the test, started again, and seemed to boot up
correctly.

Running some of these tests against 3.11 worked.
Switching back to trunk/4.0 - ERROR

Is this a known issue, maybe caused by the removed RPC support? Am I maybe
doing something wrong?


Need feedback on CASSANDRA-13066

2017-03-01 Thread benjamin roth
Hi guys,

I started working on 13066. My intention is to offer a table setting that
allows an operator to optimize MV streaming in some cases, or simply "on
purpose - I know what I'm doing".

MV write path streaming can be omitted, e.g. if:
- data is append-only
- no PK column is added to the MV, so no stale data can be created by race
conditions

This is a first patch:
https://github.com/Jaumo/cassandra/commit/0d4ce966f129e1b29098f194b5951a86dc8c585a

Please don't consider it as final. Some tests are missing and some logic is
still missing.

When introducing a table option, which would be preferable:
- mv_fast_stream: Does what it says; maybe even a more verbose name?
- append_only: Tells how data is filled. This could also be a hint for
future optimizations like CASSANDRA-9779, but it would not allow me to
simply tell CS to do that kind of streaming no matter how I treat my data

Also still to be considered in this ticket:
- With "fast streaming" MVs MUST be repaired separately and explicitly
- With "write path repairs" MVs MUST NOT be included in KS repairs. Not
only is this unnecessary repair work, it could (or probably will)
break the local consistency between base table and MV.
- Manual repair of views that are normally repaired through the write path
of the base table should at least log a warning like "Manually repairing a
materialized view may lead to inconsistencies"

I'd really love to get some feedback before putting more effort in.
Thanks!


[GitHub] cassandra pull request #95: Expire messages by a single Thread

2017-03-01 Thread christian-esken
GitHub user christian-esken opened a pull request:

https://github.com/apache/cassandra/pull/95

Expire messages by a single Thread

When queue expiration is done, one single Thread is elected to do the
work. Previously, all Threads would go in and do the same work,
producing high lock contention. The Thread reading from the Queue could
even be starved by not being able to acquire the read lock. CASSANDRA-13265

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/christian-esken/cassandra 13265b-3.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/cassandra/pull/95.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #95


commit 448d7a9c1430d9f23bd819895fc618bb227ca833
Author: Christian Esken 
Date:   2017-03-01T14:56:36Z

Expire messages by a single Thread

When queue expiration is done, one single Thread is elected to do the
work. Previously, all Threads would go in and do the same work,
producing high lock contention. The Thread reading from the Queue could
even be starved by not being able to acquire the read lock. CASSANDRA-13265
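
The election idea described above can be illustrated with a minimal sketch.
This is not the code from the pull request; the class SingleExpirer, the
Message type and the expireMessages method are hypothetical names, and a
plain AtomicBoolean is assumed as the election mechanism. A thread that
loses the compareAndSet race returns immediately instead of contending for
the queue.

```java
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: at most one thread at a time performs expiration.
final class SingleExpirer {
    /** Hypothetical queued message that carries its own expiry deadline. */
    static final class Message {
        final String payload;
        final long expiresAtNanos;

        Message(String payload, long expiresAtNanos) {
            this.payload = payload;
            this.expiresAtNanos = expiresAtNanos;
        }

        boolean isExpired(long nowNanos) {
            return nowNanos - expiresAtNanos >= 0;
        }
    }

    private final Queue<Message> backlog = new ConcurrentLinkedQueue<>();
    // true while some thread is currently doing the expiration work
    private final AtomicBoolean expirationInProgress = new AtomicBoolean(false);

    void add(Message m) {
        backlog.add(m);
    }

    int size() {
        return backlog.size();
    }

    /**
     * Removes expired messages. Returns the number removed, or -1 if another
     * thread already holds the expiration role and this call did nothing.
     */
    int expireMessages(long nowNanos) {
        // Only the thread that flips the flag from false to true proceeds.
        if (!expirationInProgress.compareAndSet(false, true)) {
            return -1;
        }
        try {
            int removed = 0;
            for (Iterator<Message> it = backlog.iterator(); it.hasNext(); ) {
                if (it.next().isExpired(nowNanos)) {
                    it.remove();
                    removed++;
                }
            }
            return removed;
        } finally {
            // Release the role so a later call can expire again.
            expirationInProgress.set(false);
        }
    }

    public static void main(String[] args) {
        SingleExpirer q = new SingleExpirer();
        q.add(new Message("old", 10L));
        q.add(new Message("fresh", 100L));
        System.out.println("removed: " + q.expireMessages(50L));
        System.out.println("remaining: " + q.size());
    }
}
```

The design choice in this sketch is that losing threads skip the work
entirely rather than wait, which is what avoids the lock contention; whether
the actual patch skips or defers is not stated in the description.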






Re: Official CS docs

2017-03-01 Thread benjamin roth
Thanks for the information!

I think this approach works very well for purely technical docs like CQL
documentation.
For building a growing knowledge base I think the git repo is too
developer-centric.
A good example is that question about "fire drill" on the user list. There
is someone who would like to contribute but probably not as a developer. So
that process is a bit awkward for him.
Imagine an FAQ or troubleshooting section with 100s or 1000s of
entries. Doing that in files will end up with either unmaintainably large
files or large directories. In this case we are no longer talking about
"some docs" that describe things that are closely related to code but a
database that helps people to answer questions and to solve problems.
There are many topics on the user list that pop up every now and then. If
there was a database that could hold answers to questions that have already
been asked, that would be a great thing.
There are a lot of resources out there that answer a lot of questions but
they are hard to find and filter - especially for newbies. Having a central
and official place for that kind of resources would just be awesome. And
IMHO the hurdle to contribute should be as low as possible.

That said, I unfortunately don't have the perfect solution out of the box.
I can understand why you chose .rst for the (technical) docs but what I
have in mind goes beyond that. Generally I think of it more as a DB with
taggable articles (since version, until version, related to, see also, ...)
than a pure document structure. You could also do that in a WIKI but I'm
not sure if this is the best solution.
Which solution could work also depends on whether a technical doc (a kind
of reference) and a knowledge base should stay in the same system or be
separated. I think it is not uncommon to separate purely technical docs
from "support sections".
I will think about it - maybe others as well - and if I see a proper
solution, I will bring it to the table. These kinds of things can't be
decided or solved within a few minutes; ideas have to grow first.

2017-03-01 13:11 GMT+01:00 Stefan Podkowinski :

> Hi Benjamin
>
> I think the best way to catch up with the motivation behind this is by
> reading the following dev post and linked jiras:
>
> https://lists.apache.org/thread.html/029e1273675260630e4973ba301f71a8de5a9d7e294a7b2df6eed65f@%3Cdev.cassandra.apache.org%3E
>
> What are your suggestions to improve the documentation? I think it's
> fair to say that the official docs still leave a lot to be desired. But
> wikis or any other publishing tools each have their own strengths and
> drawbacks. Do you have any example project with a process that we should
> follow instead? Did you have a look at the README file in the docs tree
> and actually try to add or change any content? What would hold you back
> from working from there and submitting a patch?
>
>
>
> On 01.03.2017 11:10, benjamin roth wrote:
> > Hi guys,
> >
> > Is there a reason that the docs are part of the git repo?
> > In my personal opinion this is very complicated and it sets the hurdle for
> > contributing to docs very high.
> >
> > There are so many questions on userlists that repeat over and over again
> > and that could be put into a knowledge base.
> >
> > But ...
> > - Maintaining this in a repo is painful, complicated and slow.
> > - I don't like to write docs that I can't preview instantly. I don't want
> > to wait for a slow deployment process to see my result.
> > - There are tons of solutions for agile and moderated document management
> > like wikis or CMS.
> > - Doc access is not bound to contribution access and can be handled in a
> > more relaxed way.
> >
> > One thing that supports my consideration is the fact that the official doc
> > site is sparse and contains a lot of TODOs or "Under construction" entries.
> > IMHO Doc vs Source is like userlist vs devlist.
> >
> > Any thoughts?
> >
> > Cheers,
> > Ben
> >
>


Re: Official CS docs

2017-03-01 Thread Stefan Podkowinski
Hi Benjamin

I think the best way to catch up with the motivation behind this is by
reading the following dev post and linked jiras:

https://lists.apache.org/thread.html/029e1273675260630e4973ba301f71a8de5a9d7e294a7b2df6eed65f@%3Cdev.cassandra.apache.org%3E

What are your suggestions to improve the documentation? I think it's
fair to say that the official docs still leave a lot to be desired. But
wikis or any other publishing tools each have their own strengths and
drawbacks. Do you have any example project with a process that we should
follow instead? Did you have a look at the README file in the docs tree
and actually try to add or change any content? What would hold you back
from working from there and submitting a patch?



On 01.03.2017 11:10, benjamin roth wrote:
> Hi guys,
> 
> Is there a reason that the docs are part of the git repo?
> In my personal opinion this is very complicated and it sets the hurdle for
> contributing to docs very high.
> 
> There are so many questions on userlists that repeat over and over again
> and that could be put into a knowledge base.
> 
> But ...
> - Maintaining this in a repo is painful, complicated and slow.
> - I don't like to write docs that I can't preview instantly. I don't want
> to wait for a slow deployment process to see my result.
> - There are tons of solutions for agile and moderated document management
> like wikis or CMS.
> - Doc access is not bound to contribution access and can be handled in a
> more relaxed way.
> 
> One thing that supports my consideration is the fact that the official doc
> site is sparse and contains a lot of TODOs or "Under construction" entries.
> 
> IMHO Doc vs Source is like userlist vs devlist.
> 
> Any thoughts?
> 
> Cheers,
> Ben
> 


Re: What would a pluggable logging implementation look like?

2017-03-01 Thread Romain Hardouin
Hi,
I think you have to look at how authenticator/authorizer/role_manager are
handled, e.g.:
https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L103
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/AllowAllAuthenticator.java

Best,
Romain

On Wednesday, 1 March 2017 at 6:19, Murukesh Mohanan wrote:

I'm looking at CASSANDRA-13001 (pluggable slow query logging / handling). I 
wrote a hacky patch, where my main goal was to touch as few files as possible - 
so I did what I could within MonitoringTask, mostly. However, it seems that I 
completely misunderstood what the feature request was. Jon Haddad noted that 
pluggable means:

> 1. It's going to be java code
> 2. the pluggable thing implements an interface defined in cassandra.
> 3. the class would be compiled and dropped in lib (loaded into classpath 
> automatically)
> 4. The class can be specified in the yaml and is loaded by Class.forName() to 
> pull the interface in
> 
> We would need to convert the current slow query logger into a class of the 
> defined interface and have it be the default if no class is specified in the 
> yaml.

Can someone point me to an existing implementation of this, that I can learn 
from? A previous patch that contributed something similar, perhaps?
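
The four points quoted above can be sketched roughly as follows. This is an
illustration only, not Cassandra's actual code: the interface
SlowQueryHandler, the default LoggingSlowQueryHandler and the PluggableLoader
helper are hypothetical names, and the class name passed to load() stands in
for whatever value would be read from the yaml. Only Class.forName() and the
standard reflection calls are real APIs.

```java
// Hypothetical sketch: load a pluggable implementation by class name,
// falling back to a built-in default when no class is configured.
final class PluggableLoader {
    static final String DEFAULT_HANDLER = LoggingSlowQueryHandler.class.getName();

    /** className is assumed to come from the yaml; null or empty means "use the default". */
    static SlowQueryHandler load(String className) {
        String name = (className == null || className.isEmpty()) ? DEFAULT_HANDLER : className;
        try {
            Class<?> raw = Class.forName(name);
            // asSubclass rejects classes that do not implement the interface.
            return raw.asSubclass(SlowQueryHandler.class)
                      .getDeclaredConstructor()
                      .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load slow query handler: " + name, e);
        }
    }

    public static void main(String[] args) {
        SlowQueryHandler handler = load(args.length > 0 ? args[0] : null);
        System.out.println(handler.handle("SELECT * FROM ks.tbl", 750));
    }
}

/** Hypothetical interface that a pluggable handler would implement. */
interface SlowQueryHandler {
    String handle(String query, long elapsedMillis);
}

/** Hypothetical default, standing in for the current slow query logger. */
final class LoggingSlowQueryHandler implements SlowQueryHandler {
    @Override
    public String handle(String query, long elapsedMillis) {
        return "slow query (" + elapsedMillis + " ms): " + query;
    }
}
```

A custom implementation compiled separately and placed on the classpath
would then be selected by passing its fully qualified class name to load().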


   

Official CS docs

2017-03-01 Thread benjamin roth
Hi guys,

Is there a reason that the docs are part of the git repo?
In my personal opinion this is very complicated and it sets the hurdle for
contributing to docs very high.

There are so many questions on userlists that repeat over and over again
and that could be put into a knowledge base.

But ...
- Maintaining this in a repo is painful, complicated and slow.
- I don't like to write docs that I can't preview instantly. I don't want
to wait for a slow deployment process to see my result.
- There are tons of solutions for agile and moderated document management
like wikis or CMS.
- Doc access is not bound to contribution access and can be handled in a
more relaxed way.

One thing that supports my consideration is the fact that the official doc
site is sparse and contains a lot of TODOs or "Under construction" entries.

IMHO Doc vs Source is like userlist vs devlist.

Any thoughts?

Cheers,
Ben