Re: Reading from big partitions

2018-05-20 Thread onmstester onmstester
Data is spread between an SSD and a 15K RPM disk.

The table has 26 SSTables in total.

I haven't tried tracing, but I will, and I'll let you know!


Sent using Zoho Mail

 On Sun, 20 May 2018 08:26:33 +0430 Jonathan Haddad 
 wrote 

What disks are you using? How many sstables are you hitting? Did you try 
tracing the request?


Re: Reading from big partitions

2018-05-20 Thread onmstester onmstester
I've increased column_index_size_in_kb to 512 and then 4096: no change in
response time; it even got worse.

Even increasing the key cache and row cache sizes did not help.


Sent using Zoho Mail

 On Sun, 20 May 2018 08:52:03 +0430 Jeff Jirsa  
wrote 

Column index size in the yaml (increase it to trade GC pressure for disk IO)
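For reference, this lives in cassandra.yaml; a minimal excerpt (a sketch: 64 is the default, the value shown is only an example, and larger values mean fewer index entries held on heap for a wide partition at the cost of coarser, more IO-heavy seeks):

    column_index_size_in_kb: 256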



If you’re on anything other than 3.11.x, upgrade to 3.11.newest

-- 

Jeff Jirsa


Re: Reading from big partitions

2018-05-20 Thread onmstester onmstester
Should I run compaction after changing column_index_size_in_kb?
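For what it's worth, my understanding is that the setting only takes effect for SSTables as they are written, so existing SSTables keep the old index granularity until they are rewritten. A sketch of one way to force a rewrite without waiting for normal compaction (the keyspace and table names are placeholders):

    nodetool upgradesstables -a my_keyspace samples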


Sent using Zoho Mail

 On Sun, 20 May 2018 15:06:57 +0430 onmstester onmstester 
 wrote 

I've increased column_index_size_in_kb to 512 and then 4096: no change in
response time; it even got worse.

Even increasing the key cache and row cache sizes did not help.



Sent using Zoho Mail

 On Sun, 20 May 2018 08:52:03 +0430 Jeff Jirsa  
wrote 

Column index size in the yaml (increase it to trade GC pressure for disk IO)



If you’re on anything other than 3.11.x, upgrade to 3.11.newest

-- 

Jeff Jirsa


IN clause of prepared statement

2018-05-20 Thread onmstester onmstester
The table is something like:

Samples
...
primary key ((partition, resource), timestamp, metric_name)
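A hedged reconstruction of the schema implied above (the column types and the collection column are assumptions; note the error below only fires when a collection column is part of the selection):

    CREATE TABLE samples (
        partition   text,
        resource    text,
        timestamp   timestamp,
        metric_name text,
        value       map<text, double>,  -- assumed collection column
        PRIMARY KEY ((partition, resource), timestamp, metric_name)
    );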



creating prepared statement :

session.prepare("select * from samples where partition=:partition and 
resource=:resource and timestamp>=:start and timestamp<=:end and 
metric_name in :metric_names")



failed with exception:



can not restrict clustering columns by IN relations when a collection is 
selected by the query



The query is OK using cqlsh. Using column names in the select did not help.

Is there any way to achieve this in Cassandra? I'm aware of the performance
problems of this query, but that does not matter in my case!


I'm using DataStax driver 3.2 and Apache Cassandra 3.11.2.
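One workaround, sketched here with the Python cassandra-driver purely for illustration (the post does not say which driver language is in use; the contact point and keyspace below are assumptions): replace the IN relation with one equality query per metric_name and fan them out asynchronously.

    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])    # assumption: contact point
    session = cluster.connect('my_ks')  # assumption: keyspace name

    # One equality restriction per metric_name sidesteps the IN limitation.
    ps = session.prepare(
        "select * from samples where partition=? and resource=? "
        "and timestamp>=? and timestamp<=? and metric_name=?")

    def fetch(partition, resource, start, end, metric_names):
        # Fan out one async query per metric, then gather the rows.
        futures = [session.execute_async(ps, (partition, resource, start, end, m))
                   for m in metric_names]
        rows = []
        for f in futures:
            rows.extend(f.result())
        return rows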
Sent using Zoho Mail


Re: Question About Reaper

2018-05-20 Thread Abdul Patel
Hi,

I recently tested Reaper and it actually helped us a lot. Even with our
small footprint of 18 nodes, Reaper takes close to 6 hrs (I was able to
tune it by 50%). But it really depends on the number of nodes: for
example, if you have 4 nodes then it runs 4*256 = 1024 segments, so for
your env it will be 256*144, close to 36k segments. Better to test on a
POC box how much time it takes and then proceed further. I have tested so
far in 1 DC only; we can actually have a separate Reaper instance handling
each DC, but I haven't tested that yet.
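A quick sketch of that back-of-envelope segment math (assuming, as the numbers above imply, roughly one repair segment per vnode per node):

    nodes, vnodes = 144, 256
    print(nodes * vnodes)  # 36864, i.e. close to 36k segments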

On Sunday, May 20, 2018, Surbhi Gupta  wrote:

> Hi,
>
> We have a cluster with 144 nodes (3 datacenters) with 256 vnodes.
> When we tried to start repairs from OpsCenter, it showed 1.9 million
> ranges to repair.
> And even after setting compaction and stream throughput to 0, OpsCenter is
> not able to help us much to finish repair in a 9-day timeframe.
>
> What is your thought on Reaper?
> Do you think Reaper might be able to help us in this scenario?
>
> Thanks
> Surbhi
>


Re: Question About Reaper

2018-05-20 Thread Jonathan Haddad
FWIW the largest deployment I know about is a single reaper instance
managing 50 clusters and over 2000 nodes.

There might be bigger, but I either don’t know about it or can’t remember.

On Sun, May 20, 2018 at 10:04 AM Abdul Patel  wrote:

> Hi,
>
> I recently tested Reaper and it actually helped us a lot. Even with our
> small footprint of 18 nodes, Reaper takes close to 6 hrs (I was able to
> tune it by 50%). But it really depends on the number of nodes: for
> example, if you have 4 nodes then it runs 4*256 = 1024 segments, so for
> your env it will be 256*144, close to 36k segments. Better to test on a
> POC box how much time it takes and then proceed further. I have tested so
> far in 1 DC only; we can actually have a separate Reaper instance handling
> each DC, but I haven't tested that yet.
>
>
> On Sunday, May 20, 2018, Surbhi Gupta  wrote:
>
>> Hi,
>>
>> We have a cluster with 144 nodes (3 datacenters) with 256 vnodes.
>> When we tried to start repairs from OpsCenter, it showed 1.9 million
>> ranges to repair.
>> And even after setting compaction and stream throughput to 0, OpsCenter is
>> not able to help us much to finish repair in a 9-day timeframe.
>>
>> What is your thought on Reaper?
>> Do you think Reaper might be able to help us in this scenario?
>>
>> Thanks
>> Surbhi
>>
> --
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade


RE: dtests failing with - ValueError: unsupported hash type md5

2018-05-20 Thread Rajiv Dimri
This issue is resolved:

 

Since I only had Python 3.6 installed on my machine, cqlsh was referring to some
default location when it tried to call Python 2.7 libraries.

Obviously these libraries did not have all the dependencies installed for cqlsh.

 

In order to make a valid environment for cqlsh to operate, I had to install
Python 2.7 along with the dependent packages (cqlsh and cassandra-driver) locally
on the server.

Basically two Python versions at the same time.

 

Once this was done, I had to place the python2.7 path behind the python3.6 path
before pytest is initiated:

export PATH=$PYTHON3_INSTALL_PATH/bin:$PYTHON27_INSTALL_PATH/bin:$PATH
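A quick sanity check that the right interpreters now resolve first (a sketch; the expected locations are assumptions based on the export above):

    import shutil
    print(shutil.which('python3'))    # should resolve under $PYTHON3_INSTALL_PATH
    print(shutil.which('python2.7'))  # should still resolve, under $PYTHON27_INSTALL_PATH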

 

This seems to have resolved the issue.

 

From: Rajiv Dimri 
Sent: Thursday, May 10, 2018 11:52 AM
To: user@cassandra.apache.org
Subject: RE: dtests failing with - ValueError: unsupported hash type md5

 

Thank you for the response,

 

Single test command

pytest --cassandra-dir=$CASSANDRA_HOME 
cql_tracing_test.py::TestCqlTracing::test_tracing_simple

 

pytest is being run from within the virtual env (Python 3.6.5);

however, cqlsh is part of the Cassandra distribution, present in $CASSANDRA_HOME/bin.

Even if I install cqlsh in the virtualenv, node.py in ccmlib will pick the cqlsh
present in the Cassandra source directory.

 

From: kurt greaves <k...@instaclustr.com>
Sent: Thursday, May 10, 2018 11:37 AM
To: User <user@cassandra.apache.org>
Subject: Re: dtests failing with - ValueError: unsupported hash type md5

 

What command did you run? Probably worth checking that cqlsh is installed in 
the virtual environment and that you are executing pytest from within the 
virtual env.

 

On 10 May 2018 at 05:06, Rajiv Dimri <rajiv.di...@oracle.com> wrote:

Hi All,

 

We have set up a dtest environment to run against Cassandra versions 3.11.1
and 3.0.5.

As per the instructions on https://github.com/apache/cassandra-dtest we have set
up the environment with Python 3.6.5 along with other dependencies.

The server used is Oracle RHEL (Red Hat Enterprise Linux Server release 6.6 
(Santiago))

 

During the run, multiple tests are failing with the specific error mentioned below.

 

process = , cmd_args = ['cqlsh', 
'TRACING ON', None]

 

    def handle_external_tool_process(process, cmd_args):

    out, err = process.communicate()

    rc = process.returncode

 

    if rc != 0:

>   raise ToolError(cmd_args, rc, out, err)

E   ccmlib.node.ToolError: Subprocess ['cqlsh', 'TRACING ON', None] 
exited with non-zero status; exit status: 1;

E   stderr: ERROR:root:code for hash md5 was not found.

E   Traceback (most recent call last):

E File 
"/ade_autofs/ade_infra/nfsdo_linux.x64/PYTHON/2.7.8/LINUX.X64/141106.0120/python/lib/python2.7/hashlib.py",
 line 139, in 

E   globals()[__func_name] = __get_hash(__func_name)

E File 
"/ade_autofs/ade_infra/nfsdo_linux.x64/PYTHON/2.7.8/LINUX.X64/141106.0120/python/lib/python2.7/hashlib.py",
 line 91, in __get_builtin_constructor

E   raise ValueError('unsupported hash type ' + name)

E   ValueError: unsupported hash type md5

E   ERROR:root:code for hash sha1 was not found.

E   Traceback (most recent call last):

E File 
"/ade_autofs/ade_infra/nfsdo_linux.x64/PYTHON/2.7.8/LINUX.X64/141106.0120/python/lib/python2.7/hashlib.py",
 line 139, in 

E   globals()[__func_name] = __get_hash(__func_name)

E File 
"/ade_autofs/ade_infra/nfsdo_linux.x64/PYTHON/2.7.8/LINUX.X64/141106.0120/python/lib/python2.7/hashlib.py",
 line 91, in __get_builtin_constructor

E   raise ValueError('unsupported hash type ' + name)

E   ValueError: unsupported hash type sha1

E   ERROR:root:code for hash sha224 was not found.

E   Traceback (most recent call last):

E File 
"/ade_autofs/ade_infra/nfsdo_linux.x64/PYTHON/2.7.8/LINUX.X64/141106.0120/python/lib/python2.7/hashlib.py",
 line 139, in 

E   globals()[__func_name] = __get_hash(__func_name)

E File 
"/ade_autofs/ade_infra/nfsdo_linux.x64/PYTHON/2.7.8/LINUX.X64/141106.0120/python/lib/python2.7/hashlib.py",
 line 91, in __get_builtin_constructor

E   raise ValueError('unsupported hash type ' + name)

E   ValueError: unsupported hash type sha224

E   ERROR:root:code for hash sha256 was not found.

E   Traceback (most recent call last):

E File 
"/ade_autofs/ade_infra/nfsdo_linux.x64/PYTHON/2.7.8/LINUX.X64/141106.0120/python/lib/python2.7/hashlib.py",
 line 139, in 

E 

Resource Intensive and Upgrade tests (Dtest Cassandra)

2018-05-20 Thread Rajiv Dimri
Hi All,

 

We have set up a dtest environment to run against Cassandra versions 3.11.1
and 3.0.5.

During the test run we see that tests in the "resource intensive" and "upgrade"
categories are skipped by default during the collection phase.

 

And this is a huge number (1100+ out of the 1900+ tests written), more than 50%.

 

If we want to enable these tests, we have to use the
'--force-resource-intensive-tests' and '--execute-upgrade-tests' options.
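For example, a single invocation enabling both categories (combining those flags with the usual --cassandra-dir option):

    pytest --cassandra-dir=$CASSANDRA_HOME --force-resource-intensive-tests --execute-upgrade-tests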

 

My questions are as below:

Do you recommend turning on these tests using the options above?

Are there any special cases/scenarios when these tests need to be run?

What should be the minimum system configuration for the resource-intensive
tests to run without issues?

 

Regards,

Rajiv 

 


Re: Question About Reaper

2018-05-20 Thread Surbhi Gupta
Thanks a lot for your inputs,
Abdul, how did you tune Reaper?

On Sun, May 20, 2018 at 10:10 AM Jonathan Haddad  wrote:

> FWIW the largest deployment I know about is a single reaper instance
> managing 50 clusters and over 2000 nodes.
>
> There might be bigger, but I either don’t know about it or can’t remember.
>
> On Sun, May 20, 2018 at 10:04 AM Abdul Patel  wrote:
>
>> Hi,
>>
>> I recently tested Reaper and it actually helped us a lot. Even with our
>> small footprint of 18 nodes, Reaper takes close to 6 hrs (I was able to
>> tune it by 50%). But it really depends on the number of nodes: for
>> example, if you have 4 nodes then it runs 4*256 = 1024 segments, so for
>> your env it will be 256*144, close to 36k segments. Better to test on a
>> POC box how much time it takes and then proceed further. I have tested so
>> far in 1 DC only; we can actually have a separate Reaper instance handling
>> each DC, but I haven't tested that yet.
>>
>>
>> On Sunday, May 20, 2018, Surbhi Gupta  wrote:
>>
>>> Hi,
>>>
>>> We have a cluster with 144 nodes (3 datacenters) with 256 vnodes.
>>> When we tried to start repairs from OpsCenter, it showed 1.9 million
>>> ranges to repair.
>>> And even after setting compaction and stream throughput to 0, OpsCenter
>>> is not able to help us much to finish repair in a 9-day timeframe.
>>>
>>> What is your thought on Reaper?
>>> Do you think Reaper might be able to help us in this scenario?
>>>
>>> Thanks
>>> Surbhi
>>>
>>>
>>> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>
>
>