Re: CQL Collections appear slow

2016-08-31 Thread Ben Frank
Thanks Tyler!
I wasn't aware of frozen collections - the tracing shows pretty similar
timing characteristics between frozen collection and binary schemas.
Interestingly it's still dog slow while (presumably) doing the
deserialization in python, so although the trace reports good results it's
still taking ~3 seconds of wall clock time to load the data into python.
Anyway - thanks for the answer, really appreciate it.

-Ben
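
One way to check where the client-side wall clock time goes is to profile the decoding step in isolation. The sketch below is only an illustration: `decode_rows` is a hypothetical stand-in that uses `json` from the standard library, not the DataStax driver or message pack, and the payloads are synthetic data shaped loosely like the numbers in this thread (85 entries per map, scaled down to 1000 rows).

```python
import cProfile
import io
import json
import pstats
import time


def decode_rows(payloads):
    """Hypothetical stand-in for client-side deserialization of each row."""
    return [json.loads(p) for p in payloads]


def profile_call(func, *args):
    """Run func under cProfile; return (result, wall_seconds, report_text)."""
    profiler = cProfile.Profile()
    start = time.perf_counter()
    profiler.enable()
    result = func(*args)
    profiler.disable()
    wall = time.perf_counter() - start
    out = io.StringIO()
    # Show the ten entries with the highest cumulative time.
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
    return result, wall, out.getvalue()


# Synthetic payloads: 1000 rows, each a serialized map with 85 entries.
payloads = [json.dumps({str(k): float(k) for k in range(85)}) for _ in range(1000)]
rows, wall, report = profile_call(decode_rows, payloads)
print(f"decoded {len(rows)} rows in {wall:.3f}s")
print(report)
```

Swapping the body of `decode_rows` for the real query-and-iterate code would show whether the time is spent in the driver's deserialization or elsewhere.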

On Wed, Aug 31, 2016 at 8:55 AM, Tyler Hobbs  wrote:

> The map version of the schema needs to deserialize, serialize, and then
> deserialize about 85 times more cells, if your average map has 85
> elements.  I would assume that's where most of the performance slowdown is
> coming from.  If you can take the time to run that through a profiler, that
> would be useful to see if there is some unexpected inefficiency.
>
> I'll also point out that you could use a frozen map (e.g.
> frozen<map<..., float>>) and you'd probably get performance that's somewhere
> in the middle of the other two approaches.
>
> On Tue, Aug 30, 2016 at 8:00 PM, Ben Frank  wrote:
>
> > Hi all, I posted this question on stackoverflow - I'm having an issue with
> > CQL collections, anyone got any insight here?
> >
> > (http://stackoverflow.com/questions/39218180/cql-collections-appear-slow)
> >
> > I'm playing around with storing data in cassandra and I'm finding a
> > significant performance problem with CQL collections. I started with this
> > schema:
> >
> > CREATE TABLE TEST (
> >   date DATE,
> >   tranche TEXT,
> >   id INT,
> >   properties MAP,
> >   PRIMARY KEY ((date,tranche), id))
> >
> > if I run a query for all data in this partition
> >
> > SELECT * FROM TEST where date = '2016-08-26' and tranche = 'third'
> >
> > tracing reports it takes ~1.3 seconds to load 15K rows. There are about 85
> > entries in the map. Wall clock time from python is ~5 seconds. This seems
> > really slow to load just one 'partition'
> >
> > So I tried this schema instead and used message pack to store the entire
> > map in a single cell
> >
> > CREATE TABLE TEST (
> >   date DATE,
> >   tranche TEXT,
> >   id INT,
> >   properties blob,
> >   PRIMARY KEY ((date,tranche), id))
> >
> > Now the same query takes ~60ms (as reported by tracing) and ~500ms wall
> > clock time (again using python)
> >
> > I get that there's more to do with the MAP version, but this seems like an
> > unexpected performance degradation.
> >
> > One oddity I noticed while testing this was that in both cases tracing
> > reported it was returning 15K cells (which corresponds to the number of
> > rows). I'd expect this in the second schema, but my understanding was that
> > each element in a map was stored in its own cell in current versions of
> > cassandra, so a bit surprised by this.
> >
> > I'm using version 3.7 of cassandra and the datastax python drivers. Anyone
> > got any insight into what's happening here?
> >
> > -Ben
> >
>
>
>
> --
> Tyler Hobbs
> DataStax 
>


Re: Help requested: o.a.c.cql3.validation unit tests periodically hang

2016-08-31 Thread Michael Shuler
Another couple of ABORT examples have presented themselves tonight, one
of which has logs.

Usually we'll see unit tests finish similar to:

01:57:39 [junit] Testsuite: org.apache.cassandra.cql3.statements.PropertyDefinitionsTest
01:57:39 [junit] Testsuite: org.apache.cassandra.cql3.statements.PropertyDefinitionsTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.362 sec

This trunk_testall job has 4 different tests, none of which finished
with the "Tests run:..." output.
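
A rough way to spot such suites is to scan the console text for "Testsuite:" lines that never get a matching "Tests run:" summary. The sketch below assumes each console entry is on a single unwrapped line in the format shown above; the function name `unfinished_suites` and the third sample line (including its timestamp) are made up for illustration and are not copied from the job output.

```python
import re

# "[junit] Testsuite: <class>" with an optional trailing "Tests run:" summary.
SUITE_RE = re.compile(r"\[junit\] Testsuite: (\S+)(\s+Tests run:)?")


def unfinished_suites(console_text):
    """Return suites that logged a start line but no 'Tests run:' summary."""
    started, finished = [], set()
    for line in console_text.splitlines():
        match = SUITE_RE.search(line)
        if not match:
            continue
        name = match.group(1)
        if match.group(2):
            finished.add(name)
        elif name not in started:
            started.append(name)
    return [name for name in started if name not in finished]


# Illustrative console excerpt; the last line is hypothetical.
sample = "\n".join([
    "01:57:39 [junit] Testsuite: org.apache.cassandra.cql3.statements.PropertyDefinitionsTest",
    "01:57:39 [junit] Testsuite: org.apache.cassandra.cql3.statements.PropertyDefinitionsTest"
    " Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.362 sec",
    "01:57:40 [junit] Testsuite: org.apache.cassandra.cql3.validation.entities.CollectionsTest",
])
hung = unfinished_suites(sample)
print(hung)
```

With four runners in parallel this can list several candidates, so it narrows the search rather than naming the culprit.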

http://cassci.datastax.com/job/trunk_testall/1158/console

Those log files are:

TEST-org.apache.cassandra.cql3.validation.entities.CountersTest.log
TEST-org.apache.cassandra.cql3.ViewTest.log
TEST-org.apache.cassandra.cql3.ViewFilteringTest.log
TEST-org.apache.cassandra.cql3.validation.entities.CollectionsTest.log

CountersTest appears to have never really started running. ViewTest and
ViewFilteringTest both logged shutdown entries. The CollectionsTest log
shows a leak error at the end.

ERROR [Strong-Reference-Leak-Detector:1] 2016-09-01 01:58:43,170 Strong
self-ref loop detected..

Logs are here:
http://cassci.datastax.com/job/trunk_testall/1158/artifact/jenkins-trunk_testall-1158_logs.tar.gz

These are completely different concurrently running tests than the last
ABORT I posted from trunk, which is why I'm asking for help getting to
the bottom of this. I have yet to find a rhyme or reason to these job halts.

The other ABORT was
http://cassci.datastax.com/job/cassandra-2.2_testall/578/console

This appears to have hung on the cql3.DropKeyspaceCommitLogRecycleTest
long-test, and failed to fetch logs.

-- 
Kind regards,
Michael


Failing tests 2016-08-31

2016-08-31 Thread Joel Knighton
cassandra-3.9
===
testall: All passed!

===
dtest: 2 failures
  repair_tests.repair_test.TestRepair.nonexistent_table_repair_test
CASSANDRA-12578. New failure, looks like a test problem.
  cql_tracing_test.TestCqlTracing.tracing_default_impl_test
CASSANDRA-12579. New failure, looks like a test problem.

===
novnode: 2 failures - same as vnode dtests above

===
upgrade: 1 failure
  upgrade_tests.paging_test
  .TestPagingDataNodes2RF1_Upgrade_current_2_2_x_To_indev_3_x
  .static_columns_paging_test
  CASSANDRA-11195. Work is under way to understand this failure.


trunk
===
testall: All passed!

===
dtest: All passed!

===
novnode: 1 failure
  cdc_test.TestCDC.test_cdc_data_available_in_cdc_raw
CASSANDRA-11811.  Known issue under investigation.

===
upgrade:
  Failed due to environmental problems that are under investigation.


ASF Infra
  No significant updates. 13 failures in today's trunk testall run, all due
  to timeouts or port binding issues.


Re: CQL Collections appear slow

2016-08-31 Thread Tyler Hobbs
The map version of the schema needs to deserialize, serialize, and then
deserialize about 85 times more cells, if your average map has 85
elements.  I would assume that's where most of the performance slowdown is
coming from.  If you can take the time to run that through a profiler, that
would be useful to see if there is some unexpected inefficiency.

I'll also point out that you could use a frozen map (e.g.
frozen<map<..., float>>) and you'd probably get performance that's somewhere
in the middle of the other two approaches.
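
To put rough numbers on the "85 times more cells" estimate, here is a small back-of-the-envelope calculation. It uses the figures quoted in this thread (15K rows, about 85 map entries on average) and assumes one cell per element for a non-frozen map and one cell per row for a blob or frozen map; the actual on-disk layout may differ.

```python
# Figures quoted in the thread; the per-row map size is an average.
rows = 15_000
map_entries_per_row = 85

# Assumption: a non-frozen map stores each element as its own cell.
cells_map = rows * map_entries_per_row

# Assumption: a blob or frozen map is a single cell per row.
cells_single = rows

print(cells_map, cells_single, cells_map // cells_single)
# prints: 1275000 15000 85
```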

On Tue, Aug 30, 2016 at 8:00 PM, Ben Frank  wrote:

> Hi all, I posted this question on stackoverflow - I'm having an issue with
> CQL collections, anyone got any insight here?
>
> (http://stackoverflow.com/questions/39218180/cql-collections-appear-slow)
>
> I'm playing around with storing data in cassandra and I'm finding a
> significant performance problem with CQL collections. I started with this
> schema:
>
> CREATE TABLE TEST (
>   date DATE,
>   tranche TEXT,
>   id INT,
>   properties MAP,
>   PRIMARY KEY ((date,tranche), id))
>
> if I run a query for all data in this partition
>
> SELECT * FROM TEST where date = '2016-08-26' and tranche = 'third'
>
> tracing reports it takes ~1.3 seconds to load 15K rows. There are about 85
> entries in the map. Wall clock time from python is ~5 seconds. This seems
> really slow to load just one 'partition'
>
> So I tried this schema instead and used message pack to store the entire
> map in a single cell
>
> CREATE TABLE TEST (
>   date DATE,
>   tranche TEXT,
>   id INT,
>   properties blob,
>   PRIMARY KEY ((date,tranche), id))
>
> Now the same query takes ~60ms (as reported by tracing) and ~500ms wall
> clock time (again using python)
>
> I get that there's more to do with the MAP version, but this seems like an
> unexpected performance degradation.
>
> One oddity I noticed while testing this was that in both cases tracing
> reported it was returning 15K cells (which corresponds to the number of
> rows). I'd expect this in the second schema, but my understanding was that
> each element in a map was stored in its own cell in current versions of
> cassandra, so a bit surprised by this.
>
> I'm using version 3.7 of cassandra and the datastax python drivers. Anyone
> got any insight into what's happening here?
>
> -Ben
>



-- 
Tyler Hobbs
DataStax 


Help requested: o.a.c.cql3.validation unit tests periodically hang

2016-08-31 Thread Michael Shuler
Jenkins jobs in ABORTED status are bad. A lot of times they tend to be
ignored/re-run to get completed results, which is OK only if the abort
is due to server setup or configuration problems. There's a relatively
recent pattern of o.a.c.cql3.validation tests hanging up jobs, and I've
been unable to pin it down to a single test that causes the problem.
Manually looping over the test class (multiple attempts over days,
sometimes) has resulted in no reproduction.

Identification of which actual test hung the job can be difficult. If we
look at the test trend graph or drill down the past runs of
trunk_testall, there are multiple job aborts, and it appears to be in
the cql3.validation tests. Here's an example:

http://cassci.datastax.com/job/trunk_testall/1153/console

At 06:08:10, we see "Build timed out (after 20 minutes)." Since we run
with `-Dtest.runners=4`, keep in mind that there are 4 tests running in
parallel, so that last test before it aborts? (SSTablesIteratedTest? or
the other 2 listed right there above that?) ..nope, those are not the
tests you're looking for.. (snippet of log attached for posterity).

Usually, the best way to determine which test hung things up is to grab
the log.tar.gz and look at the most recent log files (included in
snippet). It looks to me like SSTablesIteratedTest,
SSTableMetadataTrackingTest, and RoleSyntaxTest completed.
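
For anyone who prefers a script to `ls -lt`, the "look at the most recent log files" step could be sketched as below. This is only an illustration: `newest_logs` is a hypothetical helper, and the demo uses made-up file names and modification times in a temporary directory rather than the real log archive.

```python
import os
import tempfile
from pathlib import Path


def newest_logs(log_dir, limit=3):
    """Return names of TEST-*.log files in log_dir, most recently modified first."""
    logs = sorted(
        Path(log_dir).glob("TEST-*.log"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    return [p.name for p in logs[:limit]]


# Self-contained demo with made-up file names and modification times.
with tempfile.TemporaryDirectory() as tmp:
    for name, mtime in [("TEST-A.log", 100), ("TEST-B.log", 300), ("TEST-C.log", 200)]:
        path = Path(tmp, name)
        path.write_text("")
        os.utime(path, (mtime, mtime))
    recent = newest_logs(tmp)

print(recent)
```

The newest logs only show what was active around the abort; a hung suite may be a slightly older log that never recorded a completion, so each candidate still needs to be read.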

It appears to me, and I still might not even be correct here, that the
test that hung us up was cql3.validation.entities.UFTest, which ends up
buried quite a few lines up (started at 05:47:16) from the job abort, so
is difficult to spot just reading the console.

This is the logs.tar.gz for this job, if you want to grab it and help
figure out why cql3 tests hang us up from time to time (will be deleted
after a time):

http://cassci.datastax.com/job/trunk_testall/1153/artifact/jenkins-trunk_testall-1153_logs.tar.gz

The help I'm looking for is: 1) why are we getting cql3 test hangs? So
far they have been unreproducible, and we need to fix them; 2) are there
perhaps any better ways to log success in a test run? Generally, I need
to dig into tests to see what they are doing, since logged tracebacks,
etc. may actually be intended, so finding a trace in a test log adds
time and may be an empty rabbit hole; and I suppose 3) verifying whether
UFTest was indeed the one that hung the job - I think this was the right
test :)

Thanks for looking!
Michael
mshuler@hana:~/tmp/logs$ ls -lRt
.:
total 2776
drwxr-xr-x 2 mshuler mshuler   12288 Aug 31 00:48 logs
-rw-r--r-- 1 mshuler mshuler 2825465 Aug 31 08:24 jenkins-trunk_testall-1153_logs.tar.gz

./logs:
total 31928
-rw-r--r-- 1 mshuler mshuler   63447 Aug 31 00:52 TEST-org.apache.cassandra.cql3.validation.miscellaneous.SSTablesIteratedTest.log
-rw-r--r-- 1 mshuler mshuler   88824 Aug 31 00:52 TEST-org.apache.cassandra.cql3.validation.miscellaneous.SSTableMetadataTrackingTest.log
-rw-r--r-- 1 mshuler mshuler  167138 Aug 31 00:52 TEST-org.apache.cassandra.cql3.validation.miscellaneous.RoleSyntaxTest.log
-rw-r--r-- 1 mshuler mshuler 2466867 Aug 31 00:51 TEST-org.apache.cassandra.cql3.validation.entities.UFTest.log
-rw-r--r-- 1 mshuler mshuler  123861 Aug 31 00:48 TEST-org.apache.cassandra.cql3.validation.miscellaneous.PgStringTest.log
-rw-r--r-- 1 mshuler mshuler  599070 Aug 31 00:48 TEST-org.apache.cassandra.cql3.validation.miscellaneous.OverflowTest.log
-rw-r--r-- 1 mshuler mshuler 1617350 Aug 31 00:48 TEST-org.apache.cassandra.cql3.validation.entities.UserTypesTest.log
-rw-r--r-- 1 mshuler mshuler  355408 Aug 31 00:47 TEST-org.apache.cassandra.cql3.validation.miscellaneous.CrcCheckChanceTest.log
-rw-r--r-- 1 mshuler mshuler  431485 Aug 31 00:47 TEST-org.apache.cassandra.cql3.validation.entities.UFVerifierTest.log
<...>


===


<...>
05:46:01 [junit]
05:46:01 [delete] Deleting directory /home/automaton/cassandra/build/test/cassandra/commitlog:53
05:46:01 [delete] Deleting directory /home/automaton/cassandra/build/test/cassandra/data:53
05:46:01 [junit] Testsuite: org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest
05:46:02 [delete] Deleting directory /home/automaton/cassandra/build/test/cassandra/saved_caches:53
05:46:02 [delete] Deleting directory /home/automaton/cassandra/build/test/cassandra/hints:53
05:46:02 [junit] WARNING: multiple versions of ant detected in path for junit
05:46:02 [junit]  jar:file:/usr/share/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
05:46:02 [junit]  and jar:file:/home/automaton/cassandra/build/lib/jars/ant-1.9.4.jar!/org/apache/tools/ant/Project.class
05:46:03 [junit] Testsuite: org.apache.cassandra.cql3.validation.entities.JsonTest Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 25.948 sec
05:46:03 [junit]
05:46:03 [delete] Deleting directory

Re: #cassandra-dev IRC logging

2016-08-31 Thread Jake Farrell
ASFBot is now active and logging in #cassandra-dev to
http://wilderness.apache.org/channels/#logs-#cassandra-dev

-Jake

On Tue, Aug 30, 2016 at 4:21 PM, Jake Farrell  wrote:

> just #cassandra-dev
>
> On Tue, Aug 30, 2016 at 4:17 PM, Jeremiah D Jordan <
> jeremiah.jor...@gmail.com> wrote:
>
>> Also just to make sure, this is logging for #cassandra-dev not #cassandra
>> right?
>>
>> -Jeremiah
>>
>> On Aug 30, 2016, at 3:11 PM, Jeff Jirsa 
>> wrote:
>>
>> http://wilderness.apache.org/channels/
>>
>>
>> On 8/30/16, 1:04 PM, "Jonathan Ellis"  wrote:
>>
>> What is the process to access asfbot logs?
>>
>> On Tue, Aug 30, 2016 at 3:03 PM, Jake Farrell 
>> wrote:
>>
>> If there are no objections then, I am going to enable ASFBot and logging in
>> #cassandra on freenode
>>
>> -Jake
>>
>> On Fri, Aug 26, 2016 at 6:59 PM, Dave Brosius 
>> wrote:
>>
>> If you wish to unsubscribe, send an email to
>>
>> mailto://dev-unsubscr...@cassandra.apache.org
>> 
>>
>>
>> On 08/26/2016 04:49 PM, Gvb Subrahmanyam wrote:
>>
>> Please remove me from - dev@cassandra.apache.org
>>
>> -Original Message-
>> From: Jake Farrell [mailto:jfarr...@apache.org]
>> Sent: Friday, August 26, 2016 4:36 PM
>> To: dev@cassandra.apache.org
>> Subject: Re: #cassandra-dev IRC logging
>>
>> asfbot can log to wilderness for backup, but it does not send out digests.
>> I've seen a couple of projects starting to test out and use slack/hipchat
>> and then use sameroom to connect irc so conversations are not separated and
>> people can use their favorite client of choice
>>
>> -Jake
>>
>> On Fri, Aug 26, 2016 at 4:20 PM, Edward Capriolo wrote:
>>
>> Yes. I did. My bad.
>>
>>
>> On Fri, Aug 26, 2016 at 4:07 PM, Jason Brown 
>> wrote:
>>
>> Ed, did you mean to post this to the other active thread today,
>> the one about github pull requests? (just want to make sure I'm
>> understanding correctly :) )
>>
>> On Fri, Aug 26, 2016 at 12:28 PM, Edward Capriolo wrote:
>>
>> One thing to watch out for. The way apache-gossip is set up, the PR's get
>> sent to the dev list. However the address is not part of the list so the
>> project owners get an email asking to approve/reject every PR and comment
>> on the PR.
>>
>> This is ok because we have a small quiet group but you probably do not want
>> that with the number of SCM changes in the cassandra project.
>>
>> On Fri, Aug 26, 2016 at 3:05 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>> wrote:
>>
>>
>> +1 to both as well
>>
>>
>> On 8/26/16, 11:59 AM, "Tyler Hobbs"  wrote:
>>
>> +1 on doing this and using ASFBot in particular.
>>
>>
>> On Fri, Aug 26, 2016 at 1:40 PM, Jason Brown wrote:
>>
>> @Dave ASFBot looks like a winner. If others are on board with this, I can
>> work on getting it up and going.
>>
>> On Fri, Aug 26, 2016 at 11:27 AM, Dave Lester <dave_les...@apple.com>
>> wrote:
>>
>> +1. Check out ASFBot for logging IRC, along with other integrations.[1]
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder, http://www.datastax.com
>> @spyced
>>
>>
>>
>