[jira] [Commented] (CASSANDRA-4175) Reduce memory (and disk) space requirements with a column name/id map

2012-04-19 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257990#comment-13257990
 ] 

T Jake Luciani commented on CASSANDRA-4175:
---

Can't you use String.hashCode? It's portable.
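
A minimal sketch of that suggestion, assuming column names are plain Java 
strings. The Java API specifies String.hashCode as 
s[0]*31^(n-1) + ... + s[n-1], so every JVM produces the same 32-bit value for 
the same name. The class and method names below are hypothetical. Note the 
caveat it also shows: distinct names can hash to the same value, so a hash 
alone is not a unique id.

```java
public class ColumnNameHash {
    // Hypothetical helper: derive a 32-bit column id from the name alone.
    static int columnId(String columnName) {
        return columnName.hashCode();
    }

    public static void main(String[] args) {
        // Same name, same id on any JVM, because the formula is part of the spec.
        System.out.println(columnId("abc"));
        // Distinct names may share an id ("Aa" and "BB" collide).
        System.out.println(columnId("Aa") == columnId("BB"));
    }
}
```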

> Reduce memory (and disk) space requirements with a column name/id map
> -
>
> Key: CASSANDRA-4175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4175
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
> Fix For: 1.2
>
>
> We spend a lot of memory on column names, both transiently (during reads) and 
> more permanently (in the row cache).  Compression mitigates this on disk but 
> not on the heap.
> The overhead is significant for typical small column values, e.g., ints.
> Even though we intern once we get to the memtable, this affects writes too 
> via very high allocation rates in the young generation, hence more GC 
> activity.
> Now that CQL3 provides us some guarantees that column names must be defined 
> before they are inserted, we could create a map of (say) 32-bit int column 
> id, to names, and use that internally right up until we return a resultset to 
> the client.
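
The proposal above could be sketched as a two-way dictionary. This is only an 
illustration under assumed names (ColumnIdMap, define, nameOf are 
hypothetical, not Cassandra classes): ids are handed out sequentially when a 
column is defined, and the name is looked up again only when a result is 
built for the client.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ColumnIdMap {
    private final Map<String, Integer> idsByName = new HashMap<>();
    private final List<String> namesById = new ArrayList<>();

    // Called when a column is defined; returns the existing id if already known.
    public synchronized int define(String name) {
        Integer existing = idsByName.get(name);
        if (existing != null) {
            return existing;
        }
        int id = namesById.size();
        namesById.add(name);
        idsByName.put(name, id);
        return id;
    }

    // Translate an id back to its name, e.g. when building the client result.
    public synchronized String nameOf(int id) {
        return namesById.get(id);
    }
}
```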

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4147) cqlsh doesn't accept NULL as valid input

2012-04-13 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253456#comment-13253456
 ] 

T Jake Luciani commented on CASSANDRA-4147:
---

It's really syntactic sugar for people coming from SQL land. Internally it 
would just ignore it.

> cqlsh doesn't accept NULL as valid input
> 
>
> Key: CASSANDRA-4147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4147
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.0.8
>Reporter: T Jake Luciani
>Assignee: paul cannon
>Priority: Minor
> Fix For: 1.0.10
>
>
> cqlsh:cfs> insert into foo (key,val1,val2)values('row2',NULL,NULL);
> Bad Request: unable to make long from 'NULL'





[jira] [Commented] (CASSANDRA-3647) Support arbitrarily nested "documents" in CQL

2012-04-10 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250687#comment-13250687
 ] 

T Jake Luciani commented on CASSANDRA-3647:
---

Also, Hive supports complex types that we could model this after: 
https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-ComplexTypes

> Support arbitrarily nested "documents" in CQL
> -
>
> Key: CASSANDRA-3647
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3647
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>  Labels: cql
>
> Composite columns introduce the ability to have arbitrarily nested data in a 
> Cassandra row.  We should expose this through CQL.





[jira] [Commented] (CASSANDRA-4131) Integrate Hive support to be in core cassandra

2012-04-09 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249961#comment-13249961
 ] 

T Jake Luciani commented on CASSANDRA-4131:
---

The latest code is https://github.com/riptano/hive/tree/hive-0.8.1-merge

The Cassandra version should be trunk (1.1), since it uses the same version of 
Thrift as Hive 0.7.0.

The only thing I want to do is put the CassandraProxyClient code into the main 
Cassandra tree and use that for Hadoop calls, since it's much more reliable for 
us.  The Hive driver currently depends on its own version of that class.



> Integrate Hive support to be in core cassandra
> --
>
> Key: CASSANDRA-4131
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4131
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeremy Hanna
>Assignee: Edward Capriolo
>  Labels: hadoop, hive
>
> The standalone hive support (at https://github.com/riptano/hive) would be 
> great to have in-tree so that people don't have to go out to github to 
> download it and wonder if it's a left-for-dead external shim.





[jira] [Commented] (CASSANDRA-4118) ConcurrentModificationException in ColumnFamily.updateDigest(ColumnFamily.java:294) (cassandra 1.0.8)

2012-04-09 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249950#comment-13249950
 ] 

T Jake Luciani commented on CASSANDRA-4118:
---

Solandra on top of 1.0.8 doesn't really work because it breaks the custom 
partitioner.

It could be a Solandra bug, though nothing comes to mind that would cause this.


> ConcurrentModificationException in 
> ColumnFamily.updateDigest(ColumnFamily.java:294)  (cassandra 1.0.8)
> --
>
> Key: CASSANDRA-4118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4118
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.0.8
> Environment: two nodes, replication factor=2
>Reporter: Zklanu Ryś
>Assignee: Vijay
> Fix For: 1.0.10, 1.1.0
>
>
> Sometimes when reading data I receive them without any exception but I can 
> see in Cassandra logs, that there is an error:
> ERROR [ReadRepairStage:58] 2012-04-05 12:04:35,732 
> AbstractCassandraDaemon.java (line 139) Fatal exception in thread 
> Thread[ReadRepairStage:58,5,main]
> java.util.ConcurrentModificationException
> at 
> java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
> at java.util.AbstractList$Itr.next(AbstractList.java:343)
> at 
> org.apache.cassandra.db.ColumnFamily.updateDigest(ColumnFamily.java:294)
> at org.apache.cassandra.db.ColumnFamily.digest(ColumnFamily.java:288)
> at 
> org.apache.cassandra.service.RowDigestResolver.resolve(RowDigestResolver.java:102)
> at 
> org.apache.cassandra.service.RowDigestResolver.resolve(RowDigestResolver.java:30)
> at 
> org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.runMayThrow(ReadCallback.java:227)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)





[jira] [Commented] (CASSANDRA-4131) Integrate Hive support to be in core cassandra

2012-04-09 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249899#comment-13249899
 ] 

T Jake Luciani commented on CASSANDRA-4131:
---

I think most of the work will be making a standalone build.xml to fetch the 
Hive Maven artifacts and create the cassandra-handler.jar.  I think we just 
drop the Hive test suite and integrate our own. 

> Integrate Hive support to be in core cassandra
> --
>
> Key: CASSANDRA-4131
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4131
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeremy Hanna
>Assignee: Edward Capriolo
>  Labels: hadoop, hive
>
> The standalone hive support (at https://github.com/riptano/hive) would be 
> great to have in-tree so that people don't have to go out to github to 
> download it and wonder if it's a left-for-dead external shim.





[jira] [Commented] (CASSANDRA-3995) Support multiple indexes on single query

2012-03-27 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239736#comment-13239736
 ] 

T Jake Luciani commented on CASSANDRA-3995:
---

This patch seems to only account for the most restrictive expression.
It doesn't prune off the rows with columns that don't match the rest of the 
expressions.

It should do both to be correct; otherwise you will get rows that don't 
actually match all of your expressions.
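
To illustrate the second step, here is a minimal sketch. It assumes rows are 
modeled as simple column-name-to-value maps and that the candidate list has 
already come back from the most restrictive index. The names 
(MultiExpressionFilter, filter) are hypothetical and are not taken from the 
patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class MultiExpressionFilter {
    // 'candidates' are the rows returned by the most selective index;
    // 'remainingExpressions' are the other clauses of the query.
    static List<Map<String, Integer>> filter(
            List<Map<String, Integer>> candidates,
            List<Predicate<Map<String, Integer>>> remainingExpressions) {
        List<Map<String, Integer>> matches = new ArrayList<>();
        for (Map<String, Integer> row : candidates) {
            boolean matchesAll = true;
            for (Predicate<Map<String, Integer>> expression : remainingExpressions) {
                if (!expression.test(row)) {
                    matchesAll = false; // prune rows failing any other expression
                    break;
                }
            }
            if (matchesAll) {
                matches.add(row);
            }
        }
        return matches;
    }
}
```

Without this pass, every candidate from the first index would be returned, 
including rows that fail the remaining expressions.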

> Support multiple indexes on single query
> 
>
> Key: CASSANDRA-3995
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3995
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dmitry Petrashko
> Attachments: Support_Multiple_Indexes-v2.patch, 
> Support_Multiple_Indexes-v3.patch, Support_for_multiple_indexes.patch
>
>
> Currently, if multiple secondary index types are available, the query is not 
> processed.
> Expected behavior: execute the query using the index with the best selectivity.





[jira] [Commented] (CASSANDRA-3844) Truncate leaves behind non-CFS backed secondary indexes

2012-02-07 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202442#comment-13202442
 ] 

T Jake Luciani commented on CASSANDRA-3844:
---

+1

> Truncate leaves behind non-CFS backed secondary indexes
> ---
>
> Key: CASSANDRA-3844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3844
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.0.7
>Reporter: T Jake Luciani
>Assignee: Pavel Yaskevich
>Priority: Minor
> Fix For: 1.0.8
>
> Attachments: CASSANDRA-3844.patch
>
>
> If you set up a CF with a non-CFS backed secondary index then truncate it, 
> nothing happens to the secondary index. We need a hook for CFStore to clean 
> these up.





[jira] [Commented] (CASSANDRA-3844) Truncate leaves behind non-CFS backed secondary indexes

2012-02-07 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202434#comment-13202434
 ] 

T Jake Luciani commented on CASSANDRA-3844:
---

SIM.getIndexes() should use an IdentityHashMap since PerRowSecondaryIndexes 
share the same instance across rows.
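
A minimal sketch of why identity matters here, assuming a list in which 
several entries refer to the same index object. The class and method names 
are hypothetical. An identity-based set keeps one entry per object reference 
and ignores equals()/hashCode(), so a shared instance is visited once while 
distinct instances are never merged.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class UniqueIndexInstances {
    // Collapses repeated references to the same object (compared with ==).
    static <T> Set<T> uniqueInstances(List<T> indexes) {
        Set<T> unique = Collections.newSetFromMap(new IdentityHashMap<T, Boolean>());
        unique.addAll(indexes);
        return unique;
    }
}
```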

> Truncate leaves behind non-CFS backed secondary indexes
> ---
>
> Key: CASSANDRA-3844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3844
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.0.7
>Reporter: T Jake Luciani
>Assignee: Pavel Yaskevich
>Priority: Minor
> Fix For: 1.0.8
>
> Attachments: CASSANDRA-3844.patch
>
>
> If you set up a CF with a non-CFS backed secondary index then truncate it, 
> nothing happens to the secondary index. We need a hook for CFStore to clean 
> these up.





[jira] [Commented] (CASSANDRA-3264) Add wide row paging for ColumnFamilyInputFormat and ColumnFamilyOutputFormat

2012-01-25 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193205#comment-13193205
 ] 

T Jake Luciani commented on CASSANDRA-3264:
---

updated patchset at: https://github.com/tjake/cassandra/tree/3264-3



> Add wide row paging for ColumnFamilyInputFormat and ColumnFamilyOutputFormat
> 
>
> Key: CASSANDRA-3264
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3264
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Hadoop
>Reporter: T Jake Luciani
>Assignee: Jonathan Ellis
>  Labels: hadoop
> Fix For: 1.1
>
>
> Hadoop input/output formats currently can OOM on wide rows.
> We can add a new option to the ConfigHelper like columnPagingSize with a 
> default of Integer.MAX_VALUE.
> The input format would page the row internally rather than pull it over at 
> once.
> The output format could also use this to avoid sending huge rows over at once.
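
The paging idea in the description can be sketched as follows. This is an 
illustration only: the row is modeled as an in-memory sorted map, and 
WideRowPager and nextPage are hypothetical names rather than the 
ColumnFamilyInputFormat API. Each call returns at most pageSize column names 
that sort strictly after the last one already seen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

public class WideRowPager {
    // Returns up to pageSize column names after 'lastSeen' (null = start of row).
    static List<String> nextPage(NavigableMap<String, byte[]> row,
                                 String lastSeen, int pageSize) {
        NavigableMap<String, byte[]> rest =
                (lastSeen == null) ? row : row.tailMap(lastSeen, false);
        List<String> page = new ArrayList<>();
        for (String name : rest.keySet()) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(name);
        }
        return page;
    }
}
```

A caller would keep requesting pages, passing the last name of the previous 
page, until a page comes back with fewer than pageSize entries.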





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns and wide rows

2012-01-16 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186970#comment-13186970
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

bq. we're proposing making these changes late in the 11th hour of the release 
cycle, and without seeking input from the wider community

I don't understand. You emailed the community for input, we created a wiki 
explaining the options. We've run out of paint for the bikeshed.  Now that the 
work is being done we need to go back and re-explain the issue?

Have we stated either way to the community that the current form of CQL is 
going to be backwards compatible? I understand that it's the end goal, but we 
are not there yet. If it was considered *done*, why are we discussing handling 
wide rows in the first place?

I think the best option here is to keep the old cql around in whatever form it 
currently is and start fresh with this transposed approach.  If/When the 
current users get around to changing syntax then we can drop it.




> CQL support for compound columns and wide rows
> --
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Eric Evans
>Assignee: Sylvain Lebresne
>Priority: Critical
>  Labels: cql
> Fix For: 1.1
>
> Attachments: 0001-Add-support-for-wide-and-composite-CFs.patch, 
> 0002-thrift-generated-code.patch, 2474-transposed-1.PNG, 
> 2474-transposed-raw.PNG, 2474-transposed-select-no-sparse.PNG, 
> 2474-transposed-select.PNG, cql_tests.py, raw_composite.txt, 
> screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns and wide rows

2012-01-12 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185228#comment-13185228
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

bq. My current patch don't really allow creating a secondary index on a non 
static CF, because it's not clear how that would work from the syntax and I'm 
sure I see good use for that. Does that bother someone ? 

CASSANDRA-3680 will add support for this at a later time

> CQL support for compound columns and wide rows
> --
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Eric Evans
>Assignee: Sylvain Lebresne
>Priority: Critical
>  Labels: cql
> Fix For: 1.1
>
> Attachments: 2474-transposed-1.PNG, 2474-transposed-raw.PNG, 
> 2474-transposed-select-no-sparse.PNG, 2474-transposed-select.PNG, 
> cql_tests.py, raw_composite.txt, screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-3507) Proposal: separate cqlsh from CQL drivers

2012-01-05 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180628#comment-13180628
 ] 

T Jake Luciani commented on CASSANDRA-3507:
---

oh, hmm you are right, i forgot we do both...

> Proposal: separate cqlsh from CQL drivers
> -
>
> Key: CASSANDRA-3507
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3507
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging, Tools
>Affects Versions: 1.0.3
> Environment: Debian-based systems
>Reporter: paul cannon
>Assignee: paul cannon
>Priority: Blocker
>  Labels: cql, cqlsh
> Fix For: 1.0.7
>
>
> Whereas:
> * It has been shown to be very desirable to decouple the release cycles of 
> Cassandra from the various client CQL drivers, and
> * It is also desirable to include a good interactive CQL client with releases 
> of Cassandra, and
> * It is not desirable for Cassandra releases to depend on 3rd-party software 
> which is neither bundled with Cassandra nor readily available for every 
> target platform, but
> * Any good interactive CQL client will require a CQL driver;
> Therefore, be it resolved that:
> * cqlsh will not use an official or supported CQL driver, but will include 
> its own private CQL driver, not intended for use by anything else, and
> * the Cassandra project will still recommend installing and using a proper 
> CQL driver for client software.
> To ease maintenance, the private CQL driver included with cqlsh may very well 
> be created by "copying the python CQL driver from one directory into 
> another", but the user shouldn't rely on this. Maybe we even ought to take 
> some minor steps to discourage its use for other purposes.
> Thoughts?





[jira] [Commented] (CASSANDRA-3507) Proposal: separate cqlsh from CQL drivers

2012-01-05 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180608#comment-13180608
 ] 

T Jake Luciani commented on CASSANDRA-3507:
---

If it were written in Java, then a dependency would be added to build.xml to 
pull JDBC in from Maven, so at compile time it would download the jar.  From 
there it would be bundled.  So you get the benefit of a remote dependency 
without requiring packaging changes.

> Proposal: separate cqlsh from CQL drivers
> -
>
> Key: CASSANDRA-3507
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3507
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging, Tools
>Affects Versions: 1.0.3
> Environment: Debian-based systems
>Reporter: paul cannon
>Assignee: paul cannon
>Priority: Blocker
>  Labels: cql, cqlsh
> Fix For: 1.0.7
>
>
> Whereas:
> * It has been shown to be very desirable to decouple the release cycles of 
> Cassandra from the various client CQL drivers, and
> * It is also desirable to include a good interactive CQL client with releases 
> of Cassandra, and
> * It is not desirable for Cassandra releases to depend on 3rd-party software 
> which is neither bundled with Cassandra nor readily available for every 
> target platform, but
> * Any good interactive CQL client will require a CQL driver;
> Therefore, be it resolved that:
> * cqlsh will not use an official or supported CQL driver, but will include 
> its own private CQL driver, not intended for use by anything else, and
> * the Cassandra project will still recommend installing and using a proper 
> CQL driver for client software.
> To ease maintenance, the private CQL driver included with cqlsh may very well 
> be created by "copying the python CQL driver from one directory into 
> another", but the user shouldn't rely on this. Maybe we even ought to take 
> some minor steps to discourage its use for other purposes.
> Thoughts?





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2012-01-05 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180387#comment-13180387
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

I'm fine with CQL only

> CQL support for compound columns
> 
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Eric Evans
>Assignee: Pavel Yaskevich
>  Labels: cql
> Fix For: 1.1
>
> Attachments: 2474-transposed-1.PNG, 2474-transposed-raw.PNG, 
> 2474-transposed-select-no-sparse.PNG, 2474-transposed-select.PNG, 
> raw_composite.txt, screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-12-28 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176818#comment-13176818
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

bq. I think we can support sparse columns well in a way that improves the 
conceptual integrity for the dense composites as well

I think this is clean.  The only worry with DENSE never using a column value is 
that it will make it hard for current users of composites to adopt this, since 
they may well use a value.

> CQL support for compound columns
> 
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Eric Evans
>Assignee: Pavel Yaskevich
>  Labels: cql
> Fix For: 1.1
>
> Attachments: 2474-transposed-1.PNG, 2474-transposed-raw.PNG, 
> 2474-transposed-select-no-sparse.PNG, 2474-transposed-select.PNG, 
> raw_composite.txt, screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-12-24 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175741#comment-13175741
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

Assuming we go with the TRANSPOSED approach, this is what the Hive DDL would 
look like:

{noformat}
CREATE EXTERNAL TABLE timeline(user_id string, tweet_id long, username string,
                               timestamp long)
  STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
  WITH SERDEPROPERTIES ("transposed" = "true",
                        "cassandra.column.mappings" = ":key")
{noformat}



> CQL support for compound columns
> 
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Eric Evans
>Assignee: Pavel Yaskevich
>  Labels: cql
> Fix For: 1.1
>
> Attachments: 2474-transposed-1.PNG, 2474-transposed-raw.PNG, 
> 2474-transposed-select-no-sparse.PNG, 2474-transposed-select.PNG, 
> raw_composite.txt, screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-3155) Secondary index should report it's memory consumption

2011-12-22 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175133#comment-13175133
 ] 

T Jake Luciani commented on CASSANDRA-3155:
---

Attached a different approach.  The polymorphic approach would still be clunky 
because it includes self, so you need to get self + indexes.  The latest patch 
seems more readable.

> Secondary index should report it's memory consumption
> -
>
> Key: CASSANDRA-3155
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3155
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Jason Rutherglen
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 1.0.7
>
> Attachments: v1-0001-CASSANDRA-3155-report-all-live-index-memory.txt
>
>
> Non-CFS backed secondary indexes will consume RAM which should be reported 
> back to Cassandra to be factored into its flush by RAM amount.





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-12-20 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173484#comment-13173484
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

bq.  Wouldn't you need body in the composite list too?

That would be the value of the composite column

> CQL support for compound columns
> 
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Eric Evans
>Assignee: Pavel Yaskevich
>  Labels: cql
> Fix For: 1.1
>
> Attachments: 2474-transposed-1.PNG, 2474-transposed-raw.PNG, 
> 2474-transposed-select-no-sparse.PNG, 2474-transposed-select.PNG, 
> raw_composite.txt, screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-12-20 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173294#comment-13173294
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

One possibility that could avoid SPARSE/DENSE syntax would be:

{code:sql}
--DENSE transposed format
--column names are used directly in comparator

CREATE TABLE msg (
user text primary key,
sender text,
thread text,
tid int,
body text
)
WITH comparator = composite(sender,thread,tid);
{code}


{code:sql}

--SPARSE transposed format
--composite comparator with no specified columns will be a dynamic comparator
--and includes the column name as part of the dynamic column name
CREATE TABLE msg (
user text primary key,
sender text,
thread text,
tid int,
body   text
)
WITH comparator = composite;
{code}

{code:sql}
--Finally in the case of both SPARSE/DENSE
CREATE TABLE timeline (
userid int primary key,
posted_at uuid,
column string,
value blob
)
WITH comparator = composite(posted_at,*);
{code}





> CQL support for compound columns
> 
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Eric Evans
>Assignee: Pavel Yaskevich
>  Labels: cql
> Fix For: 1.1
>
> Attachments: 2474-transposed-1.PNG, 2474-transposed-raw.PNG, 
> 2474-transposed-select-no-sparse.PNG, 2474-transposed-select.PNG, 
> raw_composite.txt, screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-3649) Code style changes, aka The Big Reformat

2011-12-20 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173264#comment-13173264
 ] 

T Jake Luciani commented on CASSANDRA-3649:
---

bq. My problem is that I don't see any benefits

Maintaining a custom code style just for Cassandra sucks.  If we simply adopted 
the Java standard, then new contributors and people who work on multiple Java 
projects wouldn't need to worry about code style, since it's the "default".

> Code style changes, aka The Big Reformat
> 
>
> Key: CASSANDRA-3649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3649
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Brandon Williams
> Fix For: 1.2
>
>
> With a new major release coming soon and not having a ton of huge pending 
> patches that have prevented us from doing this in the past, post-freeze looks 
> like a good time to finally do this.  Mostly this will include the removal of 
> underscores in private variables, and no more brace-on-newline policy.





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-12-19 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172437#comment-13172437
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

ok +1

On the Hive side we can support the same semantics, however achieving the same 
syntax will be hard.  On the other hand since this is now DDL and no longer DML 
I think it's not a big deal.



> CQL support for compound columns
> 
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Eric Evans
>Assignee: Pavel Yaskevich
>  Labels: cql
> Fix For: 1.1
>
> Attachments: screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-12-19 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172416#comment-13172416
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

bq. But that's exactly when you do know what columns you have.

In my example, when a new column is added to the file and inserted by the 
loader, it's hidden from view till someone explicitly adds it as a sparse 
column.  That makes us no longer schemaless.  

bq. Nested-but-not-transposed data aka "documents" is another separate case.
 
This is the case I'm thinking of then.  Would this be handled in CQL or a 
"document" api?


> CQL support for compound columns
> 
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Eric Evans
>Assignee: Pavel Yaskevich
>  Labels: cql
> Fix For: 1.1
>
> Attachments: screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-3632) using an ant builder in Eclipse is painful

2011-12-19 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172339#comment-13172339
 ] 

T Jake Luciani commented on CASSANDRA-3632:
---

+1!

> using an ant builder in Eclipse is painful
> --
>
> Key: CASSANDRA-3632
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3632
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging, Tools
>Affects Versions: 1.0.6
>Reporter: Eric Evans
>Assignee: Eric Evans
>Priority: Minor
> Attachments: 
> v1-0001-CASSANDRA-3632-remove-ant-builder-restore-java-builder.txt
>
>
> The {{generate-eclipse-files}} target creates project files that use an Ant 
> builder.  Besides being painfully slow (I've had the runs stack up behind 
> frequent saves), many of Eclipse's errors and warnings do not show unless an 
> internal builder is used.





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-12-19 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172317#comment-13172317
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

bq. User will be able to add sparse columns using "ALTER TABLE" command.

Requiring an ALTER when you may not know what columns you have is too 
restrictive.  For example, an ETL from a 3rd party manufacturer that provides a 
custom set of attributes per product: some standard (Unit Price, Model, Color, 
etc.), some specific (DPI, Shipping Size, Contrast Ratio).  We don't want to go 
back to having to know exactly what your data will look like before you can 
write/read it.  That's one of the important tenets of NoSQL I'd like to keep :)

bq. Can you elaborate "raw" mode?

I mean non-transposed mode.


> CQL support for compound columns
> 
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Eric Evans
>Assignee: Pavel Yaskevich
>  Labels: cql
> Fix For: 1.1
>
> Attachments: screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-12-19 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172299#comment-13172299
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

In the non-sparse case you would always ignore the column value? I think 
we need to expose that somehow. (first non-transposed, non-key, non-sparse 
column?)

Overall I like this because it forces a user to think at schema creation time 
and not access time. This approach makes sense for CQL-only access, but 
users who are coming from thrift will be asking "how do I access data from 
my current data model?"

On the negative side, this approach feels a bit too restrictive since you 
*MUST* use the same kind of schema across all rows within a CF.  What if a user 
doesn't know what the sparse columns will be ahead of time?  

Also, I know that's best practice, but to make the point: what if a user 
wants to access data in both composite form and "raw" mode?  Should we support 
multiple "views" on the CF?



> CQL support for compound columns
> 
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Eric Evans
>Assignee: Pavel Yaskevich
>  Labels: cql
> Fix For: 1.1
>
> Attachments: screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-3554) Hints are not replayed unless node was marked down

2011-12-05 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162877#comment-13162877
 ] 

T Jake Luciani commented on CASSANDRA-3554:
---

Right, however it's never going to know if hints are on a coordinator node due 
to the coordinator needing to drop some messages (backpressure?).

So either the clients can poll all nodes slowly and fetch hints, or we perhaps 
gossip a hints-available flag so nodes know when hints are there to read?

> Hints are not replayed unless node was marked down
> --
>
> Key: CASSANDRA-3554
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3554
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>
> If B drops a write from A because it is overwhelmed (but not dead), A will 
> hint the write.  But it will never get notified that B is back up (since it 
> was never down), so it will never attempt hint delivery.





[jira] [Commented] (CASSANDRA-3573) When Snappy compression is not available on the platform, trying to enable it introduces problems

2011-12-05 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162831#comment-13162831
 ] 

T Jake Luciani commented on CASSANDRA-3573:
---

If you build snappy-java directly on those machines you can tell snappy to use 
that library with the following:

cassandra -Djava.library.path=(path to the installed snappyjava lib) 
-Dorg.xerial.snappy.use.systemlib=true

> When Snappy compression is not available on the platform, trying to enable it 
> introduces problems
> -
>
> Key: CASSANDRA-3573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.5
> Environment: FreeBSD
>Reporter: Vitalii Tymchyshyn
>
> I've tried to enable compression for some column families in my cluster using 
> Snappy compression.
> It does not work and I am having problems with schema updates to remove it (a 
> lot of UNREACHABLE nodes during schema update).
> In log I have the next:
> ERROR [FlushWriter:961] 2011-12-05 17:16:33,383 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[FlushWriter:961,5,main]
> java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy
> at org.apache.cassandra.io.compress.SnappyCompressor.initialCompressedBufferLength(SnappyCompressor.java:39)
> at org.apache.cassandra.io.compress.CompressedSequentialWriter.<init>(CompressedSequentialWriter.java:63)
> at org.apache.cassandra.io.compress.CompressedSequentialWriter.open(CompressedSequentialWriter.java:34)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:91)
> at org.apache.cassandra.db.ColumnFamilyStore.createFlushWriter(ColumnFamilyStore.java:1850)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:250)
> at org.apache.cassandra.db.Memtable.access$400(Memtable.java:47)
> at org.apache.cassandra.db.Memtable$4.runMayThrow(Memtable.java:291)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:679)
> It looks like Snappy can't initialize because it does not have native library 
> for my platform. It would be great if:
> 1) A check be done on schema update if Snappy can be used
> 2) If it is enabled and can't be used it would still work without compression 
> writes (but may be outputting some errors to indicate the situation)
>  





[jira] [Commented] (CASSANDRA-3554) Hints are not replayed unless node was marked down

2011-12-05 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162824#comment-13162824
 ] 

T Jake Luciani commented on CASSANDRA-3554:
---

The other problem is that even if we fix the replay issue, it's still terribly 
slow due to excessive throttling.

I like the idea of changing from a push to a pull mode for hint delivery, similar 
to how MySQL replication is client pull.  Clients know how swamped they are and 
can throttle their own delivery.
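
A rough sketch of the pull idea, in Python rather than Cassandra's Java and 
purely illustrative.  HintStore, pull_hints, the load value and full_batch are 
all made-up names, not anything in the codebase.  The coordinator only stores 
and hands out hints; the target decides how many to ask for based on how busy 
it is.

```python
from collections import deque
from typing import Deque, Dict, List


class HintStore:
    """Coordinator side (hypothetical): holds hints per target, hands them out on request."""

    def __init__(self) -> None:
        self.pending: Dict[str, Deque[str]] = {}

    def store(self, target: str, mutation: str) -> None:
        self.pending.setdefault(target, deque()).append(mutation)

    def fetch(self, target: str, max_hints: int) -> List[str]:
        # Hand out at most max_hints of the oldest hints for this target.
        queue = self.pending.get(target, deque())
        batch: List[str] = []
        while queue and len(batch) < max_hints:
            batch.append(queue.popleft())
        return batch


def pull_hints(store: HintStore, me: str, load: float, full_batch: int = 100) -> List[str]:
    """Target side: request fewer hints the busier this node is (load in [0, 1])."""
    budget = int(full_batch * (1.0 - load))
    return store.fetch(me, budget) if budget > 0 else []


store = HintStore()
for i in range(10):
    store.store("B", f"m{i}")

swamped = pull_hints(store, "B", load=1.0, full_batch=4)  # asks for nothing
busy = pull_hints(store, "B", load=0.5, full_batch=4)     # asks for half a batch
idle = pull_hints(store, "B", load=0.0, full_batch=4)     # asks for a full batch
```

The point of the design is that the throttle lives on the node that actually 
knows its own load, so the coordinator needs no guesswork about delivery rate.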




> Hints are not replayed unless node was marked down
> --
>
> Key: CASSANDRA-3554
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3554
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>
> If B drops a write from A because it is overwhelmed (but not dead), A will 
> hint the write.  But it will never get notified that B is back up (since it 
> was never down), so it will never attempt hint delivery.





[jira] [Commented] (CASSANDRA-3554) Hints are not replayed unless node was marked down

2011-12-02 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161705#comment-13161705
 ] 

T Jake Luciani commented on CASSANDRA-3554:
---

Can we keep a running tally of hints per endpoint as they are written, and 
deliver them when the tally reaches a threshold? + hourly scan :)
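
Something like the following, sketched in Python just to show the shape of it.  
HintTally, the threshold value and the deliver callback are hypothetical names, 
and the "hourly scan" is reduced to a method some timer would call.

```python
from collections import defaultdict
from typing import Callable, Dict


class HintTally:
    """Counts hints written per endpoint and triggers delivery at a threshold."""

    def __init__(self, threshold: int, deliver: Callable[[str], None]) -> None:
        self.threshold = threshold
        self.deliver = deliver
        self.counts: Dict[str, int] = defaultdict(int)

    def hint_written(self, endpoint: str) -> None:
        self.counts[endpoint] += 1
        if self.counts[endpoint] >= self.threshold:
            self.flush(endpoint)

    def flush(self, endpoint: str) -> None:
        # Deliver only if something is outstanding, then reset the tally.
        if self.counts.get(endpoint, 0) > 0:
            self.deliver(endpoint)
            self.counts[endpoint] = 0

    def periodic_scan(self) -> None:
        # Catches endpoints that never reached the threshold.
        for endpoint in list(self.counts):
            self.flush(endpoint)


delivered = []
tally = HintTally(threshold=3, deliver=delivered.append)
for _ in range(3):
    tally.hint_written("B")   # third write reaches the threshold
tally.hint_written("C")       # below the threshold, waits for the scan
tally.periodic_scan()
```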

> Hints are not replayed unless node was marked down
> --
>
> Key: CASSANDRA-3554
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3554
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>
> If B drops a write from A because it is overwhelmed (but not dead), A will 
> hint the write.  But it will never get notified that B is back up (since it 
> was never down), so it will never attempt hint delivery.





[jira] [Commented] (CASSANDRA-3457) Make cqlsh look for a suitable python version

2011-11-28 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158650#comment-13158650
 ] 

T Jake Luciani commented on CASSANDRA-3457:
---

tested +1

> Make cqlsh look for a suitable python version
> -
>
> Key: CASSANDRA-3457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3457
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Affects Versions: 1.0.3
>Reporter: paul cannon
>Assignee: paul cannon
>Priority: Minor
>  Labels: cqlsh
> Fix For: 1.0.5
>
> Attachments: 3457.patch.txt
>
>
> On RHEL 5, which I guess we still want to support, the default "python" in 
> the path is still 2.4. cqlsh does use a fair number of python features 
> introduced in 2.5, like collections.defaultdict, functools.partial, 
> generators. We can require RHEL 5 users to install a later python from EPEL, 
> but we'd have to call it as 'python2.5', or 'python2.6', etc.
> So rather than take the time to vet everything against python2.4, we may want 
> to make a wrapper script for cqlsh that checks for the existence of 
> python2.7, 2.6, and 2.5, and calls the appropriate one to run the real cqlsh.
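
The lookup described above could be sketched roughly like this.  It is an 
illustration only: pick_interpreter and CANDIDATES are made-up names, it is not 
the attached patch, and it uses shutil.which from a modern Python, so a real 
wrapper for RHEL 5 would more likely be a shell script.

```python
import shutil
from typing import Callable, Optional, Sequence

# Preferred interpreters, newest first (the versions named in the description).
CANDIDATES = ("python2.7", "python2.6", "python2.5")


def pick_interpreter(candidates: Sequence[str] = CANDIDATES,
                     which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None
```

The wrapper would then exec the real cqlsh with whatever path comes back, or 
print an error if nothing suitable is installed.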





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-11-28 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158469#comment-13158469
 ] 

T Jake Luciani commented on CASSANDRA-1391:
---

If you remove avro how do people upgrade?

> Allow Concurrent Schema Migrations
> --
>
> Key: CASSANDRA-1391
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Stu Hood
>Assignee: Pavel Yaskevich
> Fix For: 1.1
>
> Attachments: 
> 0001-new-migration-schema-and-avro-methods-cleanup.patch, 
> 0002-avro-removal.patch, CASSANDRA-1391.patch
>
>
> CASSANDRA-1292 fixed multiple migrations started from the same node to 
> properly queue themselves, but it is still possible for migrations initiated 
> on different nodes to conflict and leave the cluster in a bad state. Since 
> the system_add/drop/rename methods are accessible directly from the client 
> API, they should be completely safe for concurrent use.
> It should be possible to allow for most types of concurrent migrations by 
> converting the UUID schema ID into a VersionVectorClock (as provided by 
> CASSANDRA-580).





[jira] [Commented] (CASSANDRA-1684) Entity groups

2011-11-23 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156133#comment-13156133
 ] 

T Jake Luciani commented on CASSANDRA-1684:
---

bq. Do we really need row groups now that we can have arbitrary nesting within 
a row via composite columns?

What about secondary indexes?  Unless we add composite secondary indexes.


> Entity groups
> -
>
> Key: CASSANDRA-1684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1684
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Sylvain Lebresne
> Fix For: 1.1
>
>   Original Estimate: 80h
>  Remaining Estimate: 80h
>
> Supporting entity groups similar to App Engine's (that is, allow rows to be 
> part of a parent "entity group," whose key is used for routing instead of the 
> row itself) allows several improvements:
>  - batches within an EG can be atomic across multiple rows
>  - order-by-value queries within an EG only have to touch a single replica 
> even with RandomPartitioner





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-11-14 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149743#comment-13149743
 ] 

T Jake Luciani commented on CASSANDRA-1391:
---

By moving to CF-based migration logic, it would be very useful to have the logic 
abstracted so it can be used for other use cases.

Migrations give you the following:

  * RF = N where N is the size of the ring.
  * All changes are "pushed" to new nodes when they join the ring.
  * previously sent data is available locally on startup



> Allow Concurrent Schema Migrations
> --
>
> Key: CASSANDRA-1391
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Stu Hood
>Assignee: Pavel Yaskevich
> Fix For: 1.1
>
> Attachments: CASSANDRA-1391.patch
>
>
> CASSANDRA-1292 fixed multiple migrations started from the same node to 
> properly queue themselves, but it is still possible for migrations initiated 
> on different nodes to conflict and leave the cluster in a bad state. Since 
> the system_add/drop/rename methods are accessible directly from the client 
> API, they should be completely safe for concurrent use.
> It should be possible to allow for most types of concurrent migrations by 
> converting the UUID schema ID into a VersionVectorClock (as provided by 
> CASSANDRA-580).





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-11-11 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148656#comment-13148656
 ] 

T Jake Luciani commented on CASSANDRA-1391:
---

I think the patch is a good start but I do like Jonathan's idea of moving to 
native CFs too.

We just want to make sure this gets into 1.1 since it's a problem a lot of 
people run into.

Regarding the current impl, my concern is fields that get added to the migration 
structs over time being missed, as happened a lot in the CFMetaData conversion code.

Could you add a test that verifies all migration struct fields are accounted for 
in the merge logic?  That way, if someone adds a new field and doesn't update the 
migration merge logic, the test would fail.
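
To show the kind of test I mean, here is a small sketch in Python (the real 
thing would be a Java unit test over the actual migration structs).  CfDef, 
MERGED_FIELDS and unmerged_fields are stand-ins invented for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class CfDef:
    # Hypothetical stand-in for a migration struct.
    name: str = ""
    comment: str = ""
    gc_grace: int = 0


# Fields the (hypothetical) merge logic knows how to handle.
MERGED_FIELDS = {"name", "comment", "gc_grace"}


def unmerged_fields(struct_type, merged) -> set:
    """Return the struct fields that the merge logic does not cover."""
    return {f.name for f in fields(struct_type)} - set(merged)


def test_merge_covers_all_fields() -> None:
    missing = unmerged_fields(CfDef, MERGED_FIELDS)
    assert not missing, f"merge logic ignores: {sorted(missing)}"


test_merge_covers_all_fields()
```

Adding a field to the struct without touching the merge list makes the test 
fail and names the field that was forgotten.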



> Allow Concurrent Schema Migrations
> --
>
> Key: CASSANDRA-1391
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Stu Hood
>Assignee: Pavel Yaskevich
> Fix For: 1.1
>
> Attachments: CASSANDRA-1391.patch
>
>
> CASSANDRA-1292 fixed multiple migrations started from the same node to 
> properly queue themselves, but it is still possible for migrations initiated 
> on different nodes to conflict and leave the cluster in a bad state. Since 
> the system_add/drop/rename methods are accessible directly from the client 
> API, they should be completely safe for concurrent use.
> It should be possible to allow for most types of concurrent migrations by 
> converting the UUID schema ID into a VersionVectorClock (as provided by 
> CASSANDRA-580).





[jira] [Commented] (CASSANDRA-3450) maybeInit in ColumnFamilyRecordReader can cause rows to be empty but not null

2011-11-08 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146310#comment-13146310
 ] 

T Jake Luciani commented on CASSANDRA-3450:
---

Reverted this change due to a bug related to single node test failures.  Going 
to attempt a fresh fix at this and CASSANDRA-2855  

> maybeInit in ColumnFamilyRecordReader can cause rows to be empty but not null
> -
>
> Key: CASSANDRA-3450
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3450
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.7, 1.0.1
>Reporter: Lanny Ripple
>Assignee: Lanny Ripple
> Fix For: 0.8.8, 1.0.3
>
> Attachments: v1-0001-CASSANDRA-3450.txt
>
>
> 1) In {{ColumnFamilyRecordReader}} {{isPredicateEmpty}} needs bracing to 
> correctly place the {{else if}} to the properly controlling {{if}}.
> 1a) {{isPredicateEmpty}} should use an || in the getSlice_range predicate 
> rather than &&.
> 2) In {{ColumnFamilyRecordReader}} {{computeNext()}} calls {{maybeInit()}} 
> and then if {{rows}} is not null it is indexed into.  {{maybeInit()}} could 
> fetch new data, determine the associated slice predicate is empty, and end up 
> removing all the rows if all columns turned out to be empty.  There is no 
> check for {{rows.isEmpty()}} after the possible removal of all rows.





[jira] [Commented] (CASSANDRA-2855) Skip rows with empty columns when slicing entire row

2011-11-08 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146313#comment-13146313
 ] 

T Jake Luciani commented on CASSANDRA-2855:
---

Reverted; will submit a new patch.

> Skip rows with empty columns when slicing entire row
> 
>
> Key: CASSANDRA-2855
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2855
> Project: Cassandra
>  Issue Type: Improvement
>  Components: API
>Reporter: Jeremy Hanna
>Assignee: Jeremy Hanna
>Priority: Minor
>  Labels: hadoop
> Fix For: 0.8.8
>
> Attachments: 2855-v2.txt, 2855-v3.txt, 2855-v4.txt, 2855-v5.txt
>
>
> We have been finding that range ghosts appear in results from Hadoop via Pig. 
>  This could also happen if rows don't have data for the slice predicate that 
> is given.  This leads to having to do a painful amount of defensive checking 
> on the Pig side, especially in the case of range ghosts.
> We would like to add an option to skip rows that have no column values in it. 
>  That functionality existed before in core Cassandra but was removed because 
> of the performance penalty of that checking.  However with Hadoop support in 
> the RecordReader, that is batch oriented anyway, so individual row reading 
> performance isn't as much of an issue.  Also we would make it an optional 
> config parameter for each job anyway, so people wouldn't have to incur that 
> penalty if they are confident that there won't be those empty rows or they 
> don't care.
> It could be parameter cassandra.skip.empty.rows and be true/false.





[jira] [Commented] (CASSANDRA-913) Add Hive support

2011-11-08 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146304#comment-13146304
 ] 

T Jake Luciani commented on CASSANDRA-913:
--

Hi Nicholas,

Thanks for looking at this. As you mention we need to figure out how to get the 
tests working locally. This probably requires the hive test artifacts to be 
deployed in maven.

We are currently using the cassandra-1.0 branch on github so that should have 
the latest changes.  Cassandra 1.1 will be upgrading to thrift 0.7 
(CASSANDRA-3213), at which point we should work with Hive 0.8 without conflicts.



> Add Hive support
> 
>
> Key: CASSANDRA-913
> URL: https://issues.apache.org/jira/browse/CASSANDRA-913
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Contrib
>Reporter: Jonathan Ellis
>  Labels: gsoc, gsoc2010
> Attachments: CASSANDRA-913-r1199213.patch
>
>
> http://hadoop.apache.org/hive/ is a project that runs SQL queries against 
> Hadoop map/reduce clusters.  (For analytics; it is too high-latency to run 
> applications against Hive directly).  HIVE-705 added support for backends 
> other than HDFS, with HBase as the first.  Cassandra support should be doable 
> too now.
> The Hive storage backends are described in 
> http://wiki.apache.org/hadoop/Hive/StorageHandlers and the HBase backend 
> specifically in http://wiki.apache.org/hadoop/Hive/HBaseIntegration.
> I also note that John Sichi, author of the HBase backend, seems like a 
> helpful guy and I imagine would be totally cool with answering questions 
> about implementation details.





[jira] [Commented] (CASSANDRA-2495) Add a proper retry mechanism for counters in case of failed request

2011-11-03 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143236#comment-13143236
 ] 

T Jake Luciani commented on CASSANDRA-2495:
---

CASSANDRA-2034 does improve this a bit, since we know a hint was stored for the 
timed out replica(s).
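
For anyone skimming, the underlying problem in a few lines of Python.  This is 
a toy model with made-up helper names, nothing to do with the actual counter 
code: replaying a normal insert after a timeout is harmless, replaying an 
increment is not.

```python
def apply_insert(row: dict, column: str, value: int) -> None:
    # Regular write: sets the value, so applying it twice changes nothing.
    row[column] = value


def apply_increment(row: dict, column: str, delta: int) -> None:
    # Counter write: adds to the value, so applying it twice double counts.
    row[column] = row.get(column, 0) + delta


regular = {}
counter = {}
for _ in range(2):  # original attempt plus one client retry after a timeout
    apply_insert(regular, "views", 5)
    apply_increment(counter, "views", 5)
```

Here regular ends up at 5 either way, while counter ends up at 10 if the first 
attempt had in fact been applied.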

> Add a proper retry mechanism for counters in case of failed request
> ---
>
> Key: CASSANDRA-2495
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2495
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.8 beta 1
>Reporter: Sylvain Lebresne
> Attachments: marker_idea.txt
>
>
> Contrary to standard inserts, counter increments are not idempotent. As 
> such, replaying a counter mutation when a TimeoutException occurs could lead 
> to an over-count. This alone limits the use cases for which counters are a 
> viable solution, so we should try to come up with a mechanism that allows the 
> replay of a failed counter mutation without the risk of over-count. 





[jira] [Commented] (CASSANDRA-3264) Add wide row paging for ColumnFamilyInputFormat and ColumnFamilyOutputFormat

2011-11-01 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141194#comment-13141194
 ] 

T Jake Luciani commented on CASSANDRA-3264:
---

This isn't a transpose!

This is paging. Maybe an example will help...

If you set column.paging.size = 3

RowA => [col1, col2, col3,..., colN]

Would become:

RowA => [col1, col2, col3]
RowA => [col4, col5, col6]
RowA => [col7, col8, col9]
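
A minimal sketch of the paging shown above, assuming the columns of one row are 
already available as an in-memory list. The class and method names are 
hypothetical and are not part of ColumnFamilyInputFormat; a real implementation 
would fetch each page from the cluster rather than slice a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ColumnPagingSketch {
    /** Splits one wide row's columns into pages of at most pageSize columns. */
    static <T> List<List<T>> page(List<T> columns, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<List<T>> pages = new ArrayList<>();
        int start = 0;
        while (start < columns.size()) {
            // Use long arithmetic so a pageSize of Integer.MAX_VALUE cannot overflow.
            int end = (int) Math.min((long) start + pageSize, (long) columns.size());
            pages.add(new ArrayList<>(columns.subList(start, end)));
            start = end;
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> rowA = Arrays.asList(
                "col1", "col2", "col3", "col4", "col5", "col6", "col7");
        // Each page is emitted under the same row key.
        for (List<String> p : page(rowA, 3)) {
            System.out.println("RowA => " + p);
        }
    }
}
```

With a page size of Integer.MAX_VALUE the whole row comes back as a single page, 
which matches the unpaged behaviour.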




> Add wide row paging for ColumnFamilyInputFormat and ColumnFamilyOutputFormat
> 
>
> Key: CASSANDRA-3264
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3264
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Hadoop
>Reporter: T Jake Luciani
>  Labels: lhf
> Fix For: 1.0.2
>
>
> Hadoop input/output formats currently can OOM on wide rows.
> We can add a new option to the ConfigHelper like columnPagingSize with a 
> default of Integer.MAX_VALUE.
> The input format would page the row internally rather than pull it over at 
> once.
> The output format could also use this to avoid sending huge rows over at once.





[jira] [Commented] (CASSANDRA-3428) add constituent tracking to sstables

2011-10-31 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140247#comment-13140247
 ] 

T Jake Luciani commented on CASSANDRA-3428:
---

bq. the snapshot files plus incrementals from after the last full snapshot (up 
to point-in-time, if desired) give you exactly what you want, no more, no less.


Maybe I'm thinking about this wrong, but if I were going to back up data in 
Cassandra I would never run nodetool snapshot.  I would only enable incremental 
backups, copy each sstable to remote storage, and remove what has been backed up. 
I could then get to any point in time.  

You are saying I should cron a snapshot of the cluster and then keep the 
incrementals in between.  I think with the feature I'm suggesting this wouldn't 
be necessary, and IMO there would be less data to back up in the end.



> add constituent tracking to sstables
> 
>
> Key: CASSANDRA-3428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3428
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: T Jake Luciani
>  Labels: compaction
> Fix For: 1.1
>
>
> Compaction merges older sstables into newer versions of the data.
> When snapshotting sstables (esp incrementally) it would be very useful to 
> know what older sstables are no longer needed because they are now 
> represented in a newer version.
> This patch should add the list of sstables that made up each new sstable and 
> store this info in the -Statistics file.





[jira] [Commented] (CASSANDRA-3428) add constituent tracking to sstables

2011-10-31 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140224#comment-13140224
 ] 

T Jake Luciani commented on CASSANDRA-3428:
---

That's not what I'm saying.

With "incremental_backups: true", sstables are hard linked, so you end up with a 
directory full of sstables, including ones that have been compacted into newer 
versions of the data.

If you want to restore from a backup in this scenario you need to load all the 
sstables and then compact.  
If each sstable stored constituent data recording which sstables were used to 
create it, then you could programmatically figure out which sstables you need 
to get a complete, optimal snapshot.

It would also be handy to track this information anyway: if an sstable is 
corrupted, you could inspect the metadata and get the list of 
sstables to retrieve from backup to fix *just* the corrupt file.
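
A rough sketch of the "figure it out programmatically" step, assuming each 
sstable's metadata can be read into a map from sstable name to the names of the 
sstables that were compacted into it. The class, method and map layout are 
hypothetical; nothing here reflects an actual -Statistics file format.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, for illustration only.
final class ConstituentSketch {
    /**
     * Given every sstable in the backup directory mapped to its recorded
     * constituents, returns the sstables that no other sstable supersedes.
     */
    static Set<String> liveSSTables(Map<String, List<String>> constituentsBySSTable) {
        Set<String> superseded = new HashSet<>();
        for (List<String> constituents : constituentsBySSTable.values()) {
            superseded.addAll(constituents);
        }
        Set<String> live = new TreeSet<>(constituentsBySSTable.keySet());
        live.removeAll(superseded);
        return live;
    }

    public static void main(String[] args) {
        Map<String, List<String>> backup = new HashMap<>();
        backup.put("cf-1", Collections.<String>emptyList());
        backup.put("cf-2", Collections.<String>emptyList());
        backup.put("cf-3", Arrays.asList("cf-1", "cf-2")); // compaction of 1 and 2
        backup.put("cf-4", Collections.<String>emptyList());
        // Only cf-3 and cf-4 need to be loaded on restore.
        System.out.println(liveSSTables(backup));
    }
}
```

This only looks one level up, which is enough when every intermediate sstable is 
still present in the backup; if intermediates have been pruned, the constituent 
lists would need to be followed transitively.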

> add constituent tracking to sstables
> 
>
> Key: CASSANDRA-3428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3428
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: T Jake Luciani
>  Labels: compaction
> Fix For: 1.1
>
>
> Compaction merges older sstables into newer versions of the data.
> When snapshotting sstables (esp incrementally) it would be very useful to 
> know what older sstables are no longer needed because they are now 
> represented in a newer version.
> This patch should add the list of sstables that made up each new sstable and 
> store this info in the -Statistics file.





[jira] [Commented] (CASSANDRA-3428) add constituent tracking to sstables

2011-10-31 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140216#comment-13140216
 ] 

T Jake Luciani commented on CASSANDRA-3428:
---

But you will for incremental snapshots.  How do you know which versions of 
the sstables to load?  Right now you must load all previous versions.

> add constituent tracking to sstables
> 
>
> Key: CASSANDRA-3428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3428
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: T Jake Luciani
>  Labels: compaction
> Fix For: 1.1
>
>
> Compaction merges older sstables into newer versions of the data.
> When snapshotting sstables (esp incrementally) it would be very useful to 
> know what older sstables are no longer needed because they are now 
> represented in a newer version.
> This patch should add the list of sstables that made up each new sstable and 
> store this info in the -Statistics file.





[jira] [Commented] (CASSANDRA-1311) Triggers

2011-10-28 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138505#comment-13138505
 ] 

T Jake Luciani commented on CASSANDRA-1311:
---


bq. Potential crack smoking ahead:

I think as long as we keep the original timestamps this should work out...
The cost of storing the batch in a CF is definitely prohibitive, but I can at 
least see how it can recover.


bq. keyed by coordinator node id+; column name some kind of uuid

If the coordinator dies, who will complete the batch?  Would you need to 
manually re-assign the node id to another node?


> Triggers
> 
>
> Key: CASSANDRA-1311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1311
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Maxim Grinev
> Fix For: 1.1
>
> Attachments: HOWTO-PatchAndRunTriggerExample-update1.txt, 
> HOWTO-PatchAndRunTriggerExample.txt, ImplementationDetails-update1.pdf, 
> ImplementationDetails.pdf, trunk-967053.txt, trunk-984391-update1.txt, 
> trunk-984391-update2.txt
>
>
> Asynchronous triggers are a basic mechanism for implementing various use 
> cases of asynchronous execution of application code on the database side, for 
> example to support indexes and materialized views, online analytics, and 
> push-based data propagation.
> Please find the motivation, triggers description and list of applications:
> http://maxgrinev.com/2010/07/23/extending-cassandra-with-asynchronous-triggers/
> An example of using triggers for indexing:
> http://maxgrinev.com/2010/07/23/managing-indexes-in-cassandra-using-async-triggers/
> Implementation details are attached.





[jira] [Commented] (CASSANDRA-3302) stop Cassandra result in hang

2011-10-27 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13137052#comment-13137052
 ] 

T Jake Luciani commented on CASSANDRA-3302:
---

+1

> stop Cassandra result in hang
> -
>
> Key: CASSANDRA-3302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3302
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.0.1
>Reporter: Jackson Chung
>Assignee: Sylvain Lebresne
> Fix For: 1.0.1
>
> Attachments: 3302.patch
>
>
> Testing this under trunk via a hacked package (replacing jars from a 0.8.6 deb 
> installation).
> When calling service cassandra stop, the Cassandra process hangs:
> http://aep.appspot.com/display/i6aIUCkt4kz0HG5l2VszMM7QvLo/
> The following is observed in the C* log:
>  INFO [main] 2011-10-03 23:20:46,434 AbstractCassandraDaemon.java (line 270) 
> Cassandra shutting down...
>  INFO [main] 2011-10-03 23:20:46,434 CassandraDaemon.java (line 218) Stop 
> listening to thrift clients
> Re-running this using the 1.0.0 branch (following the same "hack" procedure), 
> C* stops properly, and the following is observed in the log:
>  INFO [main] 2011-10-04 05:02:08,048 AbstractCassandraDaemon.java (line 270) 
> Cassandra shutting down...
>  INFO [main] 2011-10-04 05:02:08,049 CassandraDaemon.java (line 218) Stop 
> listening to thrift clients
>  INFO [Thread-2] 2011-10-04 05:02:08,318 MessagingService.java (line 482) 
> Shutting down MessageService...
>  INFO [Thread-2] 2011-10-04 05:02:08,319 MessagingService.java (line 497) 
> Waiting for in-progress requests to complete
>  INFO [ACCEPT-/10.83.77.171] 2011-10-04 05:02:08,319 MessagingService.java 
> (line 637) MessagingService shutting down server thread.
> could this be related to CASSANDRA-3261 ?





[jira] [Commented] (CASSANDRA-2611) static block in AbstractCassandraDaemon makes it difficult to change log4j behavoiur

2011-10-26 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135951#comment-13135951
 ] 

T Jake Luciani commented on CASSANDRA-2611:
---

As David mentioned, CASSANDRA-3061 fixed this, so if you don't pass 
defaultInitOverride the code isn't run.

> static block in AbstractCassandraDaemon makes it difficult to change log4j 
> behavoiur
> 
>
> Key: CASSANDRA-2611
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2611
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.4, 0.7.5, 1.0.0
> Environment: Windows 7
>Reporter: Paul Loy
>Assignee: Tommy Tynjä
>Priority: Minor
>  Labels: daemon, initialisation, log4j
> Attachments: CASSANDRA-2611-test.patch, CASSANDRA-2611.patch
>
>
> We embed Cassandra in our application - mainly because our webservices are 
> such a thin layer on top of Cassandra that it really does not make sense for 
> us to have Cassandra in an external JVM. In 0.7.0 this was all fine. Now 
> upgrading to 0.7.5, there is a static block in AbstractCassandraDaemon. This 
> gets called when the class is loaded causing us issues as we have not 
> generated the log4j.properties file at this point in time.
> Can this not be a protected method that is called when 
> AbstractCassandraDaemon is constructed? That way a) I can control the 
> behaviour and b) my log4j.properties file will have been generated by then.
> Thanks.





[jira] [Commented] (CASSANDRA-3302) stop Cassandra result in hang

2011-10-24 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134099#comment-13134099
 ] 

T Jake Luciani commented on CASSANDRA-3302:
---

Jackson, is this still happening? Or is it OK to close?

> stop Cassandra result in hang
> -
>
> Key: CASSANDRA-3302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3302
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.0.1
>Reporter: Jackson Chung
>Assignee: T Jake Luciani
> Fix For: 1.0.1
>
>
> Testing this under trunk via a hacked package (replacing jars from a 0.8.6 deb 
> installation).
> When calling service cassandra stop, the Cassandra process hangs:
> http://aep.appspot.com/display/i6aIUCkt4kz0HG5l2VszMM7QvLo/
> The following is observed in the C* log:
>  INFO [main] 2011-10-03 23:20:46,434 AbstractCassandraDaemon.java (line 270) 
> Cassandra shutting down...
>  INFO [main] 2011-10-03 23:20:46,434 CassandraDaemon.java (line 218) Stop 
> listening to thrift clients
> Re-running this using the 1.0.0 branch (following the same "hack" procedure), 
> C* stops properly, and the following is observed in the log:
>  INFO [main] 2011-10-04 05:02:08,048 AbstractCassandraDaemon.java (line 270) 
> Cassandra shutting down...
>  INFO [main] 2011-10-04 05:02:08,049 CassandraDaemon.java (line 218) Stop 
> listening to thrift clients
>  INFO [Thread-2] 2011-10-04 05:02:08,318 MessagingService.java (line 482) 
> Shutting down MessageService...
>  INFO [Thread-2] 2011-10-04 05:02:08,319 MessagingService.java (line 497) 
> Waiting for in-progress requests to complete
>  INFO [ACCEPT-/10.83.77.171] 2011-10-04 05:02:08,319 MessagingService.java 
> (line 637) MessagingService shutting down server thread.
> could this be related to CASSANDRA-3261 ?





[jira] [Commented] (CASSANDRA-3045) Update ColumnFamilyOutputFormat to use new bulkload API

2011-10-23 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133717#comment-13133717
 ] 

T Jake Luciani commented on CASSANDRA-3045:
---

We could write an alternate CFOF, such as a BulkColumnFamilyOutputFormat, that 
can be used when the TaskTracker (TT) is running on the same node as Cassandra.
The reducer would write files to hadoop.local.dir; then, when the reducer is 
closed, it would contact the local Cassandra instance via JMX with the output 
dir to be loaded via streaming.



> Update ColumnFamilyOutputFormat to use new bulkload API
> ---
>
> Key: CASSANDRA-3045
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3045
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Hadoop
>Reporter: Jonathan Ellis
>Assignee: Brandon Williams
>Priority: Minor
> Fix For: 1.1
>
>
> The bulk loading interface added in CASSANDRA-1278 is a great fit for Hadoop 
> jobs.





[jira] [Commented] (CASSANDRA-3372) Make HSHA cached threads.

2011-10-21 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132887#comment-13132887
 ] 

T Jake Luciani commented on CASSANDRA-3372:
---

+1

> Make HSHA cached threads.
> -
>
> Key: CASSANDRA-3372
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3372
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.8.3
>Reporter: Vijay
>Assignee: Vijay
>Priority: Minor
>  Labels: thrift
> Fix For: 1.0.1
>
> Attachments: 0001-update-to-cache-the-threads-for-tpe.patch
>
>
> JDK's newCachedTP does the following. This is similar to 
> ACD.CleaningThreadPool:
> public static ExecutorService newCachedThreadPool(ThreadFactory threadFactory) {
>     return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
>                                   60L, TimeUnit.SECONDS,
>                                   new SynchronousQueue<Runnable>(),
>                                   threadFactory);
> }





[jira] [Commented] (CASSANDRA-3372) Make HSHA cached threads.

2011-10-21 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132841#comment-13132841
 ] 

T Jake Luciani commented on CASSANDRA-3372:
---

Why 60 seconds, I wonder?

Can you quantify the improvement %?
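
For reference, the 60 seconds is the idle keep-alive of the JDK's cached pool: a 
worker that has had no task for that long is terminated. A sketch with the 
keep-alive made a parameter, so different values could be benchmarked; the class 
and method names are hypothetical and this is not the attached patch.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
final class CachedPoolSketch {
    /**
     * Same shape as Executors.newCachedThreadPool: no core threads, an
     * unbounded maximum, and a direct hand-off queue, but with a
     * caller-chosen idle keep-alive instead of the fixed 60 seconds.
     */
    static ThreadPoolExecutor newCachedPool(long keepAliveSeconds, ThreadFactory factory) {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                      keepAliveSeconds, TimeUnit.SECONDS,
                                      new SynchronousQueue<Runnable>(),
                                      factory);
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newCachedPool(5L, Executors.defaultThreadFactory());
        pool.execute(new Runnable() {
            @Override
            public void run() {
                System.out.println("ran on " + Thread.currentThread().getName());
            }
        });
        pool.shutdown(); // already-submitted tasks still run
    }
}
```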

> Make HSHA cached threads.
> -
>
> Key: CASSANDRA-3372
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3372
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.8.3
>Reporter: Vijay
>Assignee: Vijay
>Priority: Minor
>  Labels: thrift
> Fix For: 1.0.1
>
> Attachments: 0001-update-to-cache-the-threads-for-tpe.patch
>
>
> JDK's newCachedTP does the following. This is similar to 
> ACD.CleaningThreadPool:
> public static ExecutorService newCachedThreadPool(ThreadFactory threadFactory) {
>     return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
>                                   60L, TimeUnit.SECONDS,
>                                   new SynchronousQueue<Runnable>(),
>                                   threadFactory);
> }





[jira] [Commented] (CASSANDRA-3374) CQL can't create column with compression or that use leveled compaction

2011-10-20 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132022#comment-13132022
 ] 

T Jake Luciani commented on CASSANDRA-3374:
---

There is also no way to specify an index_type.

> CQL can't create column with compression or that use leveled compaction
> ---
>
> Key: CASSANDRA-3374
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3374
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Sylvain Lebresne
>Assignee: Pavel Yaskevich
>Priority: Minor
> Fix For: 1.0.1
>
>
> Looking at CreateColumnFamilyStatement.java, it doesn't seem CQL can create 
> compressed column families, nor define a compaction strategy.





[jira] [Commented] (CASSANDRA-1034) Remove assumption that Key to Token is one-to-one

2011-10-12 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125883#comment-13125883
 ] 

T Jake Luciani commented on CASSANDRA-1034:
---

My view is that a Key requires a Token in our system. I understand that you can't 
keep multiple keys from mapping to the same token; still, I would have liked to 
see the code deal with Tokens with (optional) keys rather than a mix of keys and 
tokens.  I see now that this idea is broken, in the sense that sorting a list of 
tokens means different things depending on the context (partitioner bounds vs. 
user-defined range).


> Remove assumption that Key to Token is one-to-one
> -
>
> Key: CASSANDRA-1034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1034
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stu Hood
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.1
>
> Attachments: 
> 0001-Make-range-accept-both-Token-and-DecoratedKey.patch, 
> 0002-LengthPartitioner.patch, 1034-1-Generify-AbstractBounds-v3.patch, 
> 1034-2-Remove-assumption-that-token-and-keys-are-one-to-one-v3.patch, 
> 1034_v1.txt, CASSANDRA-1034.patch
>
>
> get_range_slices assumes that Tokens do not collide and converts a KeyRange 
> to an AbstractBounds. For RandomPartitioner, this assumption isn't safe, and 
> would lead to a very weird heisenberg.
> Converting AbstractBounds to use a DecoratedKey would solve this, because the 
> byte[] key portion of the DecoratedKey can act as a tiebreaker. 
> Alternatively, we could make DecoratedKey extend Token, and then use 
> DecoratedKeys in places where collisions are unacceptable.





[jira] [Commented] (CASSANDRA-1034) Remove assumption that Key to Token is one-to-one

2011-10-12 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125781#comment-13125781
 ] 

T Jake Luciani commented on CASSANDRA-1034:
---

bq. A token is intrinsically a range, a segment on the ring. 

But the whole point of the ticket is to remove this concept. Are you saying 
that can't be guaranteed?

This should be possible by making equals consider the token AND the key.  The 
problem with CASSANDRA-1733 is that sometimes we don't specify a key, since we 
have a Min token and an intrinsic Max token.  
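
A sketch of what "equals considers the token AND the key" could look like, using 
a long as a stand-in for the token. The class is hypothetical and is not 
DecoratedKey; in particular, sorting a missing (null) key before every real key 
with the same token is an assumption made here, and it does not settle the 
Min/Max token question.

```java
import java.util.Arrays;

// Hypothetical stand-in for a token with an optional key; illustration only.
final class RingKeySketch implements Comparable<RingKeySketch> {
    final long token;   // stand-in for a partitioner token
    final byte[] key;   // null when only a token is known

    RingKeySketch(long token, byte[] key) {
        this.token = token;
        this.key = key;
    }

    @Override
    public int compareTo(RingKeySketch other) {
        int cmp = Long.compare(token, other.token);
        if (cmp != 0) return cmp;
        // Tokens collide: fall back to the key as a tiebreaker.
        if (key == null || other.key == null) {
            // Assumption: a missing key sorts before any real key.
            return (key == null ? 0 : 1) - (other.key == null ? 0 : 1);
        }
        int minLen = Math.min(key.length, other.key.length);
        for (int i = 0; i < minLen; i++) {
            int diff = (key[i] & 0xff) - (other.key[i] & 0xff); // unsigned bytes
            if (diff != 0) return diff;
        }
        return key.length - other.key.length;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof RingKeySketch && compareTo((RingKeySketch) o) == 0;
    }

    @Override
    public int hashCode() {
        return 31 * Long.hashCode(token) + Arrays.hashCode(key);
    }
}
```

Two keys that hash to the same token then compare as different positions, so a 
range scan cannot silently merge or skip them.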

> Remove assumption that Key to Token is one-to-one
> -
>
> Key: CASSANDRA-1034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1034
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stu Hood
>Assignee: Pavel Yaskevich
>Priority: Minor
> Fix For: 1.1
>
> Attachments: 
> 0001-Make-range-accept-both-Token-and-DecoratedKey.patch, 
> 0002-LengthPartitioner.patch, 1034-1-Generify-AbstractBounds-v3.patch, 
> 1034-2-Remove-assumption-that-token-and-keys-are-one-to-one-v3.patch, 
> 1034_v1.txt, CASSANDRA-1034.patch
>
>
> get_range_slices assumes that Tokens do not collide and converts a KeyRange 
> to an AbstractBounds. For RandomPartitioner, this assumption isn't safe, and 
> would lead to a very weird heisenberg.
> Converting AbstractBounds to use a DecoratedKey would solve this, because the 
> byte[] key portion of the DecoratedKey can act as a tiebreaker. 
> Alternatively, we could make DecoratedKey extend Token, and then use 
> DecoratedKeys in places where collisions are unacceptable.





[jira] [Commented] (CASSANDRA-1034) Remove assumption that Key to Token is one-to-one

2011-10-11 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125379#comment-13125379
 ] 

T Jake Luciani commented on CASSANDRA-1034:
---

@Sylvain This is all really confusing, and I agree the core of the ticket is to 
make key->token 1:1.

The core of the problem was initially explained in CASSANDRA-1733:

bq. A Range object (which Hadoop splits generate) is start-exclusive. A Bounds 
object (which normal user scan queries generate) is start-inclusive.

So making Token the only way to deal with keys feels like a more 
consistent API.  Since Key can be null, it needs to be Token that becomes the 
primary internal class. 

In your impl we now have DK, Token, and RingPosition, which to me is more 
confusing than having one Token class.
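
To make the Range vs. Bounds distinction from CASSANDRA-1733 concrete, here is a 
small sketch using longs as stand-in tokens. The class and method names are 
hypothetical, both intervals are assumed to be end-inclusive, and ring 
wrap-around is ignored.

```java
// Hypothetical illustration of the two interval conventions; not Cassandra code.
final class IntervalSketch {
    /** Range-style check: start-exclusive, i.e. (left, right]. */
    static boolean inRange(long left, long right, long token) {
        return token > left && token <= right;
    }

    /** Bounds-style check: start-inclusive, i.e. [left, right]. */
    static boolean inBounds(long left, long right, long token) {
        return token >= left && token <= right;
    }

    public static void main(String[] args) {
        // The start token itself is where the two conventions disagree.
        System.out.println(inRange(10, 20, 10));  // false
        System.out.println(inBounds(10, 20, 10)); // true
    }
}
```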




> Remove assumption that Key to Token is one-to-one
> -
>
> Key: CASSANDRA-1034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1034
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stu Hood
>Assignee: Pavel Yaskevich
>Priority: Minor
> Fix For: 1.1
>
> Attachments: 
> 0001-Make-range-accept-both-Token-and-DecoratedKey.patch, 
> 0002-LengthPartitioner.patch, 1034-1-Generify-AbstractBounds-v3.patch, 
> 1034-2-Remove-assumption-that-token-and-keys-are-one-to-one-v3.patch, 
> 1034_v1.txt, CASSANDRA-1034.patch
>
>
> get_range_slices assumes that Tokens do not collide and converts a KeyRange 
> to an AbstractBounds. For RandomPartitioner, this assumption isn't safe, and 
> would lead to a very weird heisenberg.
> Converting AbstractBounds to use a DecoratedKey would solve this, because the 
> byte[] key portion of the DecoratedKey can act as a tiebreaker. 
> Alternatively, we could make DecoratedKey extend Token, and then use 
> DecoratedKeys in places where collisions are unacceptable.





[jira] [Commented] (CASSANDRA-1034) Remove assumption that Key to Token is one-to-one

2011-10-10 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13124468#comment-13124468
 ] 

T Jake Luciani commented on CASSANDRA-1034:
---

At first glance I like this because it makes the Token first class and the key 
not required, cleaning up the code as below.

{code}
-DecoratedKey startWith = new DecoratedKey(range.left, null);
-DecoratedKey stopAt = new DecoratedKey(range.right, null);
+Token startWith = range.left;
+Token stopAt = range.right;
{code}

> Remove assumption that Key to Token is one-to-one
> -
>
> Key: CASSANDRA-1034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1034
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stu Hood
>Assignee: Pavel Yaskevich
>Priority: Minor
> Fix For: 1.1
>
> Attachments: 
> 0001-Make-range-accept-both-Token-and-DecoratedKey.patch, 
> 0002-LengthPartitioner.patch, 1034-1-Generify-AbstractBounds-v3.patch, 
> 1034-2-Remove-assumption-that-token-and-keys-are-one-to-one-v3.patch, 
> 1034_v1.txt, CASSANDRA-1034.patch
>
>
> get_range_slices assumes that Tokens do not collide and converts a KeyRange 
> to an AbstractBounds. For RandomPartitioner, this assumption isn't safe, and 
> would lead to a very weird heisenberg.
> Converting AbstractBounds to use a DecoratedKey would solve this, because the 
> byte[] key portion of the DecoratedKey can act as a tiebreaker. 
> Alternatively, we could make DecoratedKey extend Token, and then use 
> DecoratedKeys in places where collisions are unacceptable.





[jira] [Commented] (CASSANDRA-3286) Performance issue in ByteBufferUtil

2011-09-30 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118243#comment-13118243
 ] 

T Jake Luciani commented on CASSANDRA-3286:
---

Fixed in the new attachment.

> Performance issue in ByteBufferUtil
> ---
>
> Key: CASSANDRA-3286
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3286
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 1.0.1
>
> Attachments: 
> v1-0001-CASSANDRA-3286-perf-improvement-fix-and-fbutilities-ca.txt
>
>
> Profiling 1.0 we can see ByteBufferUtil.compareUnsigned is slow.
> {code}
> Excl.     Incl.     Incl.      Incl.      Name
> User CPU  User CPU  Sync Wait  Sync Wait
> sec.      sec.      sec.       Count
> 318.491   318.491   1.400      113786
> 40.561    40.561    0.         0
> 18.972    19.093    0.         0          @0xd949 ()
> 17.718    18.730    0.         0          sun.security.provider.MD5.implCompress(byte[], int)
> 14.396    14.396    0.         0          __pthread_cond_signal
> 8.908     8.908     0.         0          org.apache.cassandra.utils.ByteBufferUtil.compareUnsigned(java.nio.ByteBuffer, java.nio.ByteBuffer)
> 7.435     7.688     0.         0          __pthread_cond_timedwait
> 7.127     7.182     0.         0          @0xd8c9 ()
> 7.072     7.072     0.         0          jbyte_disjoint_arraycopy
> 6.764     39.065    0.         0          org.apache.cassandra.utils.ReducingIterator.computeNext()
> 6.533     17.575    0.         0          java.util.concurrent.ConcurrentSkipListMap.doPut(java.lang.Object, java.lang.Object, boolean)
> 6.346     6.346     0.         0          com.sun.crypto.provider.SunJCE_c.a(byte[], int, byte[], int)
> 5.378     5.433     0.         0          send
> 4.861     6.643     0.000      1          org.apache.cassandra.utils.ByteBufferUtil.read(java.io.DataInput, int)
> 4.410     9.260     0.         0          org.apache.commons.collections.iterators.CollatingIterator.least()
> 4.355     4.355     0.         0          java.io.ByteArrayOutputStream.write(int)
> 4.300     6.632     0.         0          java.io.ByteArrayOutputStream.write(byte[], int, int)
> 3.827     30.190    0.         0          org.apache.cassandra.dht.RandomPartitioner.decorateKey(java.nio.ByteBuffer)
> 3.783     23.954    0.         0          org.apache.cassandra.utils.FBUtilities.hash(java.nio.ByteBuffer[])
> 3.783     3.860     0.         0          @0xd486c ()
> 3.739     3.739     0.         0          clock_gettime
> {code}
> We can avoid the problem when the ByteBuffer has a backing array





[jira] [Commented] (CASSANDRA-3286) Performance issue in ByteBufferUtil

2011-09-30 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118212#comment-13118212
 ] 

T Jake Luciani commented on CASSANDRA-3286:
---

If you look at FBUtilities.compareUnsigned, its arguments are offsets, not lengths :(

> Performance issue in ByteBufferUtil
> ---
>
> Key: CASSANDRA-3286
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3286
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 1.0.1
>
> Attachments: v1-0001-CASSANDRA-3286-perf-improvement.txt
>
>





[jira] [Commented] (CASSANDRA-3264) Add wide row paging for ColumnFamilyInputFormat and ColumnFamilyOutputFormat

2011-09-29 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117765#comment-13117765
 ] 

T Jake Luciani commented on CASSANDRA-3264:
---

So it would be one Map per column? That would work too, but I think this is a 
big enough issue for folks running with time-series data that we should fix 
it before we jump to CQL.

> Add wide row paging for ColumnFamilyInputFormat and ColumnFamilyOutputFormat
> 
>
> Key: CASSANDRA-3264
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3264
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Hadoop
>Reporter: T Jake Luciani
>  Labels: lhf
> Fix For: 0.8.7
>
>
> Hadoop input/output formats can currently run out of memory (OOM) on wide rows.
> We can add a new option to the ConfigHelper, such as columnPagingSize, with a 
> default of Integer.MAX_VALUE.
> The input format would page the row internally rather than pull it over all at 
> once.
> The output format could also use this to avoid sending huge rows over at once.
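
The paging proposed in the description can be sketched roughly as follows. This is only an illustration under stated assumptions: the row is modeled as an in-memory TreeMap, and fetchPage is a hypothetical stand-in for a slice query against the cluster. Only the option name columnPagingSize comes from the description; no real Cassandra or Hadoop API is used.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableMap;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Illustrative sketch only. WideRowPagingSketch, fetchPage and pagedColumns
// are hypothetical names, not part of ColumnFamilyInputFormat.
class WideRowPagingSketch {

    // Stand-in for a slice query: returns at most `limit` column names that
    // sort strictly after `after` (null means "from the start of the row").
    static List<String> fetchPage(TreeMap<String, String> row, String after, int limit) {
        NavigableMap<String, String> tail = (after == null) ? row : row.tailMap(after, false);
        List<String> page = new ArrayList<>();
        for (String name : tail.keySet()) {
            if (page.size() >= limit) {
                break;
            }
            page.add(name);
        }
        return page;
    }

    // Iterates over a wide row while holding at most columnPagingSize columns.
    static Iterator<String> pagedColumns(TreeMap<String, String> row, int columnPagingSize) {
        return new Iterator<String>() {
            private List<String> page = fetchPage(row, null, columnPagingSize);
            private int index = 0;

            @Override
            public boolean hasNext() {
                if (index < page.size()) {
                    return true;
                }
                // An empty or short page means the row is exhausted.
                if (page.isEmpty() || page.size() < columnPagingSize) {
                    return false;
                }
                // Resume after the last column seen so far.
                page = fetchPage(row, page.get(page.size() - 1), columnPagingSize);
                index = 0;
                return !page.isEmpty();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return page.get(index++);
            }
        };
    }
}
```

With columnPagingSize left at Integer.MAX_VALUE, the first page already contains the whole row, so the sketch degenerates to the current fetch-everything behaviour.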





[jira] [Commented] (CASSANDRA-3264) Add wide row paging for ColumnFamilyInputFormat and ColumnFamilyOutputFormat

2011-09-27 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115520#comment-13115520
 ] 

T Jake Luciani commented on CASSANDRA-3264:
---

The MapReduce (MR) job would get multiple rows rather than one big one.

> Add wide row paging for ColumnFamilyInputFormat and ColumnFamilyOutputFormat
> 
>
> Key: CASSANDRA-3264
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3264
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Hadoop
>Reporter: T Jake Luciani
>  Labels: lhf
> Fix For: 0.8.7
>
>
> Hadoop input/output formats currently can OOM on wide rows.
> We can add a new option to the ConfigHelper like columnPagingSize with a 
> default of Integer.MAX_VALUE.
> The input format would page the row internally rather than pull it over at 
> once.
> The output format could also use this to avoid sending huge rows over at once.





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-27 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115468#comment-13115468
 ] 

T Jake Luciani commented on CASSANDRA-3150:
---

bq. This is the split that's receiving new data (5-10k rows/second).

So how many memtables do you have at once, and how many rows can fit in a 
memtable?  If you have large memtables and tiny rows, that would throw 
getSplits() off, since the splits are generated from SSTables only.

> ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
> whack)
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4, 0.8.5
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Fix For: 0.8.6
>
> Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
> task_201109212019_1060_m_29 - Mozilla Firefox.png, Screenshot-Hadoop map 
> task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log, 
> fullscan-example1.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}
