[jira] [Commented] (CASSANDRA-2915) Lucene based Secondary Indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093341#comment-13093341 ]

Ed Anuff commented on CASSANDRA-2915:
-------------------------------------

+1 on having the ability to provide a conversion class for handling transformations from columns to Lucene documents. It's not uncommon for people to store objects serialized to JSON or some other serialization format in columns. CQL will have to catch up with this practice at some point.

> Lucene based Secondary Indexes
> ------------------------------
>
> Key: CASSANDRA-2915
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2915
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: T Jake Luciani
> Assignee: Jason Rutherglen
> Labels: secondary_index
>
> Secondary indexes (of type KEYS) suffer from a number of limitations in their current form:
> - Multiple IndexClauses only work when there is a subset of rows under the highest clause
> - One new column family is created per index; this means 10 new CFs for 10 secondary indexes
>
> This ticket will use the Lucene library to implement secondary indexes as one index per CF, and utilize the Lucene query engine to handle multiple index clauses. Also, by using Lucene we get a highly optimized file format.
>
> There are a few parallels we can draw between Cassandra and Lucene. Lucene indexes segments in memory and then flushes them to disk, so we can sync our memtable flushes to Lucene flushes. Lucene also has optimize(), which correlates to our compaction process, so these can be synced as well.
>
> We will also need to correlate column validators to Lucene tokenizers so the data can be stored properly. The big win is that once this is done we can perform complex queries within a column, like wildcard searches.
>
> The downside of this approach is that we will need to read before write, since documents in Lucene are written as complete documents. For random workloads with lots of indexed columns, this means we need to read the document from the index, update it, and write it back.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2982) Refactor secondary index api
[ https://issues.apache.org/jira/browse/CASSANDRA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082149#comment-13082149 ]

Ed Anuff commented on CASSANDRA-2982:
-------------------------------------

Given this level of abstraction, I wonder if Jake's original suggestion about replacing the IndexType enum with a class name makes more sense?

> Refactor secondary index api
> ----------------------------
>
> Key: CASSANDRA-2982
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2982
> Project: Cassandra
> Issue Type: Sub-task
> Components: Core
> Reporter: T Jake Luciani
> Assignee: T Jake Luciani
> Fix For: 1.0
>
> Attachments: 2982-v1.txt
>
> Secondary indexes currently make some bad assumptions about the underlying indexes:
> 1. That they are always stored in other column families.
> 2. That there is a unique index per column.
> In the case of CASSANDRA-2915 neither of these is true. The new api should abstract the search concepts and allow any search api to plug in.
>
> Once the code is refactored and basically pluggable we can remove the IndexType enum and use class names, similar to how we handle partitioners and comparators.
>
> Basic api is to add a SecondaryIndexManager that handles different index types per CF and a SecondaryIndex base class that handles a particular type implementation.
>
> This requires major changes to ColumnFamilyStore and Table.IndexBuilder
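Editorial illustration of the "class name instead of enum" idea discussed above. This is only a minimal sketch: SketchSecondaryIndex, SketchKeysIndex and SketchIndexLoader are hypothetical names invented for the example and are not the classes from the 2982 patch. It shows one way an index implementation could be selected by a configured class name and instantiated through reflection.

```java
// Hypothetical base class standing in for a pluggable index type (an assumption, not the patch's API).
abstract class SketchSecondaryIndex {
    abstract String describe();
}

// Hypothetical implementation that would previously have been selected by an enum value.
final class SketchKeysIndex extends SketchSecondaryIndex {
    @Override
    String describe() {
        return "keys";
    }
}

final class SketchIndexLoader {
    // Resolves the configured class name, checks that it really is an index type,
    // and instantiates it through its no-argument constructor.
    static SketchSecondaryIndex load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return cls.asSubclass(SketchSecondaryIndex.class)
                      .getDeclaredConstructor()
                      .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load index class " + className, e);
        }
    }

    public static void main(String[] args) {
        SketchSecondaryIndex index = load(SketchKeysIndex.class.getName());
        System.out.println(index.describe());
    }
}
```

The trade-off in this sketch is that a misspelled or unrelated class name is only detected when the name is resolved, so the loader rejects it with an explicit error rather than relying on compile-time checking as an enum would.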
[jira] [Updated] (CASSANDRA-2684) IntergerType uses Thrift method that attempts to unsafely access backing array of ByteBuffer and fails
[ https://issues.apache.org/jira/browse/CASSANDRA-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ed Anuff updated CASSANDRA-2684:
--------------------------------

Attachment: 2684.txt

> IntergerType uses Thrift method that attempts to unsafely access backing array of ByteBuffer and fails
> ------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-2684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2684
> Project: Cassandra
> Issue Type: Bug
> Reporter: Ed Anuff
> Assignee: Ed Anuff
> Attachments: 2684.txt
>
> I get the following exception:
> {noformat}
> ERROR 13:27:38,153 Fatal exception in thread Thread[ReadStage:36,5,main]
> java.lang.RuntimeException: java.lang.UnsupportedOperationException
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.UnsupportedOperationException
>     at java.nio.ByteBuffer.array(ByteBuffer.java:940)
>     at org.apache.thrift.TBaseHelper.byteBufferToByteArray(TBaseHelper.java:264)
>     at org.apache.thrift.TBaseHelper.byteBufferToByteArray(TBaseHelper.java:251)
>     at org.apache.cassandra.db.marshal.IntegerType.getString(IntegerType.java:136)
>     at org.apache.cassandra.db.marshal.AbstractCompositeType.getString(AbstractCompositeType.java:131)
>     at org.apache.cassandra.db.Column.getString(Column.java:228)
>     at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:123)
>     at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:130)
>     at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1303)
>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1188)
>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1145)
>     at org.apache.cassandra.db.Table.getRow(Table.java:385)
>     at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:61)
>     at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:641)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>     ... 3 more
> {noformat}
> Tracing it down, I find that IntegerType's getString() method looks like this:
> {code:title=IntegerType.java|borderStyle=solid}
> public String getString(ByteBuffer bytes)
> {
>     if (bytes == null)
>         return "null";
>     if (bytes.remaining() == 0)
>         return "empty";
>     return new java.math.BigInteger(TBaseHelper.byteBufferToByteArray(bytes)).toString(10);
> }
> {code}
>
> TBaseHelper.byteBufferToByteArray() looks like this:
> {code:title=TBaseHelper.java|borderStyle=solid}
> public static byte[] byteBufferToByteArray(ByteBuffer byteBuffer) {
>     if (wrapsFullArray(byteBuffer)) {
>         return byteBuffer.array();
>     }
>     byte[] target = new byte[byteBuffer.remaining()];
>     byteBufferToByteArray(byteBuffer, target, 0);
>     return target;
> }
> public static boolean wrapsFullArray(ByteBuffer byteBuffer) {
>     return byteBuffer.hasArray()
>         && byteBuffer.position() == 0
>         && byteBuffer.arrayOffset() == 0
>         && byteBuffer.remaining() == byteBuffer.capacity();
> }
> public static int byteBufferToByteArray(ByteBuffer byteBuffer, byte[] target, int offset) {
>     int remaining = byteBuffer.remaining();
>     System.arraycopy(byteBuffer.array(),
>         byteBuffer.arrayOffset() + byteBuffer.position(),
>         target,
>         offset,
>         remaining);
>     return remaining;
> }
> {code}
> The second (three-argument) overload of byteBufferToByteArray calls the ByteBuffer's array() method without first checking hasArray().
> Suggested fixes:
> 1) Don't use TBaseHelper in IntegerType.getString(), use ByteBufferUtil.getArray()
> 2) Report problem upstream to Thrift.
> 3) Find a better way to deserialize BigIntegers that doesn't require an array copy.
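Editorial illustration of suggested fix 1 in the report above. This is a minimal sketch under stated assumptions, not the attached 2684.txt patch and not the body of ByteBufferUtil.getArray(): ByteBufferCopySketch, copyRemaining and integerToString are hypothetical names. It copies the readable bytes through ByteBuffer.get() on a duplicate, so it never touches array() and behaves the same for heap, read-only and direct buffers.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;

final class ByteBufferCopySketch {
    // Copies the remaining bytes without calling array(). The duplicate shares the
    // content but has its own position and limit, so the caller's buffer is not advanced.
    static byte[] copyRemaining(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate();
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }

    // Same shape as the getString() quoted in the report, with the Thrift helper swapped out.
    static String integerToString(ByteBuffer bytes) {
        if (bytes == null)
            return "null";
        if (bytes.remaining() == 0)
            return "empty";
        return new BigInteger(copyRemaining(bytes)).toString(10);
    }

    public static void main(String[] args) {
        // A read-only buffer reports hasArray() == false, which is the case that broke array().
        ByteBuffer readOnly = ByteBuffer.wrap(new byte[] { 0x01, 0x00 }).asReadOnlyBuffer();
        System.out.println(integerToString(readOnly));
    }
}
```

The copy is still an array copy, so this addresses fixes 1 and 2 only; avoiding the copy altogether (fix 3) is left open here.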
[jira] [Updated] (CASSANDRA-2684) IntergerType uses Thrift method that attempts to unsafely access backing array of ByteBuffer and fails
[ https://issues.apache.org/jira/browse/CASSANDRA-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Anuff updated CASSANDRA-2684: Description: I get the following exception: {noformat} ERROR 13:27:38,153 Fatal exception in thread Thread[ReadStage:36,5,main] java.lang.RuntimeException: java.lang.UnsupportedOperationException at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:680) Caused by: java.lang.UnsupportedOperationException at java.nio.ByteBuffer.array(ByteBuffer.java:940) at org.apache.thrift.TBaseHelper.byteBufferToByteArray(TBaseHelper.java:264) at org.apache.thrift.TBaseHelper.byteBufferToByteArray(TBaseHelper.java:251) at org.apache.cassandra.db.marshal.IntegerType.getString(IntegerType.java:136) at org.apache.cassandra.db.marshal.AbstractCompositeType.getString(AbstractCompositeType.java:131) at org.apache.cassandra.db.Column.getString(Column.java:228) at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:123) at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:130) at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1303) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1188) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1145) at org.apache.cassandra.db.Table.getRow(Table.java:385) at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:61) at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:641) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) ... 
3 more {noformat} Tracing it down, I find that IntegerType's getString method() looks like this: {code:title=IntegerType.java|borderStyle=solid} public String getString(ByteBuffer bytes) { if (bytes == null) return "null"; if (bytes.remaining() == 0) return "empty"; return new java.math.BigInteger(TBaseHelper.byteBufferToByteArray(bytes)).toString(10); } {code} TBaseHelper.byteBufferToByteArray() looks like this: {code:title=TBaseHelper.java|borderStyle=solid} public static byte[] byteBufferToByteArray(ByteBuffer byteBuffer) { if (wrapsFullArray(byteBuffer)) { return byteBuffer.array(); } byte[] target = new byte[byteBuffer.remaining()]; byteBufferToByteArray(byteBuffer, target, 0); return target; } public static boolean wrapsFullArray(ByteBuffer byteBuffer) { return byteBuffer.hasArray() && byteBuffer.position() == 0 && byteBuffer.arrayOffset() == 0 && byteBuffer.remaining() == byteBuffer.capacity(); } public static int byteBufferToByteArray(ByteBuffer byteBuffer, byte[] target, int offset) { int remaining = byteBuffer.remaining(); System.arraycopy(byteBuffer.array(), byteBuffer.arrayOffset() + byteBuffer.position(), target, offset, remaining); return remaining; } {code} The second overloaded implementation of byteBufferToByteArray is calling the bytebuffer's array() method. Suggested fixes: 1) Don't use TBaseHelper in IntegerType.getString(), use ByteBufferUtil.getArray() 2) Report problem upstream to Thrift. 3) Find a better way to deserialize BigIntegers that doesn't require an array copy. 
[jira] [Created] (CASSANDRA-2684) IntergerType uses Thrift method that attempts to unsafely access backing array of ByteBuffer and fails
IntergerType uses Thrift method that attempts to unsafely access backing array of ByteBuffer and fails -- Key: CASSANDRA-2684 URL: https://issues.apache.org/jira/browse/CASSANDRA-2684 Project: Cassandra Issue Type: Bug Reporter: Ed Anuff I get the following exception: ERROR 13:27:38,153 Fatal exception in thread Thread[ReadStage:36,5,main] java.lang.RuntimeException: java.lang.UnsupportedOperationException at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:680) Caused by: java.lang.UnsupportedOperationException at java.nio.ByteBuffer.array(ByteBuffer.java:940) at org.apache.thrift.TBaseHelper.byteBufferToByteArray(TBaseHelper.java:264) at org.apache.thrift.TBaseHelper.byteBufferToByteArray(TBaseHelper.java:251) at org.apache.cassandra.db.marshal.IntegerType.getString(IntegerType.java:136) at org.apache.cassandra.db.marshal.AbstractCompositeType.getString(AbstractCompositeType.java:131) at org.apache.cassandra.db.Column.getString(Column.java:228) at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:123) at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:130) at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1303) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1188) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1145) at org.apache.cassandra.db.Table.getRow(Table.java:385) at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:61) at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:641) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) ... 
3 more Tracing it down, I find that IntegerType's getString method() looks like this: public String getString(ByteBuffer bytes) { if (bytes == null) return "null"; if (bytes.remaining() == 0) return "empty"; return new java.math.BigInteger(TBaseHelper.byteBufferToByteArray(bytes)).toString(10); } TBaseHelper.byteBufferToByteArray() looks like this: public static byte[] byteBufferToByteArray(ByteBuffer byteBuffer) { if (wrapsFullArray(byteBuffer)) { return byteBuffer.array(); } byte[] target = new byte[byteBuffer.remaining()]; byteBufferToByteArray(byteBuffer, target, 0); return target; } public static boolean wrapsFullArray(ByteBuffer byteBuffer) { return byteBuffer.hasArray() && byteBuffer.position() == 0 && byteBuffer.arrayOffset() == 0 && byteBuffer.remaining() == byteBuffer.capacity(); } public static int byteBufferToByteArray(ByteBuffer byteBuffer, byte[] target, int offset) { int remaining = byteBuffer.remaining(); System.arraycopy(byteBuffer.array(), byteBuffer.arrayOffset() + byteBuffer.position(), target, offset, remaining); return remaining; } The second overloaded implementation of byteBufferToByteArray is calling the bytebuffer's array() method. Suggested fixes: 1) Don't use TBaseHelper in IntegerType.getString(), use ByteBufferUtil.getArray() 2) Report problem upstream to Thrift. 3) Find a better way to deserialize BigIntegers that doesn't require an array copy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2682) UUIDType assumes ByteBuffer has an accessible backing array
[ https://issues.apache.org/jira/browse/CASSANDRA-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ed Anuff updated CASSANDRA-2682:
--------------------------------

Attachment: (was: 2682.txt)

> UUIDType assumes ByteBuffer has an accessible backing array
> -----------------------------------------------------------
>
> Key: CASSANDRA-2682
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2682
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.8 beta 1
> Reporter: Ed Anuff
> Assignee: Ed Anuff
> Attachments: 2682.txt
>
> I'm very embarrassed to say this got left out in the UUIDType, but it's not doing a hasArray() check on the ByteBuffers passed to it, causing it to break. I'll make a patch to fix it.
[jira] [Updated] (CASSANDRA-2682) UUIDType assumes ByteBuffer has an accessible backing array
[ https://issues.apache.org/jira/browse/CASSANDRA-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ed Anuff updated CASSANDRA-2682:
--------------------------------

Attachment: 2682.txt

New patch that uses ByteBuffer.get() instead of direct array accesses.

> UUIDType assumes ByteBuffer has an accessible backing array
> -----------------------------------------------------------
>
> Key: CASSANDRA-2682
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2682
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.8 beta 1
> Reporter: Ed Anuff
> Assignee: Ed Anuff
> Attachments: 2682.txt
>
> I'm very embarrassed to say this got left out in the UUIDType, but it's not doing a hasArray() check on the ByteBuffers passed to it, causing it to break. I'll make a patch to fix it.
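Editorial illustration of reading a UUID through ByteBuffer getters rather than the backing array, as the patch note above describes. This is a sketch only, not the contents of 2682.txt: UuidBufferSketch and readUuid are hypothetical names, and the 16-byte big-endian layout (most significant long first) is an assumption of the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.UUID;

final class UuidBufferSketch {
    // Reads 16 bytes starting at the buffer's current position using absolute getters.
    // No array() or hasArray() is involved, so direct and read-only buffers work too.
    static UUID readUuid(ByteBuffer buffer) {
        if (buffer.remaining() != 16)
            throw new IllegalArgumentException("UUID must be exactly 16 bytes, got " + buffer.remaining());
        // Work on a duplicate with an explicit byte order so the caller's buffer is untouched.
        ByteBuffer view = buffer.duplicate().order(ByteOrder.BIG_ENDIAN);
        int start = view.position();
        long mostSignificant = view.getLong(start);
        long leastSignificant = view.getLong(start + 8);
        return new UUID(mostSignificant, leastSignificant);
    }

    public static void main(String[] args) {
        UUID original = UUID.randomUUID();
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // direct buffers have no accessible array
        direct.putLong(original.getMostSignificantBits()).putLong(original.getLeastSignificantBits());
        direct.flip();
        System.out.println(original.equals(readUuid(direct)));
    }
}
```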
[jira] [Updated] (CASSANDRA-2682) UUIDType assumes ByteBuffer has an accessible backing array
[ https://issues.apache.org/jira/browse/CASSANDRA-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ed Anuff updated CASSANDRA-2682:
--------------------------------

Attachment: 2682.txt

> UUIDType assumes ByteBuffer has an accessible backing array
> -----------------------------------------------------------
>
> Key: CASSANDRA-2682
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2682
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.8 beta 1
> Reporter: Ed Anuff
> Assignee: Ed Anuff
> Attachments: 2682.txt
>
> I'm very embarrassed to say this got left out in the UUIDType, but it's not doing a hasArray() check on the ByteBuffers passed to it, causing it to break. I'll make a patch to fix it.
[jira] [Created] (CASSANDRA-2682) UUIDType assumes ByteBuffer has an accessible backing array
UUIDType assumes ByteBuffer has an accessible backing array
-----------------------------------------------------------

Key: CASSANDRA-2682
URL: https://issues.apache.org/jira/browse/CASSANDRA-2682
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ed Anuff
Assignee: Ed Anuff

I'm very embarrassed to say this got left out in the UUIDType, but it's not doing a hasArray() check on the ByteBuffers passed to it, causing it to break. I'll make a patch to fix it.
[jira] [Commented] (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030100#comment-13030100 ]

Ed Anuff commented on CASSANDRA-2231:
-------------------------------------

It probably should just be a passthrough like it is for BytesType:

@Override
public ByteBuffer decompose(ByteBuffer value)
{
    return value;
}

Assuming that's put into AbstractCompositeType, it looks good to me. The unit tests are the same ones we've been using in the version of this we've been maintaining at https://github.com/riptano/hector-composite, which some folks are using in production.

> Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
> ---------------------------------------------------------------------------------------
>
> Key: CASSANDRA-2231
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2231
> Project: Cassandra
> Issue Type: New Feature
> Components: Contrib
> Reporter: Ed Anuff
> Assignee: Sylvain Lebresne
> Priority: Minor
> Fix For: 0.8.1
>
> Attachments: 0001-Add-compositeType-and-DynamicCompositeType-v2.patch, 0001-Add-compositeType-and-DynamicCompositeType-v3.patch, 0001-Add-compositeType-and-DynamicCompositeType_0.7.patch, CompositeType-and-DynamicCompositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip
>
> CompositeType is a custom comparer that makes it possible to create comparable composite values out of the basic types that Cassandra currently supports, such as Long, UUID, etc. This is very useful both in the creation of custom inverted indexes using columns in a skinny row, where each column name is a composite value, and when using Cassandra's built-in secondary index support, where it can be used to encode the values in the columns that Cassandra indexes. One scenario for the usage of these is documented here: http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for the contribution is attached and has previously been maintained on github here: https://github.com/edanuff/CassandraCompositeType
[jira] [Commented] (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029651#comment-13029651 ] Ed Anuff commented on CASSANDRA-2231: - Am I correct in that the v3 rebased to .8 doesn't include the decompose() methods? I had to add those to get the v3 patch to build. Once I did that, it passed the unit tests. > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: New Feature > Components: Contrib >Reporter: Ed Anuff >Assignee: Sylvain Lebresne >Priority: Minor > Fix For: 0.8.1 > > Attachments: > 0001-Add-compositeType-and-DynamicCompositeType-v2.patch, > 0001-Add-compositeType-and-DynamicCompositeType-v3.patch, > 0001-Add-compositeType-and-DynamicCompositeType_0.7.patch, > CompositeType-and-DynamicCompositeType.patch, > edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029400#comment-13029400 ] Ed Anuff commented on CASSANDRA-2231: - Yes, will do it today. > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: New Feature > Components: Contrib >Reporter: Ed Anuff >Assignee: Sylvain Lebresne >Priority: Minor > Fix For: 0.8.1 > > Attachments: > 0001-Add-compositeType-and-DynamicCompositeType-v2.patch, > 0001-Add-compositeType-and-DynamicCompositeType-v3.patch, > CompositeType-and-DynamicCompositeType.patch, > edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019419#comment-13019419 ] Ed Anuff commented on CASSANDRA-2231: - If you can add the -1, 0, 1 e-o-c suggestion, then I'm good with this. We've been using a preview version of your patch that we put up at https://github.com/riptano/hector-composite to make sure it works. So far it seems to meet every use case, with the exception of the previous questions from April 4 which you correctly pointed out couldn't be solved the way we were suggesting. So, I'm all for getting this out as soon as possible. > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Assignee: Sylvain Lebresne >Priority: Minor > Fix For: 0.7.5 > > Attachments: CompositeType-and-DynamicCompositeType.patch, > edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
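Editorial illustration of the end-of-component (e-o-c) byte mentioned in the comment above. This is a sketch under loudly stated assumptions, not the patch's code: CompositeEocSketch, putComponent and validate are hypothetical names, and the per-component layout used here (unsigned 16-bit length, value bytes, one e-o-c byte) is assumed for the example. The two rules it checks are the ones raised in this thread: the e-o-c byte is one of -1, 0 or 1, and it may be non-zero only on the last component.

```java
import java.nio.ByteBuffer;

final class CompositeEocSketch {
    // Appends one component in the assumed layout: 2-byte length, value, end-of-component byte.
    static void putComponent(ByteBuffer out, byte[] value, byte eoc) {
        out.putShort((short) value.length);
        out.put(value);
        out.put(eoc);
    }

    // Walks the components and returns how many were found, or throws if a rule is violated.
    static int validate(ByteBuffer composite) {
        ByteBuffer bb = composite.duplicate(); // leave the caller's position alone
        int count = 0;
        while (bb.remaining() > 0) {
            if (bb.remaining() < 2)
                throw new IllegalArgumentException("Truncated length at component " + count);
            int length = bb.getShort() & 0xFFFF;
            if (bb.remaining() < length + 1)
                throw new IllegalArgumentException("Truncated value at component " + count);
            bb.position(bb.position() + length); // skip the value bytes
            byte eoc = bb.get();
            if (eoc < -1 || eoc > 1)
                throw new IllegalArgumentException("End-of-component must be -1, 0 or 1 at component " + count);
            if (eoc != 0 && bb.remaining() != 0)
                throw new IllegalArgumentException("Non-zero end-of-component before the last component, at component " + count);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        ByteBuffer name = ByteBuffer.allocate(32);
        putComponent(name, new byte[] { 'a', 'b' }, (byte) 0);
        putComponent(name, new byte[] { 'c' }, (byte) 1); // non-zero is accepted on the last component
        name.flip();
        System.out.println(validate(name));
    }
}
```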
[jira] [Issue Comment Edited] (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017976#comment-13017976 ]

Ed Anuff edited comment on CASSANDRA-2233 at 4/9/11 10:48 PM:
--------------------------------------------------------------

That test is comparing a time-based and a non-time-based UUID, I believe. My version of this was using:

assertEquals(c, sign(compareUsingJUG(u1, u2)));

Looks like you changed it to:

+if (u1.version() == 1)
+    assertEquals(c, TimeUUIDType.instance.compare(bytebuffer(u1), bytebuffer(u2)));

It needs to be this if you're testing compatibility with TimeUUIDType:

+if ((u1.version() == 1) && (u2.version() == 1))
+    assertEquals(c, TimeUUIDType.instance.compare(bytebuffer(u1), bytebuffer(u2)));

FWIW, I saw you pulled the compareUsingJUG() test. The thinking there was to have additional coverage by testing with another comparison implementation. If we want to remove the dependency on JUG, that's fine, and, of course, there's nothing canonical about JUG, except that Cassandra is already using it and it has a well-thought-out and documented comparison function that's compatible with this one.

was (Author: edanuff):
That test is comparing a time-based and a non-time-based UUID, I believe. My version of this was using:

assertEquals(c, sign(compareUsingJUG(u1, u2)));

Looks like you changed it to:

+if (u1.version() == 1)
+    assertEquals(c, TimeUUIDType.instance.compare(bytebuffer(u1), bytebuffer(u2)));

It needs to be this if you're testing compatibility with TimeUUIDType:

+if ((u1.version() == 1) && (u2.version() == 1))
+    assertEquals(c, TimeUUIDType.instance.compare(bytebuffer(u1), bytebuffer(u2)));

> Add unified UUIDType
> --------------------
>
> Key: CASSANDRA-2233
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2233
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.7.3
> Reporter: Ed Anuff
> Assignee: Ed Anuff
> Priority: Minor
> Fix For: 0.8
>
> Attachments: 2233.txt, UUIDType.java, UUIDTypeTest.java
>
> Unified UUIDType comparator, compares as time-based if both UUIDs are time-based, otherwise uses byte comparison. Based on code from the current LexicalUUIDType and TimeUUIDType comparers, so performance and behavior should be consistent and compatible.
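Editorial illustration of the comparison rule in the issue description above (time-based if both UUIDs are version 1, byte comparison otherwise). This is a rough sketch over java.util.UUID, not the attached UUIDType.java: UnifiedUuidCompareSketch is a hypothetical name, and treating "byte comparison" as an unsigned, most-significant-byte-first comparison is an assumption of the example.

```java
import java.util.UUID;

final class UnifiedUuidCompareSketch {
    static int compare(UUID a, UUID b) {
        // Only when BOTH sides are version 1 is the embedded timestamp meaningful for ordering.
        if (a.version() == 1 && b.version() == 1) {
            int byTime = Long.compare(a.timestamp(), b.timestamp());
            if (byTime != 0)
                return byTime;
        }
        // Otherwise (or on a timestamp tie) compare the 16 bytes as unsigned values.
        int high = Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
        return high != 0
             ? high
             : Long.compareUnsigned(a.getLeastSignificantBits(), b.getLeastSignificantBits());
    }

    public static void main(String[] args) {
        long variantBits = 0x8000000000000000L;
        // Version 1 UUIDs store the low 32 timestamp bits first, so byte order and time order can disagree.
        UUID earlier = new UUID(0xFFFFFFFF00001000L, variantBits); // timestamp 0xFFFFFFFF
        UUID later = new UUID(0x0000000000011000L, variantBits);   // timestamp 0x100000000
        System.out.println(compare(earlier, later) < 0);
    }
}
```

The example pair in main is the reason the test guard in the comment above needs both version checks: a byte-wise comparison would order those two UUIDs the opposite way from their timestamps.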
[jira] [Commented] (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017976#comment-13017976 ]

Ed Anuff commented on CASSANDRA-2233:
-------------------------------------

That test is comparing a time-based and a non-time-based UUID, I believe. My version of this was using:

assertEquals(c, sign(compareUsingJUG(u1, u2)));

Looks like you changed it to:

+if (u1.version() == 1)
+    assertEquals(c, TimeUUIDType.instance.compare(bytebuffer(u1), bytebuffer(u2)));

It needs to be this if you're testing compatibility with TimeUUIDType:

+if ((u1.version() == 1) && (u2.version() == 1))
+    assertEquals(c, TimeUUIDType.instance.compare(bytebuffer(u1), bytebuffer(u2)));

> Add unified UUIDType
> --------------------
>
> Key: CASSANDRA-2233
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2233
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.7.3
> Reporter: Ed Anuff
> Priority: Minor
> Attachments: 2233.txt, UUIDType.java, UUIDTypeTest.java
>
> Unified UUIDType comparator, compares as time-based if both UUIDs are time-based, otherwise uses byte comparison. Based on code from the current LexicalUUIDType and TimeUUIDType comparers, so performance and behavior should be consistent and compatible.
[jira] [Issue Comment Edited] (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017969#comment-13017969 ] Ed Anuff edited comment on CASSANDRA-2233 at 4/9/11 10:08 PM: -- Hi Jonathan, thanks for testing it out. It's following the convention of [http://download.oracle.com/javase/6/docs/api/java/util/Comparator.html#compare%28T,%20T%29] to return a negative or positive value which is not necessarily -1 or +1. I can normalize it to those values if we want to hold to a tighter standard. What do you suggest? Edit: Oops, just saw you added a sign() method to do that already. was (Author: edanuff): Hi Jonathan, thanks for testing it out. It's following the convention of [http://download.oracle.com/javase/6/docs/api/java/util/Comparator.html#compare%28T,%20T%29] to return a negative or positive value which is not necessarily -1 or +1. I can normalize it to those values if we want to hold to a tighter standard. What do you suggest? > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 2233.txt, UUIDType.java, UUIDTypeTest.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
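Editorial illustration of the point made in the comment above: the Comparator contract only fixes the sign of the result, so a test that expects exactly -1, 0 or +1 has to normalize first. This is a small sketch, not the sign() helper from the patch: ComparatorSignSketch and BY_LENGTH are hypothetical names used only here.

```java
import java.util.Comparator;

final class ComparatorSignSketch {
    // A perfectly legal comparator whose results are not limited to -1, 0 and +1.
    static final Comparator<String> BY_LENGTH = (a, b) -> a.length() - b.length();

    // Collapses any comparison result to -1, 0 or +1 so it can be checked with an equality assertion.
    static int sign(int comparison) {
        return Integer.signum(comparison);
    }

    public static void main(String[] args) {
        int raw = BY_LENGTH.compare("time", "id");
        System.out.println(raw + " normalizes to " + sign(raw));
    }
}
```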
[jira] [Commented] (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017969#comment-13017969 ] Ed Anuff commented on CASSANDRA-2233: - Hi Jonathan, thanks for testing it out. It's following the convention of [http://download.oracle.com/javase/6/docs/api/java/util/Comparator.html#compare%28T,%20T%29] to return a negative or positive value which is not necessarily -1 or +1. I can normalize it to those values if we want to hold to a tighter standard. What do you suggest? > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 2233.txt, UUIDType.java, UUIDTypeTest.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015274#comment-13015274 ] Ed Anuff commented on CASSANDRA-2231: - One more thing that's come up in testing: we're finding that only the last end-of-component byte is allowed to be non-zero. In lines 231-233 of your patch, you do this check in AbstractCompositeType.validate(): +byte b = bb.get(); +if (b != 0 && bb.remaining() != 0) +throw new MarshalException("Invalid bytes remaining after an end-of-component at component" + i); + Is this check necessary or would it be ok if the end-of-component byte could be non-zero in any component? We're usually doing the less-than-equals or greater-than-equals comparisons on the first component value, not the last, which is usually just a discriminator value to prevent duplicate column names. > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Assignee: Sylvain Lebresne >Priority: Minor > Fix For: 0.7.5 > > Attachments: CompositeType-and-DynamicCompositeType.patch, > edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. 
Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012531#comment-13012531 ] Ed Anuff commented on CASSANDRA-2231: - Sylvain, in the JPA implementation, we're seeing that we'd like to have a little more flexibility with the trailing end-of-component, specifically, that it be able to have values of -1,0,1 rather than just 0,1. The comparison logic would look like this:
{noformat}
byte b1 = bb1.get();
byte b2 = bb2.get();
if (b1 < 0) {
    if (b2 >= 0) {
        return -1;
    }
}
if (b1 > 0) {
    if (b2 <= 0) {
        return 1;
    }
}
if ((b1 == 0) && (b2 != 0)) {
    return - b2;
}
{noformat}
> Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Assignee: Sylvain Lebresne >Priority: Minor > Fix For: 0.7.5 > > Attachments: CompositeType-and-DynamicCompositeType.patch, > edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (CASSANDRA-2379) ByteBufferUtil#bytes(String) can produce undesired results for some characters
[ https://issues.apache.org/jira/browse/CASSANDRA-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010785#comment-13010785 ] Ed Anuff edited comment on CASSANDRA-2379 at 3/24/11 5:29 PM: -- Actually, it may be a good idea to avoid using Charset.defaultCharset() anywhere in ByteBufferUtil and probably elsewhere as well. On the Mac, at least, that's going to be "MacRoman" and on all platforms may change due to the settings of the system "file.encoding" property. Shouldn't we be making sure we're using UTF8? was (Author: edanuff): Actually, it may be a good idea to avoid using Charset.defaultCharset(). On the Mac, at least, that's going to be "MacRoman" and on all platforms may change due to the settings of the system "file.encoding" property. Shouldn't we be making sure we're using UTF8? > ByteBufferUtil#bytes(String) can produce undesired results for some characters > -- > > Key: CASSANDRA-2379 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2379 > Project: Cassandra > Issue Type: Bug >Reporter: Nate McCall > Attachments: 2379.txt > > > The difference between getBytes(java.nio.charset.Charset) vs. > getBytes("[charsetname]") on some platforms (mac it seems) can be > substantial. From the java.lang.String javadoc for the former: > This method always replaces malformed-input and unmappable-character > sequences with this charset's default replacement byte array... > vs. the latter: > The behavior of this method when this string cannot be encoded in the default > charset is unspecified. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
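The point about avoiding the platform default charset can be sketched roughly as follows. This is only an illustration, assuming a Java 7+ runtime where java.nio.charset.StandardCharsets exists (on Java 6, Charset.forName("UTF-8") would play the same role). The class Utf8Bytes and its two helpers are hypothetical names and are not the ByteBufferUtil code from the attached patch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class Utf8Bytes {
    // Hypothetical helper: always encode with UTF-8, never with
    // Charset.defaultCharset(), so the bytes do not depend on "file.encoding".
    public static ByteBuffer bytes(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    // Hypothetical helper: decode with the same explicit charset.
    public static String string(ByteBuffer bb) {
        // duplicate() leaves the caller's position and limit untouched
        return StandardCharsets.UTF_8.decode(bb.duplicate()).toString();
    }

    public static void main(String[] args) {
        ByteBuffer bb = bytes("caf\u00e9");
        // 3 single-byte characters plus 2 bytes for U+00E9
        System.out.println(bb.remaining()); // prints 5
        System.out.println(string(bb).equals("caf\u00e9")); // prints true
    }
}
```

Passing the charset explicitly on both the encode and decode side is what makes the round trip independent of the machine it runs on.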
[jira] [Commented] (CASSANDRA-2379) ByteBufferUtil#bytes(String) can produce undesired results for some characters
[ https://issues.apache.org/jira/browse/CASSANDRA-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010785#comment-13010785 ] Ed Anuff commented on CASSANDRA-2379: - Actually, it may be a good idea to avoid using Charset.defaultCharset(). On the Mac, at least, that's going to be "MacRoman" and on all platforms may change due to the settings of the system "file.encoding" property. Shouldn't we be making sure we're using UTF8? > ByteBufferUtil#bytes(String) can produce undesired results for some characters > -- > > Key: CASSANDRA-2379 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2379 > Project: Cassandra > Issue Type: Bug >Reporter: Nate McCall > Attachments: 2379.txt > > > The difference between getBytes(java.nio.charset.Charset) vs. > getBytes("[charsetname]") on some platforms (mac it seems) can be > substantial. From the java.lang.String javadoc for the former: > This method always replaces malformed-input and unmappable-character > sequences with this charset's default replacement byte array... > vs. the latter: > The behavior of this method when this string cannot be encoded in the default > charset is unspecified. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Issue Comment Edited: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001826#comment-13001826 ] Ed Anuff edited comment on CASSANDRA-2233 at 3/18/11 9:46 PM: -- bq.* generalize the type, make it deal with all kinds of variants/versions. (http://tools.ietf.org/html/rfc4122) -I'm not sure how useful the natural order of the other UUID versions is.- Minor point of correction, as per [Java bug 7025832|http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7025832], the LexicalUUIDType inherits java.util.UUID.compareTo()'s signed comparison flaw, since the LexicalUUIDType doesn't do byte comparison, it's converting the bytes into java.util.UUIDs and using the UUID class' compareTo() method. Doing a byte comparison would match the rfc, which is reflected in the most recent version of patch proposed. bq.* check if it makes sense to treat the Nil UUID differently (similarly to NULL in popular SQL databases). Yes, it does make sense. Nil UUID should always compare as less than. was (Author: edanuff): bq.* generalize the type, make it deal with all kinds of variants/versions. (http://tools.ietf.org/html/rfc4122) I'm not sure how useful the natural order of the other UUID versions is. I suppose for everything but version 1 time-based UUID's, we could use DCE's comparison rules rather than the current byte comparison used by the lexical comparer, that would at least be standard. http://www.opengroup.org/onlinepubs/9629399/apdxa.htm#tagtcjh_38 bq.* check if it makes sense to treat the Nil UUID differently (similarly to NULL in popular SQL databases). Yes, it does make sense. Nil UUID should always compare as less than. 
> Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java, UUIDTypeTest.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
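To make the signed-comparison point concrete, here is a small sketch of an unsigned, most-significant-byte-first comparison next to java.util.UUID.compareTo(). It assumes Java 8+ for Long.compareUnsigned. The class name UnsignedUuidCompare is hypothetical and this is not the code from the attached UUIDType.java.

```java
import java.util.UUID;

public class UnsignedUuidCompare {
    // Compares the 128 bits as unsigned values, most significant long first.
    // For the big-endian 16-byte form this gives the same order as an
    // unsigned byte-by-byte comparison from msb to lsb.
    public static int compareUnsigned(UUID a, UUID b) {
        int c = Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
        if (c != 0) {
            return c;
        }
        return Long.compareUnsigned(a.getLeastSignificantBits(), b.getLeastSignificantBits());
    }

    public static void main(String[] args) {
        UUID high = new UUID(0x8000000000000000L, 0L); // first byte is 0x80
        UUID low = new UUID(0x0000000000000001L, 0L);  // first byte is 0x00

        // Unsigned order: 0x80... sorts after 0x00...
        System.out.println(Integer.signum(compareUnsigned(high, low))); // prints 1

        // java.util.UUID.compareTo() treats the longs as signed, which is the
        // behavior the bug report above describes; print it to compare.
        System.out.println(Integer.signum(high.compareTo(low)));
    }
}
```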
[jira] Commented: (CASSANDRA-2355) Have an easy way to define the reverse comparator of any comparator
[ https://issues.apache.org/jira/browse/CASSANDRA-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008510#comment-13008510 ] Ed Anuff commented on CASSANDRA-2355: - In Cassandra-2231 it was suggested that this be implemented by making comparators parameterizable. The idea would be to perhaps replace FBUtilities.getComparator() with a ComparatorFactory that could be passed something like "UUIDType(restrictTo=time,sort=desc)" and parse out the parameters in order to construct the instance. For Cassandra-2231, the proposed patch requires that FBUtilities.getComparator() caches and returns the same singleton comparator instances, so that requesting "UUIDType" will always return the same instance. It would be necessary to cache the parameterized version in a similar way, and would probably need to be able to know that "UUIDType(restrictTo=time,sort=desc)" and "UUIDType(sort=desc,restrictTo=time)" should return the same cached comparator. > Have an easy way to define the reverse comparator of any comparator > --- > > Key: CASSANDRA-2355 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2355 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Minor > Original Estimate: 4h > Remaining Estimate: 4h > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
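One way to make "UUIDType(restrictTo=time,sort=desc)" and "UUIDType(sort=desc,restrictTo=time)" hit the same cache entry is to canonicalize the string before using it as a key. The following is a minimal sketch under that assumption. ComparatorKey and canonicalize() are hypothetical names, the Name(key=value,...) syntax is taken from the examples in the comment above, and nothing here is an existing Cassandra API.

```java
import java.util.Map;
import java.util.TreeMap;

public class ComparatorKey {
    // Hypothetical: rewrite "Name(k2=v2,k1=v1)" as "Name(k1=v1,k2=v2)" so that
    // parameter order does not affect the cache key.
    public static String canonicalize(String spec) {
        String s = spec.trim();
        int open = s.indexOf('(');
        if (open < 0) {
            return s; // plain name such as "UUIDType"
        }
        if (!s.endsWith(")")) {
            throw new IllegalArgumentException("Unbalanced parentheses: " + spec);
        }
        String name = s.substring(0, open).trim();
        String body = s.substring(open + 1, s.length() - 1);

        // TreeMap keeps the parameters sorted by key
        Map<String, String> params = new TreeMap<String, String>();
        for (String pair : body.split(",")) {
            if (pair.trim().isEmpty()) {
                continue;
            }
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Expected key=value: " + pair);
            }
            params.put(kv[0].trim(), kv[1].trim());
        }
        if (params.isEmpty()) {
            return name;
        }
        StringBuilder sb = new StringBuilder(name).append('(');
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
            first = false;
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        // prints UUIDType(restrictTo=time,sort=desc)
        System.out.println(canonicalize("UUIDType(sort=desc,restrictTo=time)"));
    }
}
```

A comparator factory could then look up its singleton cache with the canonical string, so both spellings resolve to the same instance.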
[jira] Commented: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008505#comment-13008505 ] Ed Anuff commented on CASSANDRA-2231: - For parameterized behavior of comparators, my assumption is that this would work within the DynamicCompositeType as well? I'll add this to Cassandra-2235, but I'm thinking about the embedded comparator names in the dynamic format. Right now, you're simply calling FBUtilities.getComparator() with the name, but ultimately we'd need a more robust comparator factory that could be passed something like "UUIDType(restrictTo=time,sort=desc)" and parse out the parameters in order to construct the instance and was able to cache the parameterized version in a similar way to how your patch currently caches the comparators it instantiates, and would probably need to be able to know that "UUIDType(restrictTo=time,sort=desc)" and "UUIDType(sort=desc,restrictTo=time)" are the same comparator. > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Assignee: Sylvain Lebresne >Priority: Minor > Fix For: 0.7.5 > > Attachments: 0001-Add-compositeType-and-DynamicCompositeType.patch, > 0001-Add-compositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. 
One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Issue Comment Edited: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008321#comment-13008321 ] Ed Anuff edited comment on CASSANDRA-2231 at 3/18/11 5:27 AM: -- -From the email conversation around this earlier today, I'm wondering if a bit in the trailing byte at the end of each component could be used for a sort order flag?- I'd like to suggest we put a byte just before the length/value part that if it's non-zero, reverses the comparer results. Both component parts must have the same sort order byte (i.e. both are 0 or both are 1) or a RuntimeException is thrown. For context, we're looking at doing something in the JPA implementation via annotations that's functionally similar to how App Engine defines indexes in it's index.yaml - http://code.google.com/appengine/docs/java/configyaml/indexconfig.html was (Author: edanuff): From the email conversation around this earlier today, I'm wondering if a bit in the trailing byte at the end of each component could be used for a sort order flag? For context, we're looking at doing something in the JPA implementation via annotations that's functionally similar to how App Engine defines indexes in it's index.yaml - http://code.google.com/appengine/docs/java/configyaml/indexconfig.html > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 0001-Add-compositeType-and-DynamicCompositeType.patch, > 0001-Add-compositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. 
This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Issue Comment Edited: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008321#comment-13008321 ] Ed Anuff edited comment on CASSANDRA-2231 at 3/18/11 5:28 AM: -- -From the email conversation around this earlier today, I'm wondering if a bit in the trailing byte at the end of each component could be used for a sort order flag?- I'd like to suggest we put a byte just before the length/value part of the component that if it's non-zero, reverses the comparer results for that component. Both component parts must have the same sort order byte (i.e. both are 0 or both are 1) or a RuntimeException is thrown. For context, we're looking at doing something in the JPA implementation via annotations that's functionally similar to how App Engine defines indexes in it's index.yaml - http://code.google.com/appengine/docs/java/configyaml/indexconfig.html was (Author: edanuff): -From the email conversation around this earlier today, I'm wondering if a bit in the trailing byte at the end of each component could be used for a sort order flag?- I'd like to suggest we put a byte just before the length/value part that if it's non-zero, reverses the comparer results. Both component parts must have the same sort order byte (i.e. both are 0 or both are 1) or a RuntimeException is thrown. 
For context, we're looking at doing something in the JPA implementation via annotations that's functionally similar to how App Engine defines indexes in it's index.yaml - http://code.google.com/appengine/docs/java/configyaml/indexconfig.html > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 0001-Add-compositeType-and-DynamicCompositeType.patch, > 0001-Add-compositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Issue Comment Edited: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008321#comment-13008321 ] Ed Anuff edited comment on CASSANDRA-2231 at 3/18/11 4:30 AM: -- From the email conversation around this earlier today, I'm wondering if a bit in the trailing byte at the end of each component could be used for a sort order flag? For context, we're looking at doing something in the JPA implementation via annotations that's functionally similar to how App Engine defines indexes in its index.yaml - http://code.google.com/appengine/docs/java/configyaml/indexconfig.html was (Author: edanuff): From the email conversation around this earlier today, I'm wondering if a bit in the trailing byte at the end of each component could be used for a sort order flag? > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 0001-Add-compositeType-and-DynamicCompositeType.patch, > 0001-Add-compositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. 
Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008321#comment-13008321 ] Ed Anuff commented on CASSANDRA-2231: - From the email conversation around this earlier today, I'm wondering if a bit in the trailing byte at the end of each component could be used for a sort order flag? > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 0001-Add-compositeType-and-DynamicCompositeType.patch, > 0001-Add-compositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004780#comment-13004780 ] Ed Anuff commented on CASSANDRA-2233: - Yes, that is correct. > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java, UUIDTypeTest.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004773#comment-13004773 ] Ed Anuff commented on CASSANDRA-2233: - Attached new version and test case. It now sorts first by version number, then by time if both are version 1, and then lexically using msb to lsb byte comparison. Turns out this is actually the same way the org.safehaus.uuid.UUID.compareTo() method works, so there's precedent. FWIW, the current LexicalUUIDType is using java.util.UUID.compareTo() which, looking at the JDK source and doing some tests, appears not to be a lexical comparison as described in rfc4122 since it's doing a signed comparison. > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java, UUIDTypeTest.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
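The ordering described above (version number first, then timestamp when both are version 1, then unsigned msb-to-lsb bytes) can be sketched roughly like this. It works on java.util.UUID objects rather than ByteBuffers and assumes Java 8+ for Long.compareUnsigned, so it is an illustration of the rule and not the attached UUIDType.java. The class name UnifiedUuidOrder is hypothetical.

```java
import java.util.UUID;

public class UnifiedUuidOrder {
    public static int compare(UUID a, UUID b) {
        // 1. Lower version numbers sort first
        int c = Integer.compare(a.version(), b.version());
        if (c != 0) {
            return c;
        }
        // 2. Same version; for time-based UUIDs compare the 60-bit timestamps
        if (a.version() == 1) {
            c = Long.compare(a.timestamp(), b.timestamp());
            if (c != 0) {
                return c;
            }
        }
        // 3. Fall back to an unsigned comparison, most significant bits first
        c = Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
        if (c != 0) {
            return c;
        }
        return Long.compareUnsigned(a.getLeastSignificantBits(), b.getLeastSignificantBits());
    }

    public static void main(String[] args) {
        // Two hand-built version 1 UUIDs: "earlier" has the larger leading
        // bytes (time_low = 0xFFFFFFFF) but the smaller timestamp.
        UUID earlier = new UUID(0xFFFFFFFF00001000L, 0x8000000000000000L);
        UUID later = new UUID(0x0000000000011000L, 0x8000000000000000L);
        System.out.println(compare(earlier, later) < 0); // prints true
    }
}
```

The example in main() shows why step 2 matters: a plain byte comparison would put these two version 1 UUIDs in the opposite order from their timestamps.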
[jira] Issue Comment Edited: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004773#comment-13004773 ] Ed Anuff edited comment on CASSANDRA-2233 at 3/9/11 8:17 PM: - Attached new version and test case. It now sorts first by version number, then by time if both are version 1, and then lexically using msb to lsb byte comparison. Turns out this is actually the same way the org.safehaus.uuid.UUID.compareTo() method works, so there's precedent. FWIW, the current LexicalUUIDType is using java.util.UUID.compareTo() which, looking at the JDK source and doing some tests, appears not to be a lexical comparison as described in rfc4122 since it's doing a signed comparison. was (Author: edanuff): Attached new version and test case. It now sorts first by version number, then by time if both are version 1, and then lexically using msb to lsb byte comparison. Turns out this is actually the same way the org.safehaus.uuid.UUID.compareTo() method works, so there's precedent. FWIW, the current LexicalUUIDType is using java.util.UUID.compareTo() which, looking at the JDK source and doing some tests, appears not to be a lexical comparison as described in ref4122 since it's doing a signed comparison. > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java, UUIDTypeTest.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Anuff updated CASSANDRA-2233: Attachment: UUIDTypeTest.java > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java, UUIDTypeTest.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Anuff updated CASSANDRA-2233: Attachment: UUIDType.java > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Anuff updated CASSANDRA-2233: Attachment: (was: UUIDType.java) > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002226#comment-13002226 ] Ed Anuff commented on CASSANDRA-2231: - Hmm, that would work, although you certainly wouldn't want to use the LongType as your integer. I guess the minimum overhead for a component is 6 bytes - 2 header, 2 length, 1 value, 1 inclusion flag. I'm not seeing anything else that wouldn't let me use this as a functional replacement for the original CompositeType, so I'm +1 on it. Thanks! > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 0001-Add-compositeType-and-DynamicCompositeType.patch, > 0001-Add-compositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
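To make the 6-byte figure in the comment above concrete, here is a minimal sketch of how one component might be laid out. The field order (header, length, value, inclusion flag) and every name in it are assumptions read off the byte counts in the comment, not the encoding from the attached patch.

```java
import java.nio.ByteBuffer;

// Hypothetical component layout, assumed from the comment above:
// [2-byte header][2-byte length][value bytes][1-byte inclusion flag]
final class ComponentSizeSketch {
    static final int HEADER_BYTES = 2;
    static final int LENGTH_BYTES = 2;
    static final int FLAG_BYTES = 1;

    // Encodes a single component; the header and flag values are opaque here.
    static ByteBuffer encode(short header, byte[] value, byte inclusionFlag) {
        if (value.length > 0xFFFF) {
            throw new IllegalArgumentException("value does not fit a 2-byte length");
        }
        ByteBuffer out = ByteBuffer.allocate(
                HEADER_BYTES + LENGTH_BYTES + value.length + FLAG_BYTES);
        out.putShort(header);
        out.putShort((short) value.length);
        out.put(value);
        out.put(inclusionFlag);
        out.flip(); // make the buffer readable from the start
        return out;
    }
}
```

Under this assumed layout a one-byte value encodes to 2 + 2 + 1 + 1 = 6 bytes, while an eight-byte value encodes to 13 bytes, which may be why a long is an expensive way to carry a small integer.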
[jira] Commented: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002170#comment-13002170 ] Ed Anuff commented on CASSANDRA-2231: - If two dynamic composite types are compared, the first and the second , this results in an exception being thrown in line 659, correct? In the original CompositeType, the component types each had an ordinal type value and the comparison was done on those type values if the components were of different types. I might suggest that in your code using the alias character byte or the hashCode() of the classname as the type value and doing a similar comparison, rather than throwing an exception. > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 0001-Add-compositeType-and-DynamicCompositeType.patch, > 0001-Add-compositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. 
- For more information on JIRA, see: http://www.atlassian.com/software/jira
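The suggestion in the comment above (order components of different types by a type value instead of throwing) could look roughly like the sketch below. The Component class, the use of the alias byte as the type value, and the unsigned byte comparison standing in for each type's own comparator are all assumptions for illustration, not code from the patch.

```java
// Hypothetical sketch: fall back to ordering by alias byte when two
// components have different types, rather than throwing an exception.
final class MixedTypeCompareSketch {
    // A component reduced to its type alias and raw value bytes.
    static final class Component {
        final byte alias;
        final byte[] value;

        Component(byte alias, byte[] value) {
            this.alias = alias;
            this.value = value;
        }
    }

    static int compare(Component a, Component b) {
        if (a.alias != b.alias) {
            // Different types: the alias byte acts as the ordinal type value.
            return Integer.compare(a.alias & 0xFF, b.alias & 0xFF);
        }
        // Same type: a real comparer would delegate to that type's own
        // comparison; unsigned byte order is only a stand-in here.
        return compareUnsigned(a.value, b.value);
    }

    static int compareUnsigned(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(x[i] & 0xFF, y[i] & 0xFF);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(x.length, y.length);
    }
}
```

With this fallback, every "b" component sorts before every "t" component regardless of value, so mixed-type columns get a total order, even if that order carries no meaning beyond being predictable.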
[jira] Commented: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002144#comment-13002144 ] Ed Anuff commented on CASSANDRA-2231: - bq.Greater-than is already doable in my previous patch (up to the bug in validation). For the less-than part, I agree that it is nice to be able to do it easily. In my new patch, I add a leading byte to each component, whose purpose is to always be 0, except for lesser-than query. That way, you can do the query above easily. The price is a slightly more complicated encoding but I think it's totally worth it. Just to be clear, the original idea was to make it possible to construct a key for the purposes of doing a range slice that would compare inclusive either or both at the start and finish of the range. This appears to be possible with the "inclusion byte" that you're using in lines 179 through 184 of your patch. Is that correct? > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 0001-Add-compositeType-and-DynamicCompositeType.patch, > 0001-Add-compositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. 
One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002118#comment-13002118 ] Ed Anuff commented on CASSANDRA-2231: - Sylvain, this looks like it could do the trick. Just to clarify, in your example, if I declare something like this: DynamicCompositeType(b => BytesType, t => TimeUUIDType) Does this mean that I can have any number of components in my dynamic composite key and that "b" and "t" are aliases to BytesType and TimeUUIDType for the purpose of space efficiency? Using both or neither of those aliases is valid and the order in which I use them isn't mandated, correct? > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 0001-Add-compositeType-and-DynamicCompositeType.patch, > 0001-Add-compositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. 
- For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001826#comment-13001826 ] Ed Anuff commented on CASSANDRA-2233: - bq.* generalize the type, make it deal with all kinds of variants/versions. (http://tools.ietf.org/html/rfc4122) I'm not sure how useful the natural order of the other UUID versions is. I suppose for everything but version 1 time-based UUIDs, we could use DCE's comparison rules rather than the current byte comparison used by the lexical comparer; that would at least be standard. http://www.opengroup.org/onlinepubs/9629399/apdxa.htm#tagtcjh_38 bq.* check if it makes sense to treat the Nil UUID differently (similarly to NULL in popular SQL databases). Yes, it does make sense. The Nil UUID should always compare as less than any other UUID. > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001752#comment-13001752 ] Ed Anuff commented on CASSANDRA-2231: - bq.Why? How many indexes are you creating? Do you mean how many indexes or how many index types? Lots of relatively small indexes, one index potentially for every relationship, but I'm not sure that's what you meant. In terms of index types, without a dynamic capability, if I want to create an index on integer values, that's one CF; if I want to create an index on string values, that's another CF; if I want to create an index sorted first by lastname, then by firstname, that's another CF. I tried that approach and it made for some fairly convoluted code, but more concerning, I had close to 20 CFs, since maintaining a CF index requires at least one other CF to store related metadata. I was able to consolidate that down to about 4 CFs, with much more manageable code, and Cassandra became a much happier camper. > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 0001-Add-compositeType.patch, > edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. 
One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001671#comment-13001671 ] Ed Anuff commented on CASSANDRA-2231: - Ok, I'll let the "hacky" part slide. :) Sylvain's approach is much more elegant in how it's declared, and using the existing comparers is nice and something I did initially until I decided I'd rather have everything under one roof for performance and maintainability as well as in order to add some things to make queries easier. So, I'm in agreement with the way he's proposing. The central question here is whether the final implementation provides sufficient flexibility for index building. That's why a composite type is necessary. I have similar concerns to Todd: losing the dynamic capability is actually a bit more of a problem the more I think about it. I'd look to solve that in two ways: (1) having validation only require that the components, if they're provided, are of the correct type (looking at Sylvain's code, the compare function should work, but the validation wouldn't), and (2) creating a DynamicType that could be used for dynamic types. Given that a number of the languages that are talking to Cassandra are dynamic languages and even people using Java, like me, might be using JSON types, some form of dynamic support is a good idea, but I'd be happy to separate that into a different comparator. My other concern is that one of the things that got stripped out was the MATCH_MINIMUM, MATCH_MAXIMUM feature, which made implementing greater-than-equals, less-than-equals, etc. in ranges much easier. That might be the wrong place to implement that, though, and might be a bit of a hack. We've got somewhat of a challenge at the application tier, which we're seeing in the Hector project, in terms of providing a uniform way to do the type of indexing needed for ORM, which the current secondary indexes just don't satisfy. 
The reason we need a sufficiently flexible composite mechanism in core is that we're already assuming the capability in order to implement this stuff. It doesn't have to be my code or my format, but it really should meet the needs of the app builders. Might a more forgiving version of Sylvain's CompositeType plus a new DynamicType be the way to go? > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 0001-Add-compositeType.patch, > edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001608#comment-13001608 ] Ed Anuff commented on CASSANDRA-2231: - This makes sense. In practice, I've made use of the fact that the embedded types were dynamic to arbitrarily store additional metadata in the column names, which this is going to preclude due to being strongly typed, but I think that can be worked around here. > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: 0001-Add-compositeType.patch, > edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Issue Comment Edited: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001480#comment-13001480 ] Ed Anuff edited comment on CASSANDRA-2233 at 3/2/11 4:49 PM: - Yes, I would think so. Caveat: deprecate == not recommend for new usage, existing column families using the LexicalUUID have to stay with it, for the reason Frank points out. was (Author: edanuff): Yes, I would think so. > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001480#comment-13001480 ] Ed Anuff commented on CASSANDRA-2233: - Yes, I would think so. > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000479#comment-13000479 ] Ed Anuff commented on CASSANDRA-2233: - This wasn't necessarily suggested as a replacement for the two existing comparators but as an alternative for: (1) convenience (2) being able to switch your UUID generation technique later and while the ordering might not be useful, it would still be predictable. What happens now if you start with time-based UUIDs and switch to lexical? (3) use foreign generated UUID's with a preference for time-based sorting if possible > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Anuff updated CASSANDRA-2233: Attachment: UUIDType.java Removed incorrect copyright notice > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Anuff updated CASSANDRA-2233: Attachment: (was: UUIDType.java) > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Anuff updated CASSANDRA-2233: Attachment: UUIDType.java Revised to fix comparison between time-based and non-time-based UUIDs. Time-based UUIDs now always compare as less than non-time-based UUIDs. > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
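The comparison rules described in this update can be sketched as follows. This is not the attached UUIDType.java: it works on java.util.UUID objects rather than raw column bytes, and the class name, method names, and byte-order tie-break between two time-based UUIDs with equal timestamps are assumptions.

```java
import java.util.UUID;

// Hypothetical sketch of a unified UUID ordering:
// - both time-based (version 1): order by embedded timestamp
// - exactly one time-based: the time-based UUID sorts first
// - neither time-based: order by unsigned byte comparison
final class UnifiedUuidCompareSketch {
    static int compare(UUID a, UUID b) {
        boolean aTime = a.version() == 1;
        boolean bTime = b.version() == 1;
        if (aTime && bTime) {
            int c = Long.compare(a.timestamp(), b.timestamp());
            // Assumed tie-break: fall back to byte order on equal timestamps.
            return c != 0 ? c : compareBytes(a, b);
        }
        if (aTime != bTime) {
            return aTime ? -1 : 1;
        }
        return compareBytes(a, b);
    }

    // Comparing both 64-bit halves as unsigned values is equivalent to an
    // unsigned byte-by-byte comparison of the 16-byte big-endian form.
    static int compareBytes(UUID a, UUID b) {
        int c = Long.compareUnsigned(
                a.getMostSignificantBits(), b.getMostSignificantBits());
        return c != 0 ? c : Long.compareUnsigned(
                a.getLeastSignificantBits(), b.getLeastSignificantBits());
    }
}
```

Note that two version 1 UUIDs can order differently by timestamp than by raw bytes, because the low bits of the timestamp come first in the byte layout; that difference is the reason for the time-based branch.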
[jira] Updated: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Anuff updated CASSANDRA-2233: Attachment: (was: UUIDType.java) > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000377#comment-13000377 ] Ed Anuff commented on CASSANDRA-2233: - You are correct, I'll fix that. > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Anuff updated CASSANDRA-2233: Component/s: (was: Contrib) Core > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-1684) Entity groups
[ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998732#comment-12998732 ] Ed Anuff commented on CASSANDRA-1684: - This is something I've been thinking about while consolidating the number of column families within an application so that I ended up with row keys that were constructed from concatenating an entity id with various other strings (eg. 9081bd70-3fe4-11e0-9207-0800200c9a66:something ). Is it feasible to have a partitioner that hashed on just the first x bytes in a key? Do tokens have to be one-to-one unique with keys, or could you have multiple keys share the same token? (apparently that's currently possible, although an extreme edge case, with the RandomPartitioner) > Entity groups > - > > Key: CASSANDRA-1684 > URL: https://issues.apache.org/jira/browse/CASSANDRA-1684 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Jonathan Ellis >Assignee: Sylvain Lebresne > Fix For: 0.8 > > Original Estimate: 80h > Remaining Estimate: 80h > > Supporting entity groups similar to App Engine's (that is, allow rows to be > part of a parent "entity group," whose key is used for routing instead of the > row itself) allows several improvements: > - batches within an EG can be atomic across multiple rows > - order-by-value queries within an EG only have to touch a single replica > even with RandomPartitioner -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
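The question in the comment above (a partitioner that hashes only the first x bytes of a key) can be illustrated with a small sketch. It is not a Cassandra partitioner implementation: the class and method names are hypothetical, and the use of MD5 with a non-negative BigInteger token is an assumption chosen only to show that keys sharing a prefix would share a token.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: derive a token from only the leading bytes of a row
// key, so that "<entity id>:<suffix>" keys with the same id map to one token.
final class PrefixTokenSketch {
    static BigInteger token(byte[] key, int prefixLength) {
        // Keys shorter than the prefix length are hashed in full.
        int n = Math.min(prefixLength, key.length);
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(key, 0, n);
            // Interpret the 16-byte digest as a non-negative integer.
            return new BigInteger(1, md5.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }
}
```

With a prefix length of 36 (the length of a UUID string), the keys 9081bd70-3fe4-11e0-9207-0800200c9a66:something and 9081bd70-3fe4-11e0-9207-0800200c9a66:other would receive the same token and therefore route to the same place, which is the many-keys-per-token case the comment asks about.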
[jira] Updated: (CASSANDRA-2233) Add unified UUIDType
[ https://issues.apache.org/jira/browse/CASSANDRA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Anuff updated CASSANDRA-2233: Attachment: UUIDType.java > Add unified UUIDType > > > Key: CASSANDRA-2233 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: UUIDType.java > > > Unified UUIDType comparator, compares as time-based if both UUIDs are > time-based, otherwise uses byte comparison. Based on code from the current > LexicalUUIDType and TimeUUIDType comparers, so performance and behavior > should be consistent and compatible. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Created: (CASSANDRA-2233) Add unified UUIDType
Add unified UUIDType Key: CASSANDRA-2233 URL: https://issues.apache.org/jira/browse/CASSANDRA-2233 Project: Cassandra Issue Type: Improvement Components: Contrib Affects Versions: 0.7.3 Reporter: Ed Anuff Priority: Minor Unified UUIDType comparator, compares as time-based if both UUIDs are time-based, otherwise uses byte comparison. Based on code from the current LexicalUUIDType and TimeUUIDType comparers, so performance and behavior should be consistent and compatible. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Anuff updated CASSANDRA-2231: Attachment: edanuff-CassandraCompositeType-1e253c4.zip https://github.com/edanuff/CassandraCompositeType > Add CompositeType comparer to the comparers provided in > org.apache.cassandra.db.marshal > --- > > Key: CASSANDRA-2231 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Affects Versions: 0.7.3 >Reporter: Ed Anuff >Priority: Minor > Attachments: edanuff-CassandraCompositeType-1e253c4.zip > > > CompositeType is a custom comparer that makes it possible to create > comparable composite values out of the basic types that Cassandra currently > supports, such as Long, UUID, etc. This is very useful in both the creation > of custom inverted indexes using columns in a skinny row, where each column > name is a composite value, and also when using Cassandra's built-in secondary > index support, where it can be used to encode the values in the columns that > Cassandra indexes. One scenario for the usage of these is documented here: > http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for > contribution is attached and has been previously maintained on github here: > https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Created: (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal --- Key: CASSANDRA-2231 URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 Project: Cassandra Issue Type: Improvement Components: Contrib Affects Versions: 0.7.3 Reporter: Ed Anuff Priority: Minor CompositeType is a custom comparer that makes it possible to create comparable composite values out of the basic types that Cassandra currently supports, such as Long, UUID, etc. This is very useful in both the creation of custom inverted indexes using columns in a skinny row, where each column name is a composite value, and also when using Cassandra's built-in secondary index support, where it can be used to encode the values in the columns that Cassandra indexes. One scenario for the usage of these is documented here: http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for contribution is attached and has been previously maintained on github here: https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira