git commit: Mark sstables as repaired after full repair

2014-11-02 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/trunk 9274197b4 -> e60a06cc8


Mark sstables as repaired after full repair

Patch by marcuse; reviewed by yukim for CASSANDRA-7586


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e60a06cc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e60a06cc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e60a06cc

Branch: refs/heads/trunk
Commit: e60a06cc866e5e85d3e58f25b98f8c048d07ad24
Parents: 9274197
Author: Marcus Eriksson 
Authored: Tue Oct 28 16:30:50 2014 +0100
Committer: Marcus Eriksson 
Committed: Mon Nov 3 08:28:39 2014 +0100

--
 CHANGES.txt                                     |  1 +
 .../apache/cassandra/db/ColumnFamilyStore.java  | 13 +++--
 .../db/compaction/CompactionManager.java        | 24 ++--
 .../repair/RepairMessageVerbHandler.java        | 23 +---
 .../repair/messages/AnticompactionRequest.java  |  8 +++
 .../repair/messages/PrepareMessage.java         | 10 +++-
 .../cassandra/repair/messages/RepairOption.java |  7 ---
 .../cassandra/repair/messages/SyncRequest.java  | 11
 .../repair/messages/ValidationRequest.java      |  8 +++
 .../cassandra/service/ActiveRepairService.java  | 61 +++-
 .../cassandra/service/StorageService.java       | 44 +-
 .../LeveledCompactionStrategyTest.java          |  2 +-
 .../cassandra/repair/LocalSyncTaskTest.java     |  2 +-
 13 files changed, 127 insertions(+), 87 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e60a06cc/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index db3b091..3a8ada2 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0
+ * Mark sstables as repaired after full repair (CASSANDRA-7586) 
  * Extend Descriptor to include a format value and refactor reader/writer apis (CASSANDRA-7443)
  * Integrate JMH for microbenchmarks (CASSANDRA-8151)
  * Keep sstable levels when bootstrapping (CASSANDRA-7460)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e60a06cc/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 0e3131c..2a61b39 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -2151,8 +2151,9 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
         snapshotWithoutFlush(snapshotName, null);
     }
 
-    public void snapshotWithoutFlush(String snapshotName, Predicate<SSTableReader> predicate)
+    public Set<SSTableReader> snapshotWithoutFlush(String snapshotName, Predicate<SSTableReader> predicate)
     {
+        Set<SSTableReader> snapshottedSSTables = new HashSet<>();
         for (ColumnFamilyStore cfs : concatWithIndexes())
         {
             DataTracker.View currentView = cfs.markCurrentViewReferenced();
@@ -2171,6 +2172,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
                     filesJSONArr.add(ssTable.descriptor.relativeFilenameFor(Component.DATA));
                     if (logger.isDebugEnabled())
                         logger.debug("Snapshot for {} keyspace data file {} created in {}", keyspace, ssTable.getFilename(), snapshotDirectory);
+                    snapshottedSSTables.add(ssTable);
                 }
 
                 writeSnapshotManifest(filesJSONArr, snapshotName);
@@ -2180,6 +2182,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
                 SSTableReader.releaseReferences(currentView.sstables);
             }
         }
+        return snapshottedSSTables;
     }
 
     private void writeSnapshotManifest(final JSONArray filesJSONArr, final String snapshotName)
@@ -2216,15 +2219,15 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
      *
      * @param snapshotName the name of the associated with the snapshot
      */
-    public void snapshot(String snapshotName)
+    public Set<SSTableReader> snapshot(String snapshotName)
     {
-        snapshot(snapshotName, null);
+        return snapshot(snapshotName, null);
     }
 
-    public void snapshot(String snapshotName, Predicate<SSTableReader> predicate)
+    public Set<SSTableReader> snapshot(String snapshotName, Predicate<SSTableReader> predicate)
     {
         forceBlockingFlush();
-        snapshotWithoutFlush(snapshotName, predicate);
+        return snapshotWithoutFlush(snapshotName, predicate);
     }
 
     public boolean snapshotExists(String snapshotName)
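
(For orientation: a minimal sketch of how a caller could consume the new return value. The class and method below are illustrative assumptions, not code from this commit.)

{code:java}
import java.util.Set;

import org.apache.cassandra.db.ColumnFamilyStore;
import org.apache.cassandra.io.sstable.SSTableReader;

// Illustrative caller: snapshot() now reports exactly which sstables were
// captured, so a repair coordinator can remember that set and mark those same
// sstables as repaired once the full repair succeeds.
public class SnapshotForRepairSketch
{
    public static Set<SSTableReader> snapshotAndRemember(ColumnFamilyStore cfs, String snapshotName)
    {
        // flushes, snapshots, and returns the sstables included in the snapshot
        return cfs.snapshot(snapshotName);
    }
}
{code}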

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e60a06cc/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
---

[jira] [Commented] (CASSANDRA-7586) Mark SSTables as repaired after full repairs

2014-11-02 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194331#comment-14194331
 ] 

Marcus Eriksson commented on CASSANDRA-7586:


Committed, with a log level change (info -> debug) in RepairMessageVerbHandler.

> Mark SSTables as repaired after full repairs
> 
>
> Key: CASSANDRA-7586
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7586
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 3.0
>
>
> In 2.1 we avoided anticompaction and marking sstables as repaired after 
> old-style full repairs (the reasoning was that we wanted users to be able to 
> carry on as before).
> In 3.0 incremental repair is on by default, so we should always mark and 
> anticompact sstables.





[jira] [Updated] (CASSANDRA-8243) DTCS can leave time-overlaps, limiting ability to expire entire SSTables

2014-11-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Björn Hegerfors updated CASSANDRA-8243:
---
Attachment: cassandra-trunk-CASSANDRA-8243-aggressiveTTLExpiry.txt

I've made a simple change to the getFullyExpiredSSTables method (removing one 
line did the trick): it now drops an SSTable as long as that SSTable is fully 
expired and its getMaxTimestamp is less than the getMinTimestamp of every 
(overlapping) SSTable that still contains a live column. The difference between 
this condition and the previous one is subtle, but to my understanding the old 
condition was unnecessarily cautious. This one should be safe, and it will 
certainly solve this issue.

But of course, if this is wrong, then that could be a serious bug. So this has 
to be carefully reviewed.
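
Roughly the shape of the relaxed check as I read the description above - the 
method, variable names and metadata accessors here are illustrative, not the 
attached patch:

{code:java}
import org.apache.cassandra.io.sstable.SSTableReader;

public class ExpirySketch
{
    // Illustrative: a fully expired sstable may be dropped unless some
    // overlapping sstable that still holds live data has timestamps older
    // than the candidate's newest cell (i.e. could be shadowed by it).
    static boolean canDrop(SSTableReader candidate, Iterable<SSTableReader> overlapping, int gcBefore)
    {
        if (candidate.getSSTableMetadata().maxLocalDeletionTime >= gcBefore)
            return false; // not fully expired yet

        for (SSTableReader other : overlapping)
        {
            boolean hasLiveData = other.getSSTableMetadata().maxLocalDeletionTime >= gcBefore;
            // relaxed condition: only refuse when live overlapping data is
            // older than the candidate's newest cell (the old check was stricter)
            if (hasLiveData && other.getMinTimestamp() <= candidate.getMaxTimestamp())
                return false;
        }
        return true;
    }
}
{code}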

> DTCS can leave time-overlaps, limiting ability to expire entire SSTables
> 
>
> Key: CASSANDRA-8243
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8243
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Björn Hegerfors
>Assignee: Björn Hegerfors
>Priority: Minor
>  Labels: compaction, performance
> Fix For: 2.0.12, 2.1.2
>
> Attachments: cassandra-trunk-CASSANDRA-8243-aggressiveTTLExpiry.txt
>
>
> CASSANDRA-6602 (DTCS) and CASSANDRA-5228 are supposed to be a perfect match 
> for tables where every value is written with a TTL. DTCS makes sure to keep 
> old data separate from new data. So shortly after the TTL has passed, 
> Cassandra should be able to throw away the whole SSTable containing a given 
> data point.
> CASSANDRA-5228 deletes the very oldest SSTables, and only if they don't 
> overlap (in terms of timestamps) with another SSTable which cannot be deleted.
> DTCS, however, can't guarantee that SSTables won't overlap (again, in terms 
> of timestamps). In a test that I ran, every single SSTable overlapped with 
> its nearest neighbors by a very tiny amount. My reasoning for why this could 
> happen is that the dumped memtables were already overlapping from the start. 
> DTCS will never create an overlap where there is none. I surmised that this 
> happened in my case because I sent parallel writes which must have come out 
> of order. This was just locally, and out-of-order writes should be much more 
> common non-locally.
> That means that the SSTable removal optimization may never get a chance to 
> kick in!
> I can see two solutions:
> 1. Make DTCS split SSTables on time window borders. This will essentially 
> only be done on a newly dumped memtable once every base_time_seconds.
> 2. Make TTL SSTable expiry more aggressive. Relax the conditions on which an 
> SSTable can be dropped completely, of course without affecting any semantics.





[jira] [Created] (CASSANDRA-8243) DTCS can leave time-overlaps, limiting ability to expire entire SSTables

2014-11-02 Thread JIRA
Björn Hegerfors created CASSANDRA-8243:
--

 Summary: DTCS can leave time-overlaps, limiting ability to expire 
entire SSTables
 Key: CASSANDRA-8243
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8243
 Project: Cassandra
  Issue Type: Bug
Reporter: Björn Hegerfors
Assignee: Björn Hegerfors
Priority: Minor
 Fix For: 2.0.12, 2.1.2


CASSANDRA-6602 (DTCS) and CASSANDRA-5228 are supposed to be a perfect match for 
tables where every value is written with a TTL. DTCS makes sure to keep old 
data separate from new data. So shortly after the TTL has passed, Cassandra 
should be able to throw away the whole SSTable containing a given data point.

CASSANDRA-5228 deletes the very oldest SSTables, and only if they don't overlap 
(in terms of timestamps) with another SSTable which cannot be deleted.

DTCS, however, can't guarantee that SSTables won't overlap (again, in terms of 
timestamps). In a test that I ran, every single SSTable overlapped with its 
nearest neighbors by a very tiny amount. My reasoning for why this could happen 
is that the dumped memtables were already overlapping from the start. DTCS will 
never create an overlap where there is none. I surmised that this happened in 
my case because I sent parallel writes which must have come out of order. This 
was just locally, and out-of-order writes should be much more common 
non-locally.

That means that the SSTable removal optimization may never get a chance to kick 
in!

I can see two solutions:
1. Make DTCS split SSTables on time window borders. This will essentially only 
be done on a newly dumped memtable once every base_time_seconds.
2. Make TTL SSTable expiry more aggressive. Relax the conditions on which an 
SSTable can be dropped completely, of course without affecting any semantics.





[jira] [Updated] (CASSANDRA-8194) Reading from Auth table should not be in the request path

2014-11-02 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-8194:
-
  Reviewer:   (was: Aleksey Yeschenko)
  Priority: Minor  (was: Major)
  Assignee: (was: Vishy Kasar)
Issue Type: Improvement  (was: Bug)

This is indeed planned - for some future version, when there is someone with 
spare cycles to deal with it.

> Reading from Auth table should not be in the request path
> -
>
> Key: CASSANDRA-8194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8194
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Vishy Kasar
>Priority: Minor
>
> We use PasswordAuthenticator and PasswordAuthorizer. The system_auth keyspace 
> has an RF of 10 per DC over 2 DCs. The permissions_validity_in_ms is 5 minutes. 
> We still have a few thousand requests failing each day with the trace below. 
> The reason is a cache read realizing that the cached entry has expired and 
> issuing a blocking request to refresh it. 
> The cache should be refreshed periodically in the background only; the user 
> request should simply read the cache and never try to refresh it. 
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - 
> received only 0 responses.
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2258)
>   at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
>   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3994)
>   at 
> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4878)
>   at 
> org.apache.cassandra.service.ClientState.authorize(ClientState.java:292)
>   at 
> org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:172)
>   at 
> org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:165)
>   at 
> org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:149)
>   at 
> org.apache.cassandra.cql3.statements.ModificationStatement.checkAccess(ModificationStatement.java:75)
>   at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:102)
>   at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:113)
>   at 
> org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1735)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4162)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4150)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
>   at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>   at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - 
> received only 0 responses.
>   at org.apache.cassandra.auth.Auth.selectUser(Auth.java:256)
>   at org.apache.cassandra.auth.Auth.isSuperuser(Auth.java:84)
>   at 
> org.apache.cassandra.auth.AuthenticatedUser.isSuper(AuthenticatedUser.java:50)
>   at 
> org.apache.cassandra.auth.CassandraAuthorizer.authorize(CassandraAuthorizer.java:68)
>   at org.apache.cassandra.service.ClientState$1.load(ClientState.java:278)
>   at org.apache.cassandra.service.ClientState$1.load(ClientState.java:275)
>   at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
>   at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
>   at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
>   ... 19 more
> Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation 
> timed out - received only 0 responses.
>   at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:105)
>   at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:943)
>   at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:828)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:140)
>   at org.apache.cassandra.auth.Auth.selectUser(Auth.java:245)
>   ... 28 more
> ERRO
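
For what it's worth, Guava's cache API can already express the suggested 
behavior. A sketch, assuming a string key and treating the auth-table read as 
a placeholder (this is not the actual ClientState code):

{code:java}
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.util.concurrent.ListenableFuture;
import com.google.common.util.concurrent.ListenableFutureTask;

public class PermissionsCacheSketch
{
    private static final Executor RELOADER = Executors.newSingleThreadExecutor();

    // refreshAfterWrite keeps serving the stale entry while reload() runs on a
    // background executor, so the request path never blocks on the auth table
    // (except for the very first load of a key).
    static LoadingCache<String, Set<String>> build()
    {
        return CacheBuilder.newBuilder()
                           .refreshAfterWrite(5, TimeUnit.MINUTES)
                           .build(new CacheLoader<String, Set<String>>()
                           {
                               @Override
                               public Set<String> load(String user) throws Exception
                               {
                                   return readPermissionsFromAuthTable(user); // placeholder
                               }

                               @Override
                               public ListenableFuture<Set<String>> reload(final String user, Set<String> oldValue)
                               {
                                   ListenableFutureTask<Set<String>> task = ListenableFutureTask.create(new Callable<Set<String>>()
                                   {
                                       public Set<String> call() throws Exception
                                       {
                                           return readPermissionsFromAuthTable(user);
                                       }
                                   });
                                   RELOADER.execute(task);
                                   return task;
                               }
                           });
    }

    private static Set<String> readPermissionsFromAuthTable(String user) throws Exception
    {
        throw new UnsupportedOperationException("placeholder for the blocking auth read");
    }
}
{code}

Note the first request for a key still blocks on load(); a fully non-blocking 
path would also need the cache warmed up front.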

[jira] [Updated] (CASSANDRA-7813) Decide how to deal with conflict between native and user-defined functions

2014-11-02 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7813:

Attachment: 7813v5.txt

> Decide how to deal with conflict between native and user-defined functions
> --
>
> Key: CASSANDRA-7813
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7813
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Robert Stupp
>  Labels: cql
> Fix For: 3.0
>
> Attachments: 7813.txt, 7813v2.txt, 7813v3.txt, 7813v4.txt, 7813v5.txt
>
>
> We have a bunch of native/hardcoded functions (now(), dateOf(), ...) and in 
> 3.0, users will be able to define new functions. Now, there is a very high 
> chance that we will provide more native functions over time (to be clear, I'm 
> not particularly for adding native functions for allthethings just because we 
> can, but it's clear that we should ultimately provide more than what we 
> have). Which begs the question: how do we want to deal with the problem of 
> adding a native function potentially breaking a previously defined 
> user-defined function?
> A priori I see the following options (maybe there are more?):
> # don't do anything specific, hoping that it won't happen often and consider 
> it a user problem if it does.
> # reserve a big number of names that we're hoping will cover all future needs.
> # make native functions and user-defined functions syntactically distinct so 
> it cannot happen.
> I'm not a huge fan of solution 1). Solution 2) is actually what we did for 
> UDT but I think it's somewhat less practical here: there are only so many 
> types that it makes sense to provide natively, so it wasn't too hard to come 
> up with a reasonably small list of type names to reserve just in case. This 
> feels a lot harder for functions to me.
> Which leaves solution 3). Since we already have the concept of namespaces for 
> functions, a simple idea would be to force user functions to have a 
> namespace. We could even allow that namespace to be empty as long as we force 
> the namespace separator (so we'd allow {{bar::foo}} and {{::foo}} for user 
> functions, but *not* {{foo}} which would be reserved for native functions).
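
A tiny sketch of what option 3) means for lookup - the registries and the 
{{::}} parsing are invented for illustration, not Cassandra's actual function 
resolution:

{code:java}
import java.util.Map;

public final class FunctionResolutionSketch
{
    // Option 3: a bare name ("foo") can only ever be a native function, while
    // any name containing the namespace separator ("bar::foo", "::foo") can
    // only ever be user-defined, so new natives can't shadow existing UDFs.
    static <F> F resolve(String name, Map<String, F> natives, Map<String, F> userDefined)
    {
        return name.contains("::") ? userDefined.get(name) : natives.get(name);
    }
}
{code}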





[jira] [Updated] (CASSANDRA-7813) Decide how to deal with conflict between native and user-defined functions

2014-11-02 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7813:

Attachment: (was: 7813v5.txt)

> Decide how to deal with conflict between native and user-defined functions
> --
>
> Key: CASSANDRA-7813
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7813
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Robert Stupp
>  Labels: cql
> Fix For: 3.0
>
> Attachments: 7813.txt, 7813v2.txt, 7813v3.txt, 7813v4.txt, 7813v5.txt
>
>
> We have a bunch of native/hardcoded functions (now(), dateOf(), ...) and in 
> 3.0, users will be able to define new functions. Now, there is a very high 
> chance that we will provide more native functions over time (to be clear, I'm 
> not particularly for adding native functions for allthethings just because we 
> can, but it's clear that we should ultimately provide more than what we 
> have). Which begs the question: how do we want to deal with the problem of 
> adding a native function potentially breaking a previously defined 
> user-defined function?
> A priori I see the following options (maybe there are more?):
> # don't do anything specific, hoping that it won't happen often and consider 
> it a user problem if it does.
> # reserve a big number of names that we're hoping will cover all future needs.
> # make native functions and user-defined functions syntactically distinct so 
> it cannot happen.
> I'm not a huge fan of solution 1). Solution 2) is actually what we did for 
> UDT but I think it's somewhat less practical here: there are only so many 
> types that it makes sense to provide natively, so it wasn't too hard to come 
> up with a reasonably small list of type names to reserve just in case. This 
> feels a lot harder for functions to me.
> Which leaves solution 3). Since we already have the concept of namespaces for 
> functions, a simple idea would be to force user functions to have a 
> namespace. We could even allow that namespace to be empty as long as we force 
> the namespace separator (so we'd allow {{bar::foo}} and {{::foo}} for user 
> functions, but *not* {{foo}} which would be reserved for native functions).





[jira] [Commented] (CASSANDRA-5863) In process (uncompressed) page cache

2014-11-02 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193874#comment-14193874
 ] 

T Jake Luciani commented on CASSANDRA-5863:
---

bq. why not just take a LRU approach?

This is mostly what my initial attempt does, but the perf impact of always 
putting entries into the cache is high (since it uses an off-heap memcpy).

I can resurrect that code and show how it looks. Perhaps the new cache impl 
Vijay is working on will improve this.

> In process (uncompressed) page cache
> 
>
> Key: CASSANDRA-5863
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5863
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: T Jake Luciani
>  Labels: performance
> Fix For: 3.0
>
>
> Currently, for every read, the CRAR reads each compressed chunk into a 
> byte[], sends it to ICompressor, gets back another byte[] and verifies a 
> checksum.  
> This process is where the majority of time is spent in a read request.  
> Before compression, we had zero-copy reads and could respond directly from 
> the page cache.
> It would be useful to have some kind of Chunk cache that could speed up this 
> process for hot data, possibly off heap.
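
As a strawman for what such a chunk cache could look like - the key layout, 
the LRU policy and the sizing are assumptions, not a proposed design:

{code:java}
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Strawman: an LRU map of uncompressed chunks keyed by (file path, chunk
// offset), letting hot reads skip the read-decompress-checksum cycle above.
public final class ChunkCacheSketch
{
    private final long capacityBytes;
    private long sizeBytes = 0;
    // accessOrder=true turns LinkedHashMap into an LRU structure
    private final LinkedHashMap<String, ByteBuffer> chunks = new LinkedHashMap<>(16, 0.75f, true);

    public ChunkCacheSketch(long capacityBytes)
    {
        this.capacityBytes = capacityBytes;
    }

    public synchronized ByteBuffer get(String path, long chunkOffset)
    {
        return chunks.get(path + '@' + chunkOffset);
    }

    public synchronized void put(String path, long chunkOffset, ByteBuffer uncompressed)
    {
        ByteBuffer previous = chunks.put(path + '@' + chunkOffset, uncompressed);
        if (previous != null)
            sizeBytes -= previous.remaining();
        sizeBytes += uncompressed.remaining();

        // evict least-recently-used chunks until we are back under capacity
        while (sizeBytes > capacityBytes && !chunks.isEmpty())
        {
            Map.Entry<String, ByteBuffer> eldest = chunks.entrySet().iterator().next();
            sizeBytes -= eldest.getValue().remaining();
            chunks.remove(eldest.getKey());
        }
    }
}
{code}

The "possibly off heap" part would swap the ByteBuffers for native memory plus 
explicit lifetime management; the LRU bookkeeping keeps the same shape.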





[jira] [Updated] (CASSANDRA-7813) Decide how to deal with conflict between native and user-defined functions

2014-11-02 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7813:

Attachment: 7813v5.txt

Attached v5 of the patch; it fixes an inconsistency with upper/lower-case 
function names (quoted = keep case, unquoted = toLowerCase).

> Decide how to deal with conflict between native and user-defined functions
> --
>
> Key: CASSANDRA-7813
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7813
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Robert Stupp
>  Labels: cql
> Fix For: 3.0
>
> Attachments: 7813.txt, 7813v2.txt, 7813v3.txt, 7813v4.txt, 7813v5.txt
>
>
> We have a bunch of native/hardcoded functions (now(), dateOf(), ...) and in 
> 3.0, users will be able to define new functions. Now, there is a very high 
> chance that we will provide more native functions over time (to be clear, I'm 
> not particularly for adding native functions for allthethings just because we 
> can, but it's clear that we should ultimately provide more than what we 
> have). Which begs the question: how do we want to deal with the problem of 
> adding a native function potentially breaking a previously defined 
> user-defined function?
> A priori I see the following options (maybe there are more?):
> # don't do anything specific, hoping that it won't happen often and consider 
> it a user problem if it does.
> # reserve a big number of names that we're hoping will cover all future needs.
> # make native functions and user-defined functions syntactically distinct so 
> it cannot happen.
> I'm not a huge fan of solution 1). Solution 2) is actually what we did for 
> UDT but I think it's somewhat less practical here: there are only so many 
> types that it makes sense to provide natively, so it wasn't too hard to come 
> up with a reasonably small list of type names to reserve just in case. This 
> feels a lot harder for functions to me.
> Which leaves solution 3). Since we already have the concept of namespaces for 
> functions, a simple idea would be to force user functions to have a 
> namespace. We could even allow that namespace to be empty as long as we force 
> the namespace separator (so we'd allow {{bar::foo}} and {{::foo}} for user 
> functions, but *not* {{foo}} which would be reserved for native functions).





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-02 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193785#comment-14193785
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

bq. I don't think we changed the format, did i?
Ah - no. Sorry - got confused with the in-memory serialization.

bq. item.refcount
What I mean is the (Intel) CPU L1+L2 cache line size (64 bytes). If 
'refcount' is updated (e.g. just for a cache-get), the whole cache line is 
invalidated (twice) and needs to be re-fetched from RAM although its content 
did not change. It's just a point for optimization - if we find a viable 
solution for that, we should implement it.
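
One way to act on that observation, sketched with invented names: keep the 
hot refcounts out of the entry's cache line entirely, padded so adjacent 
counters don't share a line either.

{code:java}
import java.util.concurrent.atomic.AtomicIntegerArray;

// Sketch: refcounts live in their own array rather than in each entry header,
// so bumping one on a cache-get never dirties the line holding entry bytes.
public final class RefCountsAsideSketch
{
    // 16 ints * 4 bytes = 64 bytes between counters, one typical x86 cache
    // line, so two entries' refcounts never falsely share a line
    private static final int STRIDE = 16;

    private final AtomicIntegerArray counts;

    public RefCountsAsideSketch(int entries)
    {
        counts = new AtomicIntegerArray(entries * STRIDE);
    }

    public int retain(int slot)
    {
        return counts.incrementAndGet(slot * STRIDE);
    }

    public int release(int slot)
    {
        return counts.decrementAndGet(slot * STRIDE);
    }
}
{code}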

> Serializing Row cache alternative (Fully off heap)
> --
>
> Key: CASSANDRA-7438
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Linux
>Reporter: Vijay
>Assignee: Vijay
>  Labels: performance
> Fix For: 3.0
>
> Attachments: 0001-CASSANDRA-7438.patch
>
>
> Currently SerializingCache is partially off heap; keys are still stored in 
> the JVM heap as BB. 
> * There are higher GC costs for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better 
> results, but this requires careful tuning.
> * The memory overhead of the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off 
> heap and use JNI to interact with the cache. We might want to ensure that the 
> new implementation matches the existing APIs (ICache), and the implementation 
> needs to have safe memory access, low memory overhead and as few memcpys as 
> possible.
> We might also want to make this cache configurable.





[jira] [Comment Edited] (CASSANDRA-7874) Validate functionality of different JSR-223 providers in UDFs

2014-11-02 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193779#comment-14193779
 ] 

Robert Stupp edited comment on CASSANDRA-7874 at 11/2/14 10:46 AM:
---

Of course not! My bad; re-created the patch and the changes are not in there - 
no idea where these came from - sorry about that.
Ah - now I know! I merged on Oct 11th at 12:34 but created the patch some 
hours later, effectively reverting another commit... My bad.


was (Author: snazy):
Of course not! My bad; re-created the patch and the changes are not in there - 
no idea where these came from - sorry about that.

> Validate functionality of different JSR-223 providers in UDFs
> -
>
> Key: CASSANDRA-7874
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7874
> Project: Cassandra
>  Issue Type: Task
>  Components: Core
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>  Labels: udf
> Attachments: 7874.txt, 7874v2.txt, 7874v3.txt, 7874v4.txt
>
>
> CASSANDRA-7526 introduces the ability to support optional JSR-223 providers 
> like Clojure, Jython, Groovy or JRuby.
> This ticket is about testing functionality with these providers, not about 
> including them in the C* distribution.
> The expected result is a "how to" document, wiki page or similar.
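
For readers unfamiliar with JSR-223: the mechanism under discussion is plain 
javax.script engine discovery. A minimal example - the Groovy engine resolves 
only if its jar is on the classpath, which is exactly the "optional provider" 
point:

{code:java}
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class Jsr223Example
{
    public static void main(String[] args) throws ScriptException
    {
        // discovers engines registered on the classpath via META-INF/services
        ScriptEngine groovy = new ScriptEngineManager().getEngineByName("groovy");
        if (groovy == null)
            throw new IllegalStateException("no Groovy JSR-223 provider on the classpath");

        groovy.put("x", 21);
        System.out.println(groovy.eval("x * 2")); // prints 42
    }
}
{code}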





[jira] [Updated] (CASSANDRA-7874) Validate functionality of different JSR-223 providers in UDFs

2014-11-02 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7874:

Attachment: 7874v4.txt

Of course not! My bad; re-created the patch and the changes are not in there - 
no idea where these came from - sorry about that.

> Validate functionality of different JSR-223 providers in UDFs
> -
>
> Key: CASSANDRA-7874
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7874
> Project: Cassandra
>  Issue Type: Task
>  Components: Core
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>  Labels: udf
> Attachments: 7874.txt, 7874v2.txt, 7874v3.txt, 7874v4.txt
>
>
> CASSANDRA-7526 introduces the ability to support optional JSR-223 providers 
> like Clojure, Jython, Groovy or JRuby.
> This ticket is about testing functionality with these providers, not about 
> including them in the C* distribution.
> The expected result is a "how to" document, wiki page or similar.





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-02 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193760#comment-14193760
 ] 

Vijay commented on CASSANDRA-7438:
--

Pushed, Thanks!
{quote}
We should ensure that changes in the serialized format of saved row caches are 
detected
{quote}
I don't think we changed the format, did i?
{quote}
item.refcount - if refcount is updated, the whole cache line needs to be 
re-fetched (CPU)
{quote}
The refcount is per item in the cache; for every item inserted, we track it in 
its memory location. 

> Serializing Row cache alternative (Fully off heap)
> --
>
> Key: CASSANDRA-7438
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Linux
>Reporter: Vijay
>Assignee: Vijay
>  Labels: performance
> Fix For: 3.0
>
> Attachments: 0001-CASSANDRA-7438.patch
>
>
> Currently SerializingCache is partially off heap; keys are still stored in 
> the JVM heap as BB. 
> * There are higher GC costs for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better 
> results, but this requires careful tuning.
> * The memory overhead of the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off 
> heap and use JNI to interact with the cache. We might want to ensure that the 
> new implementation matches the existing APIs (ICache), and the implementation 
> needs to have safe memory access, low memory overhead and as few memcpys as 
> possible.
> We might also want to make this cache configurable.


