xin jin created CASSANDRA-13904:
-----------------------------------

             Summary: Performance improvement of Cassandra UDF/UDA
                 Key: CASSANDRA-13904
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13904
             Project: Cassandra
          Issue Type: Improvement
          Components: CQL
            Reporter: xin jin
            Priority: Critical
             Fix For: 3.11.x


Hi All,

We ran a few experiments and found that running a query with direct UDF 
execution is ten times faster than with async UDF execution. The inline 
comment "Using async UDF execution is expensive (adds about 100us overhead per 
invocation on a Core-i7 MBPr)" at 
https://insight.io/github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/UDFunction.java?line=293
shows that this is known behavior. My questions are as follows:

1. What are the main pros and cons of these two methods? Can I find any 
documents that discuss this?  

2. Are there any plans to improve the performance of async UDF execution? One 
simple approach that comes to mind is some form of batching, e.g., replacing 
the current row-by-row invocation with one that hands several rows to each 
async call. Are there any concerns with this?

3. How do people generally deal with this performance issue? It does not seem 
to be treated as urgent or important, given that it is known and still 
present, so I assume people already have a good workaround.
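
To make the batching idea in question 2 concrete, here is a minimal Java 
sketch. It is not Cassandra's code: the class BatchedUdfSketch, its methods, 
and the batchSize and timeout parameters are all hypothetical, and a plain 
java.util.function.Function stands in for the UDF. It only shows how one 
executor hand-off per batch could replace one hand-off per row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical illustration only; none of these names exist in Cassandra.
public class BatchedUdfSketch {

    // Row-by-row: one executor hand-off (submit + blocking get) per row.
    public static <I, O> List<O> applyPerRow(ExecutorService pool, Function<I, O> udf,
                                             List<I> rows, long timeoutMs) throws Exception {
        List<O> out = new ArrayList<>(rows.size());
        for (I row : rows) {
            Future<O> f = pool.submit(() -> udf.apply(row));
            out.add(f.get(timeoutMs, TimeUnit.MILLISECONDS));
        }
        return out;
    }

    // Batched: one hand-off per batchSize rows, so the fixed per-submit cost
    // is shared by every row in the batch.
    public static <I, O> List<O> applyBatched(ExecutorService pool, Function<I, O> udf,
                                              List<I> rows, int batchSize,
                                              long timeoutMsPerBatch) throws Exception {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<O> out = new ArrayList<>(rows.size());
        for (int start = 0; start < rows.size(); start += batchSize) {
            List<I> batch = rows.subList(start, Math.min(start + batchSize, rows.size()));
            Future<List<O>> f = pool.submit(() -> {
                List<O> partial = new ArrayList<>(batch.size());
                for (I row : batch) {
                    partial.add(udf.apply(row));
                }
                return partial;
            });
            // The timeout now covers the whole batch, not a single invocation.
            out.addAll(f.get(timeoutMsPerBatch, TimeUnit.MILLISECONDS));
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            List<Integer> rows = new ArrayList<>();
            for (int i = 0; i < 10; i++) {
                rows.add(i);
            }
            Function<Integer, Integer> udf = x -> x * 2;
            List<Integer> perRow = applyPerRow(pool, udf, rows, 1000);
            List<Integer> batched = applyBatched(pool, udf, rows, 4, 1000);
            // Both strategies should produce the same results in the same order.
            System.out.println(perRow.equals(batched));
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch the trade-off is that the timeout and any exception apply to a 
whole batch rather than to one invocation, so a single slow or failing row 
affects every row batched with it. That is one of the concerns I would like 
feedback on.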

Thank you in advance for your comments.

Best regards,

Xin




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

