Hello Warp 10 users!

First of all, I'm sorry for my English.

I have a question about the performance of Warp 10, and especially of WarpScript.
I ran a test of database access and computation to check the performance.
Then I compared it with a Cassandra database where I store time series, queried
from a Java program.

The test fetches 2 days of data, resamples them, then filters the values between
two bounds with 2 MAP operations.
I used the script from
https://www.cerenit.fr/blog/timeseries-state-duration/ to do this test.
I used the Warp 10 configuration from the Docker image with the ci suffix.

Here is the command that sets up the data set on Warp 10.
It generates a time series 'test' containing one year of data with a repeating ramp.

echo "" | awk ' 
{
    # My system is in mico seconds.
    startTs = 1577836800000000 # Stat ts at 2020-01-01T00:00:00Z
    print startTs"// test{name=oneyeardata,signal=voltage} 0.0"
    
    oneDayInSec = 24 * 60 * 60
    oneYearInSec = 365 * oneDayInSec

    # We insert 4 ramps in one day
    for (nbSec = 1; nbSec < oneYearInSec; nbSec++) 
    {
        print "=" startTs + nbSec * 1000000 "// " nbSec % (oneDayInSec / 4) 
".0"
    }
}' | curl -v -H 'X-Warp10-Token: writeTokenCI' -H 'Transfer-Encoding: 
chunked' -X POST --data-binary @- "http://localhost:8080/api/v0/update";
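
For reference, the first lines of the generated payload should look roughly like
this (the leading '=' reuses the class and labels of the previous line):

1577836800000000// test{name=oneyeardata,signal=voltage} 0.0
=1577836801000000// 1.0
=1577836802000000// 2.0
...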


Here is the WarpScript that does the computation:

// This script is based on the script found here:
// https://www.cerenit.fr/blog/timeseries-state-duration/
 
// Storing the token into a variable
'readTokenCI' 'token' STORE
$token AUTHENTICATE
2147483647 LIMIT
2147483647 MAXBUCKETS 
 
1577836800 1 s * 'start' STORE // 2020-01-01T00:00:00Z as seconds since epoch, converted to platform time units
86400 1 s * 'one_day' STORE
 
 
// Step 1 - FETCH the data, with the token as first parameter
[
    $token
    // Here you must put the classname and label selectors...
    'test'
    { 'signal' 'voltage' 'name' 'oneyeardata' }
    $start 2 d + 
    $start 
] FETCH  0 GET 
 
// Step 2 - Resampling 
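// BUCKETIZE parameters: [ gts bucketizer lastbucket bucketspan bucketcount ]
// lastbucket = 0 and bucketcount = 0 are derived from the data; bucketspan is
// in platform time units (microseconds here, so 10000 = 10 ms).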
[ SWAP bucketizer.last 0 10000 0 ] BUCKETIZE 0 GET 
 
DUP // Keep the fetched data on the stack and work on the duplicate.
 
// Step 3 - Apply the filter with two MAP operations
// Keep values in the range [20, 2000)
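// MAP parameters: [ gts mapper prewindow postwindow occurrences ]
// With 0 0 0 the mapper is applied to each datapoint individually.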
[ SWAP 20.0 mapper.ge 0 0 0 ] MAP
[ SWAP 2000.0 mapper.lt 0 0 0 ] MAP
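
To relate the timings below to the number of datapoints actually processed, one
could also count the points after each step; a minimal sketch (the variable name
is just an example):

// For example, right after Step 1 or Step 2:
DUP SIZE 'pointCount' STORE // SIZE returns the number of values in the GTS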
 

Executing the WarpScript from WarpStudio, I get:

   - Step 1 - 300 ms to fetch 2 days of data
   - Step 2 - 900 ms to resample
   - Step 3 - 6 seconds to filter.
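
If it helps to double-check these numbers, the elapsed time of each step can
also be measured inside the script itself with NOW (current time in platform
time units, so microseconds here); a minimal sketch, with example variable names:

NOW 'tStart' STORE
// ... Step 1 (the FETCH block above) ...
NOW $tStart - 'fetchElapsedUs' STORE

NOW 'tStart' STORE
// ... Step 2 (the BUCKETIZE block above) ...
NOW $tStart - 'bucketizeElapsedUs' STORE

// Push the measurements so they show up in the result
$fetchElapsedUs $bucketizeElapsedUs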

To do the same thing in Cassandra + Java, I got:

   - Step 1 - 600 ms to fetch (the Cassandra server runs on the same machine as
   the Java code)
   - Step 2 - 131 ms to resample
   - Step 3 - 1 ms to filter.
   

I ran the test on my laptop:

   - i7-8650U (8th generation)
   - 32 GB RAM
   - SSD



I suspect I didn't do a good job with the WarpScript.
Could you help me validate these results?
From your point of view, is my WarpScript OK?
Is the configuration of Warp 10 optimal for this kind of job?

Or maybe this is not the right way to do the job.
Would it be better to split the work into more, smaller jobs?


Thank you in advance for your answers or comments :-D 
Best regards,
Fabien.
