Re: [rrd-users] Fetch the time of the first entry
Steven Sim unixan...@outlook.com wrote:

> I'm being asked to write a script to append data to an rrd and plot daily, weekly and monthly graphs of said data. I already have a script to create an rrd and write data to it. To append, I'll simply bypass the 'creation' logic of my script and use a user-specified rrd file.

It may be just the way you write, but the wording makes me wonder whether you properly understand how RRD works. In RRD there is no such function as appending data, only updating. You specify the number of consolidated data points to keep when you create the file; after that the size never changes and you only update values.

___
rrd-users mailing list
rrd-users@lists.oetiker.ch
https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
Re: [rrd-users] Fetch the time of the first entry
Simon;

Thanks for pointing this out!

I intend to process a data file each day but create the rrd database on the FIRST day of the month. The database will be created with enough rows to hold the entire month. I will plot daily graphs until the end of the month, at which point I will plot the entire month.

Say, for example, I had a step interval of 15 minutes (900 seconds):

    rrdtool create $RRDB --step 900 --start $STARTIME \
    DS:.. DS:.. . . . DS:.. DS:.. RRA:AVERAGE:0.5:1:2880

2880 = 24 hours x 4 data points per hour x 30 (thirty days in a month). There are 4 data points per hour because the step is 15 minutes.

Would the above be correct?

Deepest Regards
Steven Sim
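The row-count arithmetic above can be checked with a small shell helper. This is only a sketch: `rra_rows` is a hypothetical function name, not part of rrdtool, and it assumes one row per step (the `1` in `RRA:AVERAGE:0.5:1:2880`).

```shell
#!/bin/sh
# rra_rows STEP_SECONDS DAYS
# Hypothetical helper (not an rrdtool command): number of rows an RRA
# needs to cover DAYS days when it stores one row per step.
rra_rows() {
    step=$1
    days=$2
    echo $(( days * 86400 / step ))
}

rra_rows 900 30    # 15-minute step, 30-day month: prints 2880
rra_rows 900 31    # 15-minute step, 31-day month: prints 2976
```

Note that 2880 rows cover exactly 30 days, so a 31-day month would need 2976 rows at the same step.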
Re: [rrd-users] Fetch the time of the first entry
Steven Sim unixan...@outlook.com wrote:

> I intend to process a data file each day but create the rrd database on the FIRST day of the month. The creation shall ensure sufficient data points in the first day to contain the entire month but plot daily graphs until the end of the month, whereby it will plot the entire month.
>
> Say for example, if I had a step interval of 15 minutes (900 seconds),
>
>     rrdtool create $RRDB --step 900 --start $STARTIME \
>     DS:.. DS:.. . . . DS:.. DS:.. RRA:AVERAGE:0.5:1:2880
>
> 2880 = 24 hours x 4 data points per hour x 30 (thirty days in a month)
> 4 data points since one hour has 4 data points (15 minutes step).
>
> Would the above be correct?

Yes, your numbers are correct, but I wonder about your methodology. It might help if you said what you are trying to achieve, because this is a rather unusual way of using RRD.

Normally, you simply create one RRD that collects, stores, and consolidates the data you want. Typically this means keeping high-resolution data for a shortish time, and keeping progressively lower resolution for progressively longer times. For example, most applications don't need to keep (say) 5-minute-resolution traffic data from 2 years ago. That's not to say you can't keep several years' worth of high-resolution data if you want to.

By keeping separate RRD files for each 'month', you'll find that it's hard work if you later want to do a graph for a year! You'd also need to lock your filled data files, otherwise one update to the wrong file could wipe out all the data.
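To illustrate the "one RRD, several resolutions" layout described above, here is a rough sketch that prints RRA definitions for a 900-second step. The helper name `rra_spec` and the retention periods (31 days, one year, about five years) are assumptions chosen for illustration; the xff value 0.5 is copied from the command quoted above.

```shell
#!/bin/sh
# rra_spec CF STEP_SECONDS STEPS_PER_ROW DAYS
# Hypothetical helper: prints an RRA definition in the form
# RRA:CF:xff:steps:rows, with enough rows to cover DAYS days.
rra_spec() {
    cf=$1
    step=$2
    per_row=$3
    days=$4
    rows=$(( days * 86400 / (step * per_row) ))
    echo "RRA:${cf}:0.5:${per_row}:${rows}"
}

rra_spec AVERAGE 900 1 31      # 15-minute rows for 31 days
rra_spec AVERAGE 900 4 366     # hourly rows for a year
rra_spec AVERAGE 900 96 1830   # daily rows for about five years
```

The three printed strings could be appended to a single `rrdtool create` command after the DS definitions, so that one file serves the daily, monthly and yearly graphs.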
Re: [rrd-users] Fetch the time of the first entry
Simon;

Thanks for your very swift reply. I'm still digesting it, but I do have a small query.

My "rrdtool first rrdfile" command keeps returning a different time after each rrdtool update. Why is that so? Shouldn't it ALWAYS return the Unix timestamp of the first entry?

Deepest Regards
Steven Sim
Re: [rrd-users] Fetch the time of the first entry
Steven Sim unixan...@outlook.com wrote:

> my rrdtool first rrdfile command keeps returning different timing after each rrdtool update. Why is that so? Shouldn't it ALWAYS return the Unix time stamp for the first entry?

It's not a function I've used ... The docs say it should return the first value entered. Once you've filled a data set, it's going to give you the timestamp of the oldest available value. I'm not sure what it gives you for an RRD file that hasn't been filled. Here I'm using the term "filled" to mean you've done enough updates that the whole database (or at least data series) has been filled with data and you are now losing older data to make way for the new data.

Have you looked at how the value is changing? Does it by any chance advance at the same rate as the timestamp of your updates?
Re: [rrd-users] Fetch the time of the first entry
Simon;

Something is seriously wrong and I don't know what it is.

My Perl script parses the data file just fine. It plots just fine. The legends are correct. The dates are correct.

BUT when I use "rrdtool first rrd_database" in an attempt to get the time of the first entry, I get a Unix timestamp which is one entire month EARLIER than the first entry in the file. And each time I do an update, "rrdtool first rrd_database" returns a DIFFERENT number.

The rrd database is created with an RRA sufficient to contain an entire month of readings taken every 15 minutes (900 seconds).

I would appreciate any suggestions.

Deepest Regards
Steven Sim
Re: [rrd-users] Fetch the time of the first entry
Steven Sim unixan...@outlook.com wrote:

> BUT when I use rrdtool first rrd database in an attempt to get the time of the first entry, I get a Unix time stamp which is one entire month EARLIER than the first entry in the file. And each time I do update, rrdtool first rrd_database returns a DIFFERENT number.

With each update, does the FIRST value move to one month (or 30 days) before the timestamp of the update? If so, then I think I know what's happening.

When you create the RRD file, it is created in its entirety, in your case with buckets for 2880 consolidated values. These exist regardless of what updates you do or do not do, and their timespan is determined by your step and consolidation values.

What I suspect is happening, and what I alluded to earlier, is that even though you haven't done any updates for those historical buckets, they are still there, and FIRST is merely reporting the timestamp of the oldest bucket. Since you set the start time of the RRD when you create it, the timestamp of the oldest bucket will be 30 days prior to that. As you perform updates, you overwrite the oldest buckets and the value of FIRST advances to stay 30 days prior to the last update.

Internally, RRD stores no record of whether a bucket actually received any updates, only its value after applying the consolidation rules specified. Thus, if the oldest bucket's value is NaN, there is no way to know whether it ever had an update: it may have been created with NaN and never overwritten, or it may have been updated with a calculated value of NaN. You would have to search back through the database to find the oldest bucket with an actual value and infer that this bucket was *probably* the one with the oldest data.
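The "search back through the database" step could be sketched as a filter over `rrdtool fetch` output. Assumptions: the output has one `timestamp: value ...` line per row, and unknown values are printed as `nan`, `-nan` or `NaN`. `oldest_with_data` is a hypothetical helper name, and the sample rows fed to it below are made up for illustration.

```shell
#!/bin/sh
# oldest_with_data
# Hypothetical helper: reads rrdtool-fetch-style output on stdin and
# prints the timestamp of the first row in which at least one value
# is not NaN. Prints nothing if every row is NaN.
oldest_with_data() {
    awk '
        /^[0-9]+:/ {
            for (i = 2; i <= NF; i++) {
                if (tolower($i) !~ /^-?nan$/) {
                    sub(/:$/, "", $1)   # drop the trailing colon
                    print $1
                    exit
                }
            }
        }
    '
}

# Made-up sample in the assumed fetch format (header, blank line, rows).
oldest_with_data <<'EOF'
                        CPUBusy

1401581700: -nan
1401582600: nan
1404172800: 1.2500000000e+01
1404173700: 1.3000000000e+01
EOF
```

With a real file the input would come from something like `rrdtool fetch myrrd.rrd AVERAGE --start <first> --end <last>` piped into the function. As noted above, the result is only *probably* the first update, because an update can itself produce NaN.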
Re: [rrd-users] Fetch the time of the first entry
> Simon; Something is seriously wrong and I don't know what it is. My Perl script parses the data file just fine. It plots just fine. The legends are correct The dates are correct. BUT when I use rrdtool first rrd database in an attempt to get the time of the first entry, I get a Unix time stamp which is one entire month EARLIER than the first entry in the file.

As already explained, rrdtool first gives you the first available slot.

Maybe you want to look at the data provided by the VDEF function FIRST. From the documentation: "Return the last/first non-nan or infinite value for the selected data stream, including its timestamp."

Example:

    VDEF:first=mydata,FIRST

If you run rrdtool graph without actually using any graphing elements, and then PRINT (not GPRINT) this value's time component, you can use it in a script.

HTH
Alex
Re: [rrd-users] Fetch the time of the first entry
Simon;

Thanks deeply pal.

So what you and Alex van den Bogaerdt are saying is that FIRST does NOT give the first actual 'update' but the first available slot, which is 30 days prior to the last update. And yes, the FIRST value changes in step with each of my actual updates, which lends credence to your explanation.

One question:

When I create the RRD, I am forced to give the --start TIME as one step BEHIND my actual first update. So let's say my data starts at midnight on 1 July; I am forced to create the RRD with a --start time of 11:45 pm on 30 June.

    data starts - 1st July 00:00:00 -- first data entry
    Step = 15 minutes (900 seconds)
    RRD has to be created with --start 11:45 pm, 30th of June (one step behind)

If I do not create it 1 step behind, I get an error with my updates. Why is that so?

Secondly, thanks to Alex van den Bogaerdt: I should use the VDEF function, run rrdtool graph without graphing, and use PRINT to pass the value to a Perl script variable.

Deepest Regards
Steven Sim
Re: [rrd-users] Fetch the time of the first entry
Alex;

Er ... I cannot seem to get the syntax correct. Can you advise?

    rrdtool graph - VDEF:FIRST:myrrd.rrd:CPUBusy:FIRST

where CPUBusy is the defined DS and myrrd.rrd is the RRD database file itself.

The above keeps failing:

    $ rrdtool graph - VDEF:FIRST=myrrd.rrd:CPUBusy:FIRST
    ERROR: Cannot parse line 'VDEF:FIRST=myrrd.rrd.rrd:CPUBusy:FIRST'

Deepest Regards
Steven Sim
Re: [rrd-users] Fetch the time of the first entry
Steven Sim unixan...@outlook.com wrote:

> When I create the RRD, I am forced to give the --start TIME as one step BEHIND my actual update. So let's say my data starts at midnight 1st July, I am forced to create a RRD with --start time at 11:45 pm 30th of June.
>
> data starts - 1st July 00:00:00 -- first data entry.
> Step = 15 minutes (900 seconds)
> RRD has to be created with --start 11:45 pm 30th of June (one step behind).
>
> if I do not create it 1 step behind, I get an error with my updates.

This came up recently. In effect, the START parameter sets the last update time for the RRD. As you can't have two updates with the same time, you need to set the start parameter to a time before the timestamp of your first update.

In principle the actual value doesn't matter; it could even be 0. However, if it's a long time ago, then on your first update RRD will have to roll forward and process all the non-existent updates to get up to date. This came up when someone found that, after a few days with no updates (machine down), the first update took a lot longer than normal.
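A minimal sketch of picking the --start value from the timestamp of the first data point, following the explanation above. `create_start` is a hypothetical helper name, and the example timestamp 1404172800 assumes that "midnight 1st July" means 2014-07-01 00:00:00 UTC.

```shell
#!/bin/sh
# create_start FIRST_UPDATE_EPOCH STEP_SECONDS
# Hypothetical helper: returns a --start value one step before the
# first update, so that the first update is strictly later than the
# last-update time recorded at creation.
create_start() {
    first_update=$1
    step=$2
    echo $(( first_update - step ))
}

create_start 1404172800 900   # prints 1404171900
```

Going back exactly one step is a convenience that keeps the start time aligned with the step; by the reasoning above, any time strictly earlier than the first update should be accepted, while a much earlier one makes the first update slower.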
Re: [rrd-users] Fetch the time of the first entry
Alex van den Bogaerdt a...@vandenbogaerdt.nl wrote:

> As already explained, rrdtool first gives you the first available slot.

Looks like the documentation is misleading then. http://oss.oetiker.ch/rrdtool/doc/rrdfirst.en.html says:

"The first function returns the UNIX timestamp of the first data sample entered into the specified RRA of the RRD file."

I can't quite come up with the right words, though. It's the timestamp of the oldest bucket, or is "CDP slot" a better term?
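The "timestamp of the oldest bucket" reading can be written down as arithmetic. The sketch below is an assumption about how the value is derived (last update time rounded down to a row boundary, minus the span of the remaining rows), not a quotation of the rrdtool source, so it is worth checking against a real file. `rra_first` is a hypothetical helper name.

```shell
#!/bin/sh
# rra_first LAST_UPDATE_EPOCH STEP_SECONDS STEPS_PER_ROW ROWS
# Hypothetical helper: estimated timestamp of the oldest row in an RRA,
# assuming rows are aligned to multiples of (step * steps-per-row).
rra_first() {
    last_update=$1
    step=$2
    per_row=$3
    rows=$4
    span=$(( step * per_row ))
    echo $(( last_update - last_update % span - (rows - 1) * span ))
}

# Freshly created with --start 1404171900, before any update:
rra_first 1404171900 900 1 2880   # prints 1401580800
# After the first update at 1404172800:
rra_first 1404172800 900 1 2880   # prints 1401581700
```

Under this assumption the result moves forward by one step (900 seconds) with each update and always sits about 30 days before the last update, which matches the behaviour reported in this thread.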
Re: [rrd-users] Fetch the time of the first entry
You seem to be misunderstanding how RRDTool works.

Remember that, after every update of an RRA, the oldest value is thrown away to make room for the new one. So the 'first' time (the timestamp of the oldest bucket in the RRA) will increase every time you update, as the oldest bucket is thrown away.

Also, the 'first' value is the oldest bucket, *whether or not it has been filled*. So, if the RRA is 1 month long, then 'first' will be 1 month before the last update time, even if you have only updated once, since the buckets are implicitly created.

Remember RRDTool is not like an Oracle database: it is not an ever-increasing list of updates that starts at size 0 and gets constantly bigger.

Also, this is how to use a VDEF to get the first 'foo' item in foo.rrd, using the most appropriate AVERAGE RRA for the time window (the :strftime suffix prints the time component rather than the value):

    DEF:foods=foo.rrd:foo:AVERAGE
    VDEF:firstfoo=foods,FIRST
    PRINT:firstfoo:%c:strftime

Steve

Steve Shipway
University of Auckland ITS
UNIX Systems Design Lead
s.ship...@auckland.ac.nz
Ph: +64 9 373 7599 ext 86487
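For the file and data source names used earlier in the thread (myrrd.rrd and CPUBusy), the DEF/VDEF/PRINT arguments could be assembled as in the sketch below. `first_data_args` and the variable names `v` and `firstv` are hypothetical, and the function only prints the arguments; it does not run rrdtool.

```shell
#!/bin/sh
# first_data_args RRD_FILE DS_NAME
# Hypothetical helper: prints, one per line, the graph arguments that
# ask for the time of the first non-NaN value of DS_NAME.
# Assumes RRD_FILE contains no colons or spaces (colons would need
# escaping inside a DEF).
first_data_args() {
    rrd=$1
    ds=$2
    printf '%s\n' \
        "DEF:v=${rrd}:${ds}:AVERAGE" \
        "VDEF:firstv=v,FIRST" \
        "PRINT:firstv:%c:strftime"
}

first_data_args myrrd.rrd CPUBusy
```

The printed arguments would then be passed to `rrdtool graph` with no graphing elements, for example with /dev/null as the output file. Because graph only looks at the requested time window, --start and --end probably need to span the whole RRA (for instance the values returned by `rrdtool first` and `rrdtool last`) for FIRST to see the earliest data.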