On Tuesday 21 February 2006 15:44, Rudolf Cejka wrote:
> Kern Sibbald wrote (2006/02/21):
> > > One thread would read data and compute md5, the second thread would
> > > just try to write data to the tape.
> >
> > This is already the case.  Reading the data and computing the md5 sum is
> > done in the FD, writing the data is done in the SD.  Not only are they
> > separate threads, but they are separate processes.
>
> Oops, I'm sorry - I wanted to say crc32 instead of md5, my fault.
> I tried to measure time for read(), time for computing crc32 and
> time for write() and computing crc32 was significant time delay
> between reading and writing within speed about 80 MB/s on
> Xeon 3.6 GHz.

Ah, yes, the crc32 is computed for each block by the SD just before writing
the block to the tape.  Interesting that it slows things down.  It sounds like
it is worth looking into.

>
> However now I'm again looking into sources and couldn't find it anymore,
> if there is secured full data stream by crc32, or just something or
> nothing anymore. I'll look at it once more.

Unless there is a bug, the crc32 is still computed and checked when reading
back each block.

>
> The other point was in that one thread may just read and the second may
> just write with some memory buffering and synchronization, which can
> (or yes - can not too) help in feeding data to the tape. This is inspired
> by star (ftp://ftp.berlios.de/pub/star/), which helped us with performance
> tuning several times in the past.

Yes, I have always planned to have a sort of circular pool of buffers that
would be filled by the job threads by reading the data from the FD, then
handed off to a single device that actually writes the buffer to the Volume.
This would probably speed things up a good deal.

>
> > There are *many* ways to speed up the performance, both external to
> > Bacula and internal to Bacula. For the moment, internal changes will have
> > to wait for some contributor because I am still adding essential
> > features.
>
> Yes - fully agree. I meant it mainly as that it is good to have
> it widely known...
>
> > Maybe I am not understanding you correctly, but the above is not strictly
> > true. Depending on how the user configures the SD, there can be one
> > spooling area per device.
>
> I more meant "one spooling area per thread" - if I have allowed maximum N
> parallel flows storing to bacula-sd, it would be interesting to have
> 1 to N spool areas, where if despooling is in action, a particular
> spool area would have an option that spooling can be paused during this
> time, so that the spool area is reserved just for reading or just for writing.

Yes, perhaps this would help, but it seems to me a lot of complexity that will
only benefit users with separate spool disks.  Implementing a separate write
thread as discussed above would *probably* give a good performance boost at
low programming cost -- I say *probably* because performance enhancements can
be tricky to the point of being counterproductive if one is not careful to
identify the real bottlenecks before changing things ...

>
> > This is not generally a problem worth changing since disks
> > tend to be significantly faster than tapes ...
>
> The situation is changing since new tape drives have native speed
> above 50 MB/s, whereas disks are still beaten by moves of heads
> measured in milliseconds, so the real reading speed from disks
> is not changing very much over time.
>
> > If one really wants tapes to write faster, someone might come up with a
> > patch to the Bacula SD tape driver that uses overlapped I/O (I think this
> > is sometimes called asynchronous I/O).
>
> I thought about it too, however I am not sure how much it would help.
> I do not believe in OS, SW and HW so much in this way.

After thinking about it a bit more, an asynchronous write to a tape is a bad
idea and is better handled by the separate write thread.  However, to get more
data across a network, asynchronous socket writes, or using a multiple value
write to reduce the current two socket writes per data item to one, could
significantly increase the data rate across a fast network ...
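
The buffer pool and separate write thread discussed above can be sketched
roughly as follows. This is only an illustration in Python, not Bacula code:
the names (job_thread, writer_thread, run, POOL_SIZE, BLOCK_SIZE) and the
sizes are hypothetical, the "device" is an in-memory stream standing in for
a tape, and the pool is recycled through two queues rather than a true ring.
Job threads fill buffers taken from the pool, and a single writer computes
the per-block crc32 and writes the block.

```python
import io
import queue
import threading
import zlib

BLOCK_SIZE = 64 * 1024   # hypothetical block size, not a Bacula value
POOL_SIZE = 8            # hypothetical number of buffers in the pool

_DONE = object()         # sentinel telling the writer to stop


def job_thread(job_id, nblocks, free_bufs, full_bufs):
    """Stand-in for a job thread receiving data from the FD."""
    for _ in range(nblocks):
        buf = free_bufs.get()                # blocks while the pool is empty
        buf[:] = bytes([job_id]) * BLOCK_SIZE  # pretend this came from the FD
        full_bufs.put(buf)                   # hand the block to the writer


def writer_thread(device, free_bufs, full_bufs, stats):
    """Single thread that checksums and writes every block."""
    while True:
        buf = full_bufs.get()
        if buf is _DONE:
            break
        stats["crcs"].append(zlib.crc32(buf))  # crc32 just before the write
        device.write(buf)
        stats["blocks"] += 1
        free_bufs.put(buf)                   # recycle the buffer


def run(njobs=3, nblocks=5):
    """Run njobs producers against one writer; return the data and stats."""
    free_bufs = queue.Queue()
    for _ in range(POOL_SIZE):
        free_bufs.put(bytearray(BLOCK_SIZE))
    full_bufs = queue.Queue()
    device = io.BytesIO()                    # in-memory stand-in for the tape
    stats = {"blocks": 0, "crcs": []}

    writer = threading.Thread(
        target=writer_thread, args=(device, free_bufs, full_bufs, stats))
    writer.start()
    jobs = [
        threading.Thread(
            target=job_thread, args=(j, nblocks, free_bufs, full_bufs))
        for j in range(njobs)
    ]
    for t in jobs:
        t.start()
    for t in jobs:
        t.join()
    full_bufs.put(_DONE)                     # all blocks queued; stop writer
    writer.join()
    return device.getvalue(), stats
```

Calling run(njobs=3, nblocks=5) pushes 15 blocks through the single writer.
Because the pool is fixed in size, the job threads stall when the writer
falls behind instead of consuming unbounded memory. Whether this helps in
practice depends on where the real bottleneck is, so it should be measured
first.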