Pawel (privately) wrote:
Or at least, you THINK they do ;-)

> Hi,
>
> The distribution algorithm is very simple - it uses part of the date (the day) contained in the key to distribute records. So we have 32 partfiles. IDs not matching some pattern (say 1A5N) are put in partfile 32. For those matching the pattern there is only 1 invocation of ICONV and OCONV.

That's not very efficient for record key calculations I am afraid, even though conversions are very fast (relatively) on jBASE 4.1, and then you have to call the subroutine in the first place of course.

> The day is obtained from the date and returned as the partfile number. The procedure cannot be simpler (a few lines) I think :) The performance problem arises when you ask for data with selection criteria. jBASE will start to call the distribution subroutine thousands of times.

Well, obviously, it must. It does not know how you are calculating the item ids and yet it must traverse every item in order to test the criteria.

> This will introduce enormous overhead. We usually do not need to ask queries like that, but for some (CSHD) investigations we are forced to do it like that.

Well, if you get the same order for the file as you do if you LIST each individual part file, then you could be.

> The main difference is that jBASE runs the distribution routine for these "full scan" selects and I cannot understand why it needs to do that.

How can it do otherwise? The list must be the list of record keys.

> I guess that the SELECT / READNEXT operations of the jEDI driver implemented for distributed files handle distribution virtually (so the SELECT program is not aware of partfiles),

Yep. And because it is a calculated key, it probably isn't using the fastscan interface so performance will be very low in comparison.

> but it just performs SELECT / READNEXT + READ of the record. This is inefficient, because READ introduces unnecessary overhead caused by calling the distribution routine. Results can be obtained much faster by doing (direct) SELECTs on the partfiles and combining the output.

Yes - that is what I said you should be doing. Then there are specific routines you can use to merge lists. Or you can wait for my new file system and not bother with the distribution as you won't need it ;-)

> This is however an optimization for the jBASE team, not us I believe.

No - we optimized for the general case, but if you are going to take over the key (or rather partition selection), there is nothing to be done but ask you for it.

> We already raised it, but I noticed "resistance" in accepting this ticket :(

I think that the ticket isn't correct is probably the reason. This is what you get from the partitions as it stands. You should probably raise a ticket that steps back and asks for advice on choosing
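
For anyone following the thread, here is a rough sketch of the distribution rule described above, modelled in Python rather than jBC so it can be run anywhere. It is not the actual subroutine. Several things are assumptions: that "1A5N" means one letter followed by five digits, that those digits are a Pick-style internal date (days counted from 31 December 1967), and the names part_number, split_into_parts and merge_part_lists, which are made up for illustration.

```python
import re
from datetime import date, timedelta

# Assumption: the digits in the key are a Pick-style internal date,
# i.e. a day count where day 0 is 31 December 1967.
PICK_EPOCH = date(1967, 12, 31)

# Assumption: "1A5N" is read as one letter followed by five digits.
KEY_PATTERN = re.compile(r"[A-Za-z][0-9]{5}")

CATCH_ALL_PART = 32  # part file for keys that do not match the pattern


def part_number(key):
    """Return the part file (1-32) that a record key is assigned to."""
    if not KEY_PATTERN.fullmatch(key):
        return CATCH_ALL_PART
    internal_date = int(key[1:])
    # Day of the month (1-31) of the date embedded in the key.
    return (PICK_EPOCH + timedelta(days=internal_date)).day


def split_into_parts(keys):
    """Group keys by part number; costs one part_number call per key."""
    parts = {}
    for key in keys:
        parts.setdefault(part_number(key), []).append(key)
    return parts


def merge_part_lists(parts):
    """Concatenate per-part key lists without calling part_number again."""
    merged = []
    for number in sorted(parts):
        merged.extend(parts[number])
    return merged
```

The point of the last function is only that, once the keys are already sitting in their part files, the lists from each part can be combined without asking the distribution rule about every key again, which is the "select the parts directly and merge" approach discussed above. How jBASE itself merges lists is not shown here.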
- jBASE unefficient? - distributed files Pawel (privately)
- Re: jBASE unefficient? - distributed files Jim Idle
- Re: jBASE unefficient? - distributed files CLIF
- Re: jBASE unefficient? - distributed files Pawel (privately)
- Re: jBASE unefficient? - distributed files Jim Idle
- Re: jBASE unefficient? - distributed fi... Pawel (privately)
