Short version: I need to load some fields from records into a big text thingy.
The code runs on the server side only, and I'm keen to preserve RAM. What are the trade-offs in V17 between *GOTO SELECTED RECORD* and *SELECTION TO ARRAY*? I've been using *SELECTION TO ARRAY*, but it's hard to read, write, and maintain. And, I realized, it might be de-optimized for memory because you have to load all of the data you're processing into arrays. (Yes, you can chunk it, but that doesn't change the fundamental point that you pre-load a lot of data.) Any test results or thoughts? I considered a fair range of options and did comparison tests on none. The long version below includes more details on the two solutions I'm down to, plus the ideas that I discarded.

TL;DR version

I'm working in V17 and I'm hoping that someone has done some real-world tests already that could help me out with a question. Here's the setup: I need to load up some fields from lots of records and push them into an external system. It's going to Postgres, but that's not an important detail; the result is a ginormous text object. The result could just as well be a text or JSON file dump.

The main constraint is available memory. Performance matters when there are millions of records but, typically, the only important consideration is memory. As far as the final solution goes, it's ideally code that's easy to write, read, and maintain. As a plus, we can position the code to run server side, so client-server optimization isn't an issue. And, for the record, in lots of cases there isn't enough data to make memory an issue at all, so readable, reliable code is definitely a preference.

Note: Yes, I can chunk data in ranges, etc. to keep things within my memory footprint. I'm doing that...but the question still remains.

Here are the solutions I've come up with:

*QUERY* and a *For* loop with *GOTO SELECTED RECORD*

Easy to read, write, and maintain. But when you use *GOTO SELECTED RECORD*, do you get the whole record in V17? Without fat fields? Since this is server-side or stand-alone, should I care?
On the upside, you're only loading one record at a time, so only burning through memory for that record while you use it.

*SELECTION TO ARRAY* and a *For* loop

This is what I have been doing...based on old habits as much as anything. Yes, you only get the columns you want, but it gets _all_ of the rows at once. So, you burn up a lot of memory with the arrays and then duplicate++ that memory when building up the output. On the code side, that kind of *SELECTION TO ARRAY*-loop-read-by-index code is ugly, tedious to write, and tedious to maintain. It's clear(ish) and reliable, but only worth it if it pays for itself somehow. In other words, it has to be a good deal better than *GOTO SELECTED RECORD* to be worth it. Says the guy who has been doing all *SELECTION TO ARRAY* forever.

Entity Selection and a *For* or *For each* loop

I have no clue why an entity selection is *C_OBJECT* instead of *C_COLLECTION*, to give you a sense of how much I know about this stuff. I was happy to discover that you can easily create an entity selection from a current selection, so old style queries work fine:

*C_OBJECT*($stuff_es)
*QUERY*([Stuff];[Stuff]Counter>=10000)
$stuff_es:=*Create entity selection*([Stuff])

The resulting *For*/*For each* loop code is very readable, it's == *GOTO SELECTED RECORD*, but with a different syntax. Otherwise, same same. I *suspect* that the memory use here is excellent. I'm guessing that as you navigate through the entity selection, you're only really pulling the data you use. But maybe not. If you do a *For each*, you get an object (entity) with all of the fields. So, possibly this approach is even worse than *GOTO SELECTED RECORD* which, I'm guessing, doesn't load as many fields. I haven't tested these points out in any way.
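
For concreteness, here is roughly what the first two options look like side by side. This is a sketch only, untested and not measured: it assumes [Stuff]ID is a Longint (hence the *String* call) and reuses the [Stuff]Counter query from above. The names $i, $count, and $ids_al are made up for illustration.

```4d
  // Option 1: QUERY + For loop with GOTO SELECTED RECORD.
  // Only one record is current at a time.
C_LONGINT($i;$count)
C_TEXT($output_text)

QUERY([Stuff];[Stuff]Counter>=10000)
$count:=Records in selection([Stuff])
$output_text:=""
For ($i;1;$count)
	GOTO SELECTED RECORD([Stuff];$i)
	  // Assumption: ID is numeric, so convert it before appending.
	$output_text:=$output_text+String([Stuff]ID)+Char(Carriage return)
End for 

  // Option 2: SELECTION TO ARRAY + For loop.
  // Only the chosen column is loaded, but for every row in the selection at once.
ARRAY LONGINT($ids_al;0)

QUERY([Stuff];[Stuff]Counter>=10000)
SELECTION TO ARRAY([Stuff]ID;$ids_al)
$output_text:=""
For ($i;1;Size of array($ids_al))
	$output_text:=$output_text+String($ids_al{$i})+Char(Carriage return)
End for 
```

With a single field the two look about equally readable. The gap in readability shows up once there are several fields, because option 2 needs one array declaration and one indexed read per column. Chunking option 2 would presumably mean swapping in *SELECTION RANGE TO ARRAY* inside an outer loop, which adds yet more bookkeeping.
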
If anyone has dug into this, it would be great to know about the difference (if any) in what 4D loads when you:

-- Use *GOTO SELECTED RECORD*
-- Use a *For each* loop on an entity selection, which builds an $entity_object which you can then read/write to/from like $entity_object.ID
-- Use a *For* loop on an entity selection and then reference $specific_es[0].ID

It's pretty easy to imagine different ways that 4D might have implemented things that are more or less efficient in each of these cases. I have no idea what they actually did.

I'm kind of curious about this behavior in V17, but have already talked myself out of using entity selections. Why? Because the table and field references are brittle and *case-sensitive*. Man, I truly hate case-sensitive names. When do I want them? Never. Not once, and I never will. This isn't all on 4D, many languages are case-sensitive. It makes sense if you're a computer. I'm not a computer, I'm a person...to me it's just horrible. Anyway, not exclusively a 4D problem...because in 4D you can avoid it altogether.

For those that haven't been following along at home, here's a hello world level V17 *For each* loop over an entity selection:

*C_OBJECT*($stuff_object)
*For each* ($stuff_object;$stuff_es) // The loop automatically populates $stuff_object as it iterates through the list.
$output_text:=$output_text+$stuff_object.ID+*Char*(*Carriage return*)
*End for each*

See that $stuff_object.ID statement? The ID part = [Stuff]ID. It's all case-sensitive. Rename the field in the structure to id three months from now and the code above breaks. And for "breaks", you don't get a compiler error, you don't get a syntax error in the Method Editor, and you likely don't get a runtime error. Your code just screws up silently. So, yeah, not going that way.

*Note*: Collections are very handy when the source data is a big static JSON. It makes the static values highly interactive. I wrote a little screen like that last week and loved the results.
*Note*: In a *For each* loop, I can't find a way to read the index of the current item. Like, that you're on item 23. You can get the total item count with .length, but I see no way to get the current index. The same goes for collections. An index can be useful when you've got a progress indicator to update. You can always roll your own $index:=$index+1 sort of thing. Reminder: All of the new V17 stuff is 0 (offset) indexed, not 1 (position) indexed.

Honorable mention: *Selection to JSON*

Yeah, kind of nice...a very excellent command in some situations. In this case, wildly wrong, I'd say. You load the whole JSON in one go so you get your source data + formatting + names. It's pretty flabby. Then you have to parse and walk that to get the proper text. If 4D had a Selection to text (->Table;Template) system that was *not* JSON, I'd be golden. That would be perfect. The *Selection to JSON* code doesn't allow in-line functions, so there's that. Oh, wait, 4D does have a command like this...*PROCESS 4D TAGS*. Hmmm. Yeah, probably the best approach for memory and the worst for brittleness. Not going there.

Okay, so does anyone have any relevant, V17-based test results yet? I don't have the time or appetite to do the tests myself and won't be surprised if no one else has either. Not to be a **** about it, but I'm only interested in *test results*. It's fun to estimate program behavior from first principles, but it has pretty much zero predictive value. Having just spewed out a bunch of speculation, I certainly can't hold it against anyone else for riffing too.

I've spent some embarrassing number of hours (for hours read "months") of my life testing 4D performance and, well, you have to test to find out. Conventional wisdom tends to be *worse* than random guessing. It's great to hear theories and stories from the folks at 4D, but that's all they are...stories and theories. Background information can give you a better idea of what to test and where to look, but that's all.
Modern machines + modern OS + 4D + your code + all of the various subcomponents (RAM, network, SSD)...it's a lot. So, it's not a criticism to say only testing can hope to turn up meaningful results. Given all of those factors, narrow tests are ideal and obviously can't be generalized too far. Still, lots better than speculation!

Thanks.

**********************************************************************
4D Internet Users Group (4D iNUG)
Archive: http://lists.4d.com/archives.html
Options: https://lists.4d.com/mailman/options/4d_tech
Unsub: mailto:4d_tech-unsubscr...@lists.4d.com
**********************************************************************