On Tue, Apr 14, 2009 at 2:12 PM, Miles <vardpeng...@gmail.com> wrote:
> [This is sort of in continuation of the thread "Build Settings for Release:
> App/Library is bloated", which gradually changed topics.]
>
> I'm trying to find the best way to load in a 2MB text file of dictionary
> words and be able to do quick searches.
>
> Simply loading the uncompressed txt file takes about 0.5 seconds which I can
> handle. But when I used the following to create an array of the words from
> the file:
>
>     NSArray *lines = [stringFromFileAtPath componentsSeparatedByString:@"\n"];
>
> ... it took about 13 seconds, which is way too long.
>
> I'm not super concerned about the 2MB of disk space the txt file takes up,
> although I wouldn't be mad about decreasing it somehow. And once I get the
> whole dictionary in an array, the searches are basically fast enough for my
> purposes. I've still been reading up on Huffman encoding if I decide to try
> to compress this. However, my main issue now is loading time, and it seems
> like this won't help me there.

For best loading time, you want a file format which can be loaded on demand, instead of all at once up front. Designing your own such format is highly non-trivial, and I don't recommend doing that at least until you're at the point where you're ready to ignore recommendations from the likes of me.

SQLite has this property and would be a good choice of storage format if that's what you're after. The downside is, of course, that the per-query time goes up, as a tradeoff. If you can stand queries taking longer (but still being individually instantaneous from the user's point of view) in exchange for nearly zero load time, this is a good way to go.

If you really do want to load everything up front, my recommendation would be to do as much parsing as possible on the C side of things before you move over to Cocoa. Rather than load the entire file into an NSString and then split it up from there, read the raw bytes, search for \n directly, and then load the individual lines into NSStrings. NSString has a lot of fancy capabilities like Unicode awareness that you simply don't need for this, and which will cost you a lot.

Using an existing format with existing optimizations, like Apple's binary plist format, could also be a good way to go here.

Mike
_______________________________________________
Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
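
For illustration, here is a minimal sketch of the "read the raw bytes, search for \n directly" suggestion, in plain C. It is not code from the thread: the names LineSpan and split_lines are hypothetical, and it assumes the whole file is already in memory (for example from fread, or from the bytes of an NSData). It records where each line starts and how long it is, without copying anything or doing any Unicode processing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: one line of the word list, as a pointer into the
   original buffer plus a length. Nothing is copied, so the buffer must
   outlive the array of spans. */
typedef struct {
    const char *start;
    size_t length;
} LineSpan;

/* Hypothetical helper: splits buf[0..len) on '\n'.
   Returns a malloc'd array of spans (caller frees) and stores the number
   of lines in *out_count. Returns NULL only if allocation fails.
   A trailing newline does not produce an empty final line. */
LineSpan *split_lines(const char *buf, size_t len, size_t *out_count)
{
    assert(buf != NULL && out_count != NULL);
    *out_count = 0;

    size_t capacity = 1024;
    LineSpan *lines = malloc(capacity * sizeof *lines);
    if (lines == NULL) return NULL;

    size_t count = 0;
    const char *p = buf;
    const char *end = buf + len;

    while (p < end) {
        /* memchr scans raw bytes for the next newline. */
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        const char *line_end = nl ? nl : end;

        /* Grow the array geometrically when it fills up. */
        if (count == capacity) {
            capacity *= 2;
            LineSpan *grown = realloc(lines, capacity * sizeof *lines);
            if (grown == NULL) { free(lines); return NULL; }
            lines = grown;
        }

        lines[count].start = p;
        lines[count].length = (size_t)(line_end - p);
        count++;

        /* Skip past the newline, or stop if this was the last line. */
        p = nl ? nl + 1 : end;
    }

    *out_count = count;
    return lines;
}
```

On the Cocoa side, each span could then be turned into a string with -[NSString initWithBytes:length:encoding:], either all at once or only for the lines a search actually touches. Whether this is fast enough for a 2MB word list is something to measure rather than assume.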