Hi Miles,

I wrote a little iPhone app to test loading the standard UNIX dictionary (/usr/share/dict/web2, 234,936 words). If you'd like, you can download the Xcode 3.1.x project from here:

http://www.restlessbrain.com/DictTest.zip

I don't actually have an iPhone, so I only tested it on the simulator. I tried 3 kinds of files:

a) the dictionary stored as a txt file, UTF-8 encoded (2.4 MB).
b) the dictionary stored as an XML plist (6.4 MB).
c) the dictionary stored as a binary plist (3.7 MB).

(I created (b) and (c) from (a) by using a text editor to wrap each line in <string></string> tags (thank you, grep!), then saved the resulting file twice with Property List Editor.app, once as XML and once as binary.)

The results (again, on the simulator and on my machine) are shown in the screenshot inside the project directory (Picture 1.png). There's little difference between txt (0.27 sec) and the binary plist (0.29 sec), but the XML plist takes more than twice as long (0.64 sec). Of course, to get a statistically meaningful result, you should change the code to reload each file several times and take the average, but what I did is enough to get an idea of the times involved.
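
The "reload several times and take the average" idea could be sketched roughly as below. This is plain C rather than the code in DictTest, and the function names load_file and average_load_seconds are made up for illustration. It only times reading the raw bytes, not plist parsing.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

/* Reads the whole file at `path` into memory and returns its size in bytes,
   or -1 on failure. The buffer is freed right away because only the elapsed
   time matters here. */
long load_file(const char *path) {
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return -1; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return -1; }
    rewind(f);
    char *buf = malloc((size_t)size + 1);
    if (!buf) { fclose(f); return -1; }
    size_t got = fread(buf, 1, (size_t)size, f);
    fclose(f);
    free(buf);
    return got == (size_t)size ? size : -1;
}

/* Loads `path` `runs` times and returns the mean wall-clock seconds per
   load, or -1.0 if `runs` is not positive or any load fails. */
double average_load_seconds(const char *path, int runs) {
    if (runs <= 0) return -1.0;
    double total = 0.0;
    for (int i = 0; i < runs; i++) {
        struct timeval t0, t1;
        gettimeofday(&t0, NULL);
        if (load_file(path) < 0) return -1.0;
        gettimeofday(&t1, NULL);
        total += (double)(t1.tv_sec - t0.tv_sec)
               + (double)(t1.tv_usec - t0.tv_usec) / 1e6;
    }
    return total / runs;
}
```

One caveat: after the first pass the file is probably sitting in the OS file cache, so the average will tend to flatter the cold-start time that a user actually sees on launch.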

I am curious what the times would be on the actual device. If you give it a try, please let me know.

Hope this helps.
Wagner

On Apr 14, 2009, at 8:12 PM, Miles wrote:

[This is sort of a continuation of the thread "Build Settings for Release: App/Library is bloated", which gradually changed topics.]
I'm trying to find the best way to load a 2MB text file of dictionary words and be able to do quick searches.

Simply loading the uncompressed txt file takes about 0.5 seconds, which I can handle. But when I used the following to create an array of the words from the file:
  NSArray *lines = [stringFromFileAtPath componentsSeparatedByString:@"\n"];

... it took about 13 seconds, which is way too long.
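
One possible alternative to building an NSArray of NSString objects is to split the raw bytes in place with plain C, which compiles inside an Objective-C project. The sketch below assumes the file has already been read into a NUL-terminated, writable buffer; the function name split_lines_in_place is hypothetical. It makes one allocation for the pointer array instead of one object per word. Whether that is actually faster on the device has not been measured here.

```c
#include <stdlib.h>
#include <string.h>

/* Splits `buf` into lines without copying any text. Every '\n' in `buf` is
   overwritten with '\0', and the returned array holds a pointer to the start
   of each line. `buf` must be NUL-terminated and writable, and it must stay
   alive for as long as the returned pointers are used. The caller frees the
   returned array (not the individual lines). Returns NULL if malloc fails. */
char **split_lines_in_place(char *buf, size_t *count) {
    /* First pass: count lines so the pointer array is allocated once. */
    size_t n = 0;
    for (const char *p = buf; *p; p++) {
        if (*p == '\n') n++;
    }
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] != '\n') n++;  /* last line has no newline */

    char **lines = malloc((n ? n : 1) * sizeof *lines);
    if (!lines) return NULL;

    /* Second pass: record each line start and terminate it. */
    size_t i = 0;
    char *start = buf;
    while (*start && i < n) {
        lines[i++] = start;
        char *nl = strchr(start, '\n');
        if (!nl) break;
        *nl = '\0';
        start = nl + 1;
    }
    *count = i;
    return lines;
}
```

The resulting C strings can be searched directly, or wrapped in NSString objects only for the words that are actually needed.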


I'm not super concerned about the 2MB of disk space the txt file takes up, although I wouldn't be mad about decreasing it somehow. And once I get the whole dictionary in an array, the searches are basically fast enough for my purposes. I'm still reading up on Huffman encoding in case I decide to compress this. However, my main issue now is loading time, and it seems like this won't help me there.
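
For the "quick searches" part, a sorted array of C strings can be searched with the standard library's bsearch. The sketch below is only an illustration; the names compare_words, sort_words, and contains_word are made up. It assumes the array is sorted in strcmp order. If the word file is ordered differently (for example, with capitalized words interleaved), sort once with qsort first, or pre-sort the file.

```c
#include <stdlib.h>
#include <string.h>

/* Comparator shared by qsort and bsearch: both hand it pointers to array
   elements, and each element is itself a pointer to a word. */
int compare_words(const void *a, const void *b) {
    const char *const *wa = a;
    const char *const *wb = b;
    return strcmp(*wa, *wb);
}

/* Sorts the word pointers into strcmp order (the text itself is not moved). */
void sort_words(const char **words, size_t count) {
    qsort(words, count, sizeof *words, compare_words);
}

/* Returns 1 if `word` is in the sorted array, 0 otherwise. */
int contains_word(const char *const *words, size_t count, const char *word) {
    return bsearch(&word, words, count, sizeof *words, compare_words) != NULL;
}
```

With roughly 235,000 words, each lookup takes about 18 string comparisons.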

And I'm looking into creating a Trie (which is where the previous thread guided me), although I'm not sure this helps my current issue of loading time either. I'm thinking that creating a Trie will probably take just as long as, or longer than, simply splitting the file using 'componentsSeparatedByString', right? So, is there some way to store the trie on disk so that loading my final data structure is faster? What other options do I have to speed this up?
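
One way a trie could be stored on disk is to keep all nodes in a single array and link them by index instead of by pointer, so the array can be written out and read back in one call with nothing to rebuild. The sketch below is a minimal illustration under that assumption, not a tuned implementation; every name in it (TrieNode, trie_insert, trie_save, and so on) is hypothetical. The file is written in the host's byte order and is trusted on load, so a real app would want a version header and some validation.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Children form a linked list via array indexes. Index 0 is the root and
   also means "no node", since nothing ever links back to the root. */
typedef struct {
    uint32_t first_child;
    uint32_t next_sibling;
    char     letter;
    uint8_t  terminal;   /* 1 if a word ends at this node */
    uint16_t reserved;   /* explicit padding so no stray bytes hit the disk */
} TrieNode;

typedef struct { TrieNode *nodes; uint32_t count, capacity; } Trie;

int trie_init(Trie *t) {
    t->capacity = 1024;
    t->nodes = calloc(t->capacity, sizeof *t->nodes);
    if (!t->nodes) return -1;
    t->count = 1;        /* slot 0 is the root */
    return 0;
}

/* Appends a zeroed node for `letter`; returns its index, or 0 on failure. */
uint32_t trie_new_node(Trie *t, char letter) {
    if (t->count == t->capacity) {
        uint32_t cap = t->capacity * 2;
        TrieNode *grown = realloc(t->nodes, cap * sizeof *grown);
        if (!grown) return 0;
        t->nodes = grown;
        t->capacity = cap;
    }
    TrieNode fresh = {0, 0, letter, 0, 0};
    t->nodes[t->count] = fresh;
    return t->count++;
}

/* Returns the child of `parent` labelled `letter`, or 0 if there is none. */
uint32_t trie_find_child(const Trie *t, uint32_t parent, char letter) {
    uint32_t c = t->nodes[parent].first_child;
    while (c != 0 && t->nodes[c].letter != letter) c = t->nodes[c].next_sibling;
    return c;
}

int trie_insert(Trie *t, const char *word) {
    uint32_t cur = 0;
    for (; *word; word++) {
        uint32_t child = trie_find_child(t, cur, *word);
        if (child == 0) {
            child = trie_new_node(t, *word);
            if (child == 0) return -1;
            /* Push the new node onto the front of the parent's child list. */
            t->nodes[child].next_sibling = t->nodes[cur].first_child;
            t->nodes[cur].first_child = child;
        }
        cur = child;
    }
    t->nodes[cur].terminal = 1;
    return 0;
}

int trie_contains(const Trie *t, const char *word) {
    uint32_t cur = 0;
    for (; *word; word++) {
        cur = trie_find_child(t, cur, *word);
        if (cur == 0) return 0;
    }
    return t->nodes[cur].terminal;
}

/* File layout: node count, then the raw node array. */
int trie_save(const Trie *t, const char *path) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    int ok = fwrite(&t->count, sizeof t->count, 1, f) == 1 &&
             fwrite(t->nodes, sizeof *t->nodes, t->count, f) == t->count;
    return (fclose(f) == 0 && ok) ? 0 : -1;
}

int trie_load(Trie *t, const char *path) {
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    uint32_t count = 0;
    TrieNode *nodes = NULL;
    int ok = fread(&count, sizeof count, 1, f) == 1 && count > 0 &&
             (nodes = malloc(count * sizeof *nodes)) != NULL &&
             fread(nodes, sizeof *nodes, count, f) == count;
    fclose(f);
    if (!ok) { free(nodes); return -1; }
    t->nodes = nodes;
    t->count = t->capacity = count;
    return 0;
}
```

The trie would be built once at development time and shipped as a resource, so the app only ever calls trie_load. The trade-off is size: at 12 bytes per node, the file may well end up larger than the 2MB word list, so it is worth measuring before committing to this layout.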

Thanks!

_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
