On Oct 7, 2009, at 10:12 PM, David Melgar wrote:

Hello,
I didn't mean to state threads as a requirement when I said "async", I just meant some way to get partial results, such as a call to a delegate I referenced in the previous note. And I'm certainly not seeking complexity of threads if I don't need it.

There isn't a simple way to do this. There is, however, a simple way to make your queries (in both Core Data, and direct SQLite) 100x faster.

In my case, == is not the query I need. I really want case insensitive match as I presumed LIKE would do, or better yet, 'LIKE foo%'.

Great. The Derived Property sample project on ADC shows you exactly what you need to do for replacing LIKE "foo" or LIKE "foo%"

In the meantime, I've just completed a test using SQLite.
The query in Coredata using 'LIKE foo%bar' across roughly 8 million records didn't return anything until the query was done, and the query took 2min 14sec,

This runs through ICU, and cannot use an index. It can trivially be made 100x faster, at least for equality, prefix, and suffix matching.

What's the actual SQL Core Data logs ?

Doing what I interpreted to be equivalent using SQLite across 20 million records returned in 11 seconds

What was the actual SQL you believed equivalent ?

and I was able to get my first result almost immediately without needing threads. They provide a callback scheme, analogous to calling a delegate in Cocoa. FYI, this is across 20 different databases, otherwise it may have been faster.

For what I'm doing, SQLite seems much faster, smaller footprint and easier to comprehend. I don't understand what I'm losing.

In your note you mention the cost of doing unicode aware regex and sorting. As far as I know, I don't need any of those, at least not yet.

All of your data is 100% 7 bit ASCII ? None of your customers will be Asian, European, or Hispanic ? None of your English customers will expect Unicode data that they copy and paste into your application from Safari to be preserved correctly ?

If your answer is no, then you'll need to implement your own string matching operations for SQLite using either CFString or ICU. SQLite's built in operators use memcmp(). Once you've done that, I'd love to see the performance results.

I took a brief look at the derived property example. Seems complex.

What about the example seems complex ? Everything you need is in 1 file, that's got 107 lines of code. You can just copy and paste that code. Make a searchText column that has preprocessed the text data into a simpler form that can be used with an index and a simpler query.

You say you want to do a case insensitive search across 20 million rows. You could absorb the cost of that, relative to a simple binary bitwise compare, for each of the 20 million rows. And pay it again every time you execute a query. Or, you could store a preprocessed column, and absorb that cost for just 1 string, the current search term "foo"

Perhaps for ASCII case insensitivity, that doesn't seem particularly interesting. But once you start dealing with real world text encodings, which every Mac OS X and iPhone OS customer expects from every app, this becomes a big deal. Unicode is complex and difficult to work with.

Something the framework should give me an option to handle rather than me having to generate such complex code. Does it advocate creating another field in the database as a normalized version of the field I really want to search on?

Yes.

Why would this be better than using SQLite directly?


Doing this in SQLite directly will also be 100x faster than what you're doing now. Regardless of whether or not you decide to use Core Data, this would be better.

- Ben

_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)

Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com

Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com

Reply via email to