Hmmm right, valid points.

The queries I'm doing are on slices of data that are preselected using
indices; right now I post-filter the results in the application, and I
was wondering whether I could do better filtering on the db side before
paying the serialization costs.
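
Concretely, instead of something like this (made-up table and column
names, just a sketch):

  SELECT data FROM docs WHERE type = 'order';  -- then filter in app code

I was wondering whether pushing the predicate into JSON1 would pay off:

  SELECT data FROM docs
   WHERE type = 'order'
     AND json_extract(data, '$.status') = 'open';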

In fact, I'm not facing performance issues right now; I'm mostly idly
musing, to get to know the tools at my disposal better.

I might also be lobbying for a JSON1 function that extracts an object's
keys as an array, if that would make sense.
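
(Today I can approximate that with json_each, e.g. this sketch:

  SELECT json_group_array(je.key)
    FROM json_each('{"a": 1, "b": 2}') AS je;  -- returns ["a","b"]

but a dedicated keys-extraction function would be more direct.)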

Wout.


On Thu, Mar 28, 2019 at 7:50 PM Warren Young <war...@etr-usa.com> wrote:

> On Mar 28, 2019, at 4:15 AM, Wout Mertens <wout.mert...@gmail.com> wrote:
> >
> >   - I don't see how json_type can help
>
> I don’t see “json_type” in this thread at all, other than this message.
>
> >   - Schemaless data is really nice to work with
>
> Sure, but it has a cost.  Unless you’re willing to give us a *lot* more
> information, you’ll have to decide if you’re willing and able to pay it,
> given your application constraints.
>
> By “more information,” I mean to a level equivalent to “hire one of us as
> a consultant on your project.”  We’d need full schema info, number of rows,
> queries per second stats, time-to-answer budgets, representative sample
> data…
>
> > the wrapper I use does
> >   allow putting parts of the JSON object into real columns but changing
> >   the production db schema all the time isn't nice
>
> You only have to change the DB schema each time you discover something new
> you want to index.  If you don’t even know yet what you need to index, how
> can you expect us to tell you, especially given how thin the information
> you’ve provided is?
>
> >   - I suppose I was hoping for some performance discussion of the
> >   queries,
>
> I gave you performance information based on my data, in my schema, with my
> queries.  You’ve given us your queries but no data and a faux schema, so
> naturally no one’s dissected your queries’ performance.
>
> Despite Jens’ objection, I’ll stand by my observation that since you don’t
> show any indices, we must assume that your queries are full-table scans,
> which in this case involves re-parsing each JSON object along the way.
>
> >   perhaps how to implement it using json_each?
>
> How would that solve any performance problem?  It’s still a full-table
> scan, lacking an index.
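>
> (For completeness, the json_each version would look roughly like this
> sketch, with hypothetical names, and it still visits every row:
>
>   SELECT d.rowid FROM docs AS d
>    WHERE EXISTS (SELECT 1 FROM json_each(d.data)
>                   WHERE key = 'status' AND value = 'open');
> )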
>
> I guess this is coming from the common belief that it's always faster to
> put the code in the database query, as opposed to doing it in the
> application code, but that's only true when the DB has more information
> than you do, so it can skip work, or when doing the processing at the DB
> level avoids one or more copies.  I'm not seeing that either applies here.
>
> “Put it in the database” can also avoid a lot of IPC overhead when using a
> client-server DB, but that cost isn’t one that happens with plain SQLite.
>
> >   - I'm thinking it would be nice if the JSON1 extension had a function
> >   to extract object keys as an array.
>
> If you don’t even know what keys you need to operate on until you see
> what’s available in each record, I’d say most of your processing should be
> at the application code level anyway.  And in that case, I’d tell you to
> just pull the JSON data as a string, parse it in your program, and iterate
> over it as necessary.
>
> SQL is meant for declarative queries, where you say “I need thus-and-so
> data,” which you can specify precisely.  It sounds like you cannot specify
> that query precisely, so it should probably be done with application logic.