Re: [sqlite] [EXTERNAL] Indexing multiple values per row

2018-03-16 Thread Richard Hipp
On 3/16/18, Jens Alfke  wrote:

> Yes, the solution involves creating a table as you described, but the table
> is derived from and dependent on the original JSON data, and needs to be
> kept up to date as that data changes. Triggers are the best way I know of to
> do that, but I'm open to other ideas.

Perhaps use FTS4 or FTS5 with a custom tokenizer that understands JSON?
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] Indexing multiple values per row

2018-03-16 Thread Jens Alfke


> On Mar 16, 2018, at 1:24 AM, Hick Gunter  wrote:
> 
> So in your application you don't care about when a temperature was measured? 
> Creating a table temps (patient_id, timestamp, temperature) with an index on 
> (timestamp,temperature) would yield a fast way to access patients with 
> elevated temperatures within a time frame.

That was just an example. I work on a general-purpose database layered on 
SQLite, whose data/document format is JSON, and which allows arbitrary queries 
of that JSON. The question is how to efficiently index values that occur inside 
arrays (or objects), i.e. where there are multiple values to be indexed per 
row. 

Yes, the solution involves creating a table as you described, but the table is 
derived from and dependent on the original JSON data, and needs to be kept up 
to date as that data changes. Triggers are the best way I know of to do that, 
but I'm open to other ideas.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] Indexing multiple values per row

2018-03-16 Thread Wout Mertens
True, I mean in case there were multiple dimensions. I should write down
hidden thinking processes more.

On Fri, Mar 16, 2018 at 9:58 AM Simon Slavin  wrote:

> On 16 Mar 2018, at 8:38am, Wout Mertens  wrote:
>
> > It seems to me that the only real option is to maintain a derived table,
> > the table could even be R*TREE to allow range queries.
>
> You can always do range queries on tables in SQLite.  Not sure what you
> mean by the above.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] Indexing multiple values per row

2018-03-16 Thread Simon Slavin
On 16 Mar 2018, at 8:38am, Wout Mertens  wrote:

> It seems to me that the only real option is to maintain a derived table,
> the table could even be R*TREE to allow range queries.

You can always do range queries on tables in SQLite.  Not sure what you mean by 
the above.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] Indexing multiple values per row

2018-03-16 Thread Wout Mertens
I have the same problem but luckily O(n) performance is fast enough for me
right now.

It seems to me that the only real option is to maintain a derived table,
the table could even be R*TREE to allow range queries.

Triggers seem to be the accepted way to derive tables, but I'm a little
afraid of them not always firing (sqlite versions, configuration glitches,
whatever reason), which would cause the table to be incorrect.

On Fri, Mar 16, 2018 at 9:24 AM Hick Gunter  wrote:

> So in your application you don't care about when a temperature was
> measured? Creating a table temps (patient_id, timestamp, temperature) with
> an index on (timestamp,temperature) would yield a fast way to access
> patients with elevated temperatures within a time frame.
>
> Other than that, using triggers is probably the easiest way to go.
>
> -Ursprüngliche Nachricht-
> Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> Im Auftrag von Jens Alfke
> Gesendet: Donnerstag, 15. März 2018 18:34
> An: SQLite mailing list 
> Betreff: [EXTERNAL] [sqlite] Indexing multiple values per row
>
> I'm wondering what the best way is to efficiently search for data values
> that can appear multiple times in a table row. SQLite indexes, even
> expression indexes, don't directly work for this because they obviously
> only index one value per row. Traditional relational-database design says
> to normalize the schema by storing the multiple values in separate rows,
> but what if the data to be indexed is JSON and you need to leave it in that
> form?
>
> For example, let's say I have a table like
> patient_id: 12345
> temps: "[98.6, 99.1, 101.3, 100.0, 98.9]"
> and I want to run queries on temperature data, like 'patients who've had a
> temperature above 101'. And I need better than O(n) performance.
>
> In the past my project used map/reduce to support this, essentially
> implementing its own index system on top of SQLite tables. In this case it
> would create a table (patient_id integer primary key, temp number) and
> populate it by scanning the patient table. This can obviously be indexed
> easily, but updating the table before a query when the source table has
> changed is a pain in the butt.
>
> I believe a SQL "materialized view" would do what I want, but SQLite
> doesn't support those; its views seem to be just syntactic sugar or macros
> around SELECT queries.
>
> Other than that, my best idea so far is to simplify the map/reduce
> updating by adding triggers on the source table that will add & remove rows
> from the index table.
>
> Is there any other clever way I could do this? (I've been using SQLite for
> 14 years and I still keep learning about more clever things it can do…)
>
> —Jens
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___
>  Gunter Hick | Software Engineer | Scientific Games International GmbH | 
> Klitschgasse
> 2 -4,
> A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100
> <+43%201%2080100> - 0
>
> May be privileged. May be confidential. Please delete if not the addressee.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] Indexing multiple values per row

2018-03-16 Thread Hick Gunter
So in your application you don't care about when a temperature was measured? 
Creating a table temps (patient_id, timestamp, temperature) with an index on 
(timestamp,temperature) would yield a fast way to access patients with elevated 
temperatures within a time frame.

Other than that, using triggers is probably the easiest way to go.

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von Jens Alfke
Gesendet: Donnerstag, 15. März 2018 18:34
An: SQLite mailing list 
Betreff: [EXTERNAL] [sqlite] Indexing multiple values per row

I'm wondering what the best way is to efficiently search for data values that 
can appear multiple times in a table row. SQLite indexes, even expression 
indexes, don't directly work for this because they obviously only index one 
value per row. Traditional relational-database design says to normalize the 
schema by storing the multiple values in separate rows, but what if the data to 
be indexed is JSON and you need to leave it in that form?

For example, let's say I have a table like
patient_id: 12345
temps: "[98.6, 99.1, 101.3, 100.0, 98.9]"
and I want to run queries on temperature data, like 'patients who've had a 
temperature above 101'. And I need better than O(n) performance.

In the past my project used map/reduce to support this, essentially 
implementing its own index system on top of SQLite tables. In this case it 
would create a table (patient_id integer primary key, temp number) and populate 
it by scanning the patient table. This can obviously be indexed easily, but 
updating the table before a query when the source table has changed is a pain 
in the butt.

I believe a SQL "materialized view" would do what I want, but SQLite doesn't 
support those; its views seem to be just syntactic sugar or macros around 
SELECT queries.

Other than that, my best idea so far is to simplify the map/reduce updating by 
adding triggers on the source table that will add & remove rows from the index 
table.

Is there any other clever way I could do this? (I've been using SQLite for 14 
years and I still keep learning about more clever things it can do…)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick | Software Engineer | Scientific Games International GmbH | 
Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 
1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users