Re: Timestamp/hstore query?

2021-04-15 Thread Mike Rylander
On Wed, Apr 14, 2021 at 5:33 PM Brent Wood  wrote:
>
> Thanks for your reply,
>
> The table is essentially:
> create table readings (timer timestamp primary key,
>                        readings hstore);
>
> the hstore comprises (sensor id, value) key/value pairs for readings
> taken at the time specified in the timestamp.
>
> eg:  "67" "-45.67436", "68" "176.5424" could be key/value pairs representing 
> latitude & longitude, with a timestamp in the timer column.
>
> There would be several lat/lon hstore pairs in a given minute, the query I 
> want would return the last one in the timeseries for that minute (for each 
> key).
>
> I don't think your examples will give me the separate hstore key-value pairs, 
> extracted as the last in the interval for each key & reassembled as an hstore 
> list in the result. The sensor id is the hstore key, as described above, not 
> a separate attribute. That said, the keys can be extracted from the hstore 
> much like a column, but I'm not sure that is the best approach.
>
> Treating each set of hstores in an interval as an array & extracting the last 
> elements may be viable. But I'm not sure how...
>

Hi Brent,

With the table structure like that, you'll need to project all the
sensor data onto each timestamp and then collapse it.  Something like
this:

  SELECT date_trunc('minute', timer), key,
         first_value(value) OVER (PARTITION BY key ORDER BY timer DESC)
    FROM (SELECT timer, (each(readings)).*
            FROM readings
           WHERE date_trunc('minute', timer) = '2021-04-15 08:10:00-04') x;

If your queries will always target a specific timestamp truncated to
the minute, you'll find an index on date_trunc('minute',timer) will be
useful, I think.  If your intervals are more complex than date_trunc()
can handle then you'll have to get more creative, but that's the basic
structure -- normalize the primary key, project the sensor data to
make it table-ish, then use regular SQL to pull out what you want.
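
For completeness, a minimal untested sketch along those lines, assuming
timer is a plain timestamp as in the CREATE TABLE above (index and alias
names are illustrative): an expression index to make the per-minute filter
cheap, plus a DISTINCT ON variant that keeps only the newest reading per
key and folds the result back into a single hstore row per minute.

  CREATE INDEX readings_minute_idx ON readings (date_trunc('minute', timer));
  -- works because timer is timestamp without time zone; a timestamptz
  -- column would need an IMMUTABLE wrapper function here

  SELECT date_trunc('minute', timer) AS minute,
         hstore(array_agg(key), array_agg(value)) AS last_readings
    FROM (SELECT DISTINCT ON (key) timer, key, value
            FROM (SELECT timer, (each(readings)).*
                    FROM readings
                   WHERE date_trunc('minute', timer) = '2021-04-15 08:10:00') s
           ORDER BY key, timer DESC) last_per_key
   GROUP BY 1;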

--
Mike Rylander
 | Research and Development Manager
 | Equinox Open Library Initiative
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  mi...@equinoxoli.org
 | web:  https://equinoxOLI.org

>
>
> On Thursday, April 15, 2021, 6:33:08 AM GMT+12, Michael Lewis 
>  wrote:
>
> If you share example schema and desired output (like a dummy table or even 
> pseudo code SQL), then I'm sure many people could help you. Right now, the 
> description of your desired result seems a bit unclear, at least to me.
>
>
> If you wanted to run this hourly for the last 1 hour, it sounds a bit like 
> you want this-
>
> select sensor_id,
>        date_trunc( 'minute', timestamptz_field_name_here ),
>        last( value_from_hstore ) over ( partition by sensor_id,
>          date_trunc( 'minute', timestamptz_field_name_here ) ) as last_value_recorded
> from data_table_here
> where timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()
> group by sensor_id, date_trunc( 'minute', timestamptz_field_name_here )
>
>
> You could also use the left join "where is null" pattern to check that a 
> prior record in the minute period of time does not exist for that same key. 
> Something like this-
>
> select d1.sensor_id,
>        date_trunc( 'minute', d1.timestamptz_field_name_here ),
>        d1.timestamptz_field_name_here as last_value_recorded
> from data_table_here as d1
> left join data_table_here as prior_d1
>        on prior_d1.sensor_id = d1.sensor_id
>        and prior_d1.timestamptz_field_name_here < d1.timestamptz_field_name_here
>        and prior_d1.timestamptz_field_name_here >= date_trunc( 'minute', d1.timestamptz_field_name_here )
> where d1.timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()
>   and prior_d1.sensor_id is null
>
>




Re: Replacing Apache Solr with Postgre Full Text Search?

2020-03-26 Thread Mike Rylander
On Thu, Mar 26, 2020 at 4:03 AM J2eeInside J2eeInside
 wrote:
>
> Hi Mike, and thanks for valuable answer!
> In short, you think PG Full Text Search can do the same as Apache Solr?
>

Can it?  I mean, it does today.  Whether it would for you depends on
your needs and how much effort you can afford to put into the stuff
that is /not/ the full text engine itself, like document normalizers
and search UIs.

There are trade-offs to be made when choosing any tool.  Solr is
great, and so is Lucene (Solr's heart), and so is Elastic Search.  For
that matter, Zebra is awesome for full text indexing, too.  Those all
make indexing a pile of documents easy.  But, none of those are great
as an authoritative data store, so for instance there will necessarily
be drift between your data and the Solr index requiring a full
refresh.  It's also hard to integrate non-document filtering
requirements like I have in my use case.  Both of those are important
to my use case, so PG's full text is my preference.

Solr also didn't exist (publicly) in 2004 when we started building Evergreen. :)

> P.S. I need to index .pdf, .html and MS Word .doc/.docx files; are there any
> constraints in Full Text Search regarding those file types?
>

It can't handle those without some help -- it indexes plain text only --
but you can extract the text using other tools.
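
A minimal sketch of that division of labor, with made-up table and column
names; the .pdf/.html/.docx extraction happens in the application (e.g.
with pdftotext or Apache Tika) before anything reaches Postgres:

  -- Illustrative only: the database stores already-extracted plain text.
  CREATE TABLE documents (
      id          bigserial PRIMARY KEY,
      source_file text NOT NULL,  -- original .pdf/.html/.docx path
      body        text NOT NULL   -- plain text produced by the external tool
  );

  INSERT INTO documents (source_file, body)
  VALUES ('report.pdf', 'text extracted from report.pdf ...');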

--
Mike Rylander
 | Executive Director
 | Equinox Open Library Initiative
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  mi...@equinoxinitiative.org
 | web:  http://equinoxinitiative.org

>
> On Wed, Mar 25, 2020 at 3:36 PM Mike Rylander  wrote:
>>
>> On Wed, Mar 25, 2020 at 8:37 AM J2eeInside J2eeInside
>>  wrote:
>> >
>> > Hi all,
>> >
>> > I hope someone  can help/suggest:
>> > I'm currently maintaining a project that uses Apache Solr/Lucene. To be
>> > honest, I would like to replace Solr with Postgres Full Text Search.
>> > However, there is a huge amount of documents involved - around 200GB.
>> > Wondering, can Postgres handle this efficiently?
>> > Does anyone have specific experience, and what should the infrastructure 
>> > look like?
>> >
>> > P.S. Not to be confused, Solr works just fine, I just wanted to
>> > eliminate one component from the whole system (if Full Text Search can
>> > replace Solr at all)
>>
>> I'm one of the core developers (and the primary developer of the
>> search subsystem) for the Evergreen ILS [1] (integrated library system
>> -- think book library, not software library).  We've been using PG's
>> full-text indexing infrastructure since day one, and I can say it is
>> definitely capable of handling pretty much anything you can throw at
>> it.
>>
>> Our indexing requirements are very complex and need to be very
>> configurable, and need to include a lot more than just "search and
>> rank a text column," so we've had to build a ton of infrastructure
>> around record (document) ingest, searching/filtering, linking, and
>> display.  If your indexing and search requirements are stable,
>> specific, and well-understood, it should be straightforward,
>> especially if you don't have to take into account non-document
>> attributes like physical location, availability, and arbitrary
>> real-time visibility rules like Evergreen does.
>>
>> As for scale, it's more about document count than total size.  There
>> are Evergreen libraries with several million records to search, and
>> with proper hardware and tuning everything works well.  Our main
>> performance issue has to do with all of the stuff outside the records
>> (documents) themselves that has to be taken into account during
>> search.  The core full-text search part of our queries is extremely
>> performant, and has only gotten better over the years.
>>
>> [1] http://evergreen-ils.org
>>
>> HTH,
>> --
>> Mike Rylander
>>  | Executive Director
>>  | Equinox Open Library Initiative
>>  | phone:  1-877-OPEN-ILS (673-6457)
>>  | email:  mi...@equinoxinitiative.org
>>  | web:  http://equinoxinitiative.org




Re: Replacing Apache Solr with Postgre Full Text Search?

2020-03-25 Thread Mike Rylander
On Wed, Mar 25, 2020 at 8:37 AM J2eeInside J2eeInside
 wrote:
>
> Hi all,
>
> I hope someone  can help/suggest:
> I'm currently maintaining a project that uses Apache Solr/Lucene. To be
> honest, I would like to replace Solr with Postgres Full Text Search. However,
> there is a huge amount of documents involved - around 200GB. Wondering, can
> Postgres handle this efficiently?
> Does anyone have specific experience, and what should the infrastructure look 
> like?
>
> P.S. Not to be confused, Solr works just fine, I just wanted to eliminate
> one component from the whole system (if Full Text Search can replace Solr at
> all)

I'm one of the core developers (and the primary developer of the
search subsystem) for the Evergreen ILS [1] (integrated library system
-- think book library, not software library).  We've been using PG's
full-text indexing infrastructure since day one, and I can say it is
definitely capable of handling pretty much anything you can throw at
it.

Our indexing requirements are very complex and need to be very
configurable, and need to include a lot more than just "search and
rank a text column," so we've had to build a ton of infrastructure
around record (document) ingest, searching/filtering, linking, and
display.  If your indexing and search requirements are stable,
specific, and well-understood, it should be straightforward,
especially if you don't have to take into account non-document
attributes like physical location, availability, and arbitrary
real-time visibility rules like Evergreen does.

As for scale, it's more about document count than total size.  There
are Evergreen libraries with several million records to search, and
with proper hardware and tuning everything works well.  Our main
performance issue has to do with all of the stuff outside the records
(documents) themselves that has to be taken into account during
search.  The core full-text search part of our queries is extremely
performant, and has only gotten better over the years.
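
To make that core part concrete, here is a minimal hypothetical sketch
(table, column, and index names are made up; Evergreen's real ingest is
far more elaborate):

  ALTER TABLE documents ADD COLUMN search_vector tsvector;
  UPDATE documents
     SET search_vector = to_tsvector('english', coalesce(body, ''));
  CREATE INDEX documents_fts_idx ON documents USING gin (search_vector);

  SELECT id, ts_rank(search_vector, query) AS rank
    FROM documents, to_tsquery('english', 'library & system') AS query
   WHERE search_vector @@ query
   ORDER BY rank DESC
   LIMIT 20;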

[1] http://evergreen-ils.org

HTH,
--
Mike Rylander
 | Executive Director
 | Equinox Open Library Initiative
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  mi...@equinoxinitiative.org
 | web:  http://equinoxinitiative.org




Re: Watching for view changes

2018-12-20 Thread Mike Rylander
On Thu, Dec 20, 2018 at 4:17 AM Mitar  wrote:
>
> Hi!
>
> I am trying to see how I could use NOTIFY/LISTEN to watch changes of a
> complicated SELECT query which spans multiple tables. Ideally, I would
> like to leave it to PostgreSQL to determine when some data (and which
> data) in the result of the SELECT query has changed. So I am thinking
> that creating a temporary view using that query could be a way, if
> only I could find a way to watch such a view for changes somehow.
>

If, as I assume, the FROM clause of the SELECT is essentially static,
just with varying WHERE conditions, you could just use an AFTER
INSERT/UPDATE/DELETE trigger on each of the tables involved to send a
NOTIFY whenever data that is exposed by the SELECT is changed.  You
can deliver a payload, such as the table name, primary key value and,
with a little work, even a list of fields that were modified as a JSON
blob, and let the application do whatever needs to be done to react
to the changes -- issue other queries, etc.
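
A minimal, untested sketch of that kind of trigger (the table name, the
channel name, and the assumption of a surrogate "id" primary key are all
illustrative):

  CREATE OR REPLACE FUNCTION notify_row_change() RETURNS trigger AS $$
  BEGIN
    PERFORM pg_notify(
      'row_change',                 -- channel the application LISTENs on
      json_build_object(
        'table', TG_TABLE_NAME,
        'op',    TG_OP,
        'id',    CASE WHEN TG_OP = 'DELETE' THEN OLD.id ELSE NEW.id END
      )::text
    );
    RETURN NULL;                    -- return value ignored for AFTER triggers
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER watched_table_notify
    AFTER INSERT OR UPDATE OR DELETE ON watched_table
    FOR EACH ROW EXECUTE PROCEDURE notify_row_change();

The application then LISTENs on row_change and re-runs (or incrementally
adjusts) its SELECT when a relevant payload arrives.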

Of course that depends on your application knowing when it's
appropriate to NOTIFY, or being able to handle spurious NOTIFYs.

HTH,

--
Mike Rylander
 | Executive Director
 | Equinox Open Library Initiative
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  mi...@equinoxinitiative.org
 | web:  http://equinoxinitiative.org

> But it seems this is not really possible. I looked into two mechanisms:
>
> - Logical replication. Instead of NOTIFY/LISTEN I could simply create
> a publication over a view and then subscribe to it. But it seems
> logical replication can be done only over base tables and not views.
> [1]
> - Using "after" trigger on the view to get notification when the view
> gets changed. I could even use  transition relations to have
> information what changed. But sadly it seems that this is possible
> only if there is also INSTEAD OF trigger on the view. But I would like
> to get notification when the view has changed because underlying
> tables have changed, and not because of an UPDATE query on the view
> itself. Moreover, I do not really need writable views. [2]
>
> So I wonder if I am missing anything. Is there some other best practice
> for getting notifications when the result of a query changes in real time?
> And information about what changed?
>
> How hard would it be to implement such triggers on a view, firing whenever
> the view changes? Is there a process for making a feature request?
>
> (Also, I have not really managed to get statement-level "after" triggers
> to run on a view at all, because if I rewrite a query with INSTEAD OF then
> the triggers on the underlying tables fire, not the view's. So I am not
> sure what the expected use there even is.)
>
> [1] 
> https://www.postgresql.org/docs/devel/logical-replication-restrictions.html
> [2] https://www.postgresql.org/docs/devel/trigger-definition.html
>
>
> Mitar
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m
>



Re: Select "todays" timestamps in an index friendly way

2018-10-23 Thread Mike Rylander
On Tue, Oct 23, 2018 at 5:38 AM Lutz Horn  wrote:
>
> Hi,
>
> I am looking for a way to select all timestamps that are "today" in an
> index friendly way. This select should not depend on the concrete value
> of "today".
>

Per TFM (table 8.13 at
https://www.postgresql.org/docs/10/static/datatype-datetime.html), you can
use the special input values 'today' and 'tomorrow':

SELECT * FROM t WHERE ts >= 'today'::timestamp AND ts < 'tomorrow'::timestamp;
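
For the index-friendly part, a plain btree index on the column is enough;
the 'today'/'tomorrow' literals resolve to ordinary timestamp constants,
so the range condition above can use it (index name is illustrative):

  CREATE INDEX t_ts_idx ON t (ts);

  EXPLAIN
  SELECT * FROM t WHERE ts >= 'today'::timestamp AND ts < 'tomorrow'::timestamp;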

HTH,

--
Mike Rylander
 | Executive Director
 | Equinox Open Library Initiative
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  mi...@equinoxinitiative.org
 | web:  http://equinoxinitiative.org



Change ON UPDATE behavior of fkey

2018-03-15 Thread Mike Rylander
I'm wanting to change the ON UPDATE behavior of several foreign keys.
I know this has been "asked and answered" in the past, and I have a
query that will generate one-statement-per-fkey DDL, but while
tailoring that query I noticed that pg_constraint's confupdtype could
just be changed to 'c' directly.

So, my question is, will updating pg_constraint "just work" (for, say,
new sessions, after forcing clients to reconnect) or would there be a
data risk?
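
For reference, the generated DDL would presumably be a drop-and-re-add per
foreign key, something like this sketch with invented table, column, and
constraint names (note the ADD revalidates existing rows, which a direct
catalog update would not):

  BEGIN;
  ALTER TABLE child DROP CONSTRAINT child_parent_id_fkey;
  ALTER TABLE child
    ADD CONSTRAINT child_parent_id_fkey
        FOREIGN KEY (parent_id) REFERENCES parent (id)
        ON UPDATE CASCADE;
  COMMIT;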

Thanks in advance,

--
Mike Rylander