On 19.08.2016 at 11:21 Sameer Kumar wrote:
> On Fri, Aug 19, 2016 at 4:58 PM Thomas Güttler <[email protected]> wrote:
>> On 19.08.2016 at 09:42 John R Pierce wrote:
>>> On 8/19/2016 12:32 AM, Thomas Güttler wrote:
>>>> What do you think?
>>>
>>> I store most of my logs in flat text files, syslog style, and use grep for
>>> ad-hoc querying.
>>>
>>> 200K rows/day, that's 1.4 million/week, 6 million/month; pretty soon
>>> you're talking big tables.
>>>
>>> In fact that's several rows/second on a 24/7 basis.
>> There is no need to store them for more than 6 weeks in my current use case.
>> I think indexing in Postgres is much faster than grep.
>> And queries involving the JSON data are not possible with grep (or at least
>> very hard to type).
>> My concern is which DB (or indexing) to use ...
> How will you be using the logs? What kind of queries? What kind of searches?
> Correlating events and logs from various sources could be really easy with
> joins, count and summary operations.
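
For the count/summary part, here is a rough, hypothetical sketch of what I
picture, using the Django ORM since that is where I am leaning anyway. The Log
model and all field names below are invented for illustration:

# Hypothetical sketch only -- nothing like this exists yet. A minimal
# Django model for one log row, plus the kind of count/summary query
# you mention. All names (Log, host, timestamp, message, data) are
# invented.
from django.contrib.postgres.fields import JSONField  # Django 1.9/1.10
from django.db import models
from django.db.models import Count

class Log(models.Model):
    host = models.CharField(max_length=200, db_index=True)
    timestamp = models.DateTimeField(db_index=True)
    message = models.TextField()
    data = JSONField(default=dict, blank=True)  # the structured part

# e.g. in ./manage.py shell: events per host, busiest hosts first
per_host = (
    Log.objects.values('host')
    .annotate(events=Count('id'))
    .order_by('-events')
)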
Wishes grow with possibilities. First I want to do simple queries on hosts and
timestamps, then some simple substring matches.
Up to now no structured logging (the JSON column) gets created. But once it
gets filled, we will find use cases where we currently rely on ssh+grep.
For now we need no stemming or language support.
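
A hedged sketch of those first queries, reusing the hypothetical Log model
from the sketch above:

# Sketch only, assuming the hypothetical Log model sketched above.
from datetime import timedelta

from django.utils import timezone

from myapp.models import Log  # 'myapp' is a made-up app name

since = timezone.now() - timedelta(days=1)

# Simple host + timestamp query: the last 24 hours from one host.
recent = Log.objects.filter(host='app01', timestamp__gte=since)

# Simple substring match on the message text (a case-insensitive LIKE).
timeouts = recent.filter(message__icontains='timeout')

The substring match becomes a LIKE query, so a pg_trgm index might be worth
adding later if it gets slow, but at roughly 200K rows/day kept for six weeks
I would try the simple version first.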
> The kind of volume you are anticipating should be fine with Postgres, but
> before you really decide which one, you need to figure out what you would
> want to do with this data once it is in Postgres.
The goal is still a bit fuzzy: a better overview.
Thank you for your feedback ("The kind of volume you are anticipating should
be fine with Postgres").
I guess I will use Postgres, especially since the Django ORM supports JSON in
Postgres:
https://docs.djangoproject.com/en/1.10/ref/contrib/postgres/fields/#jsonfield
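
For the JSON column, the lookups described on that page would then look
roughly like this on the same hypothetical Log model; the 'level' and
'request_id' keys are invented:

# Sketch only: JSONField lookups on the hypothetical Log model.
from myapp.models import Log  # made-up app/model, see the sketch above

# Key lookup on the JSON column.
errors = Log.objects.filter(data__level='error')

# Containment lookup, which maps to the jsonb @> operator in PostgreSQL.
one_request = Log.objects.filter(data__contains={'request_id': 'abc123'})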
Regards,
Thomas Güttler
--
Thomas Guettler http://www.thomas-guettler.de/