Re: Spark for offline log processing/querying

2016-05-23 Thread Mat Schaffer
> tics queries will
> probably be much faster on ELK. If your queries are more interactive and
> not about batch processing then it does not make so much sense. I am not
> sure why you plan to use Presto.
>
> On 23 May 2016, at 07:28, Mat Schaffer <m...@schaffer.me> wrote:

Spark for offline log processing/querying

2016-05-22 Thread Mat Schaffer
I'm curious about trying to use Spark as a cheap/slow ELK (Elasticsearch, Logstash, Kibana) system. Thinking something like:
- instances rotate local logs
- copy rotated logs to S3 (s3://logs/region/grouping/instance/service/*.logs)
- Spark to convert from raw text logs to Parquet
- maybe presto to
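
For what it's worth, the S3 layout above already carries most of the metadata you would want as columns (or partitions) once the logs are in Parquet. Here is a rough sketch in plain Python of pulling those fields out of a key. The function name `parse_log_path`, the assumption that `logs` is the bucket name, and the example instance/service names are all hypothetical, just for illustration:

```python
from urllib.parse import urlparse


def parse_log_path(uri):
    """Split an S3 log URI into the fields implied by the layout
    s3://logs/region/grouping/instance/service/*.logs.

    Assumes the first component after s3:// is the bucket and that
    exactly four directory levels precede the file name.
    """
    parsed = urlparse(uri)
    parts = parsed.path.strip("/").split("/")
    if parsed.scheme != "s3" or len(parts) != 5:
        raise ValueError("unexpected log path: " + uri)
    region, grouping, instance, service, filename = parts
    return {
        "bucket": parsed.netloc,
        "region": region,
        "grouping": grouping,
        "instance": instance,
        "service": service,
        "filename": filename,
    }


if __name__ == "__main__":
    # Hypothetical key, following the layout from the message.
    print(parse_log_path("s3://logs/us-east-1/web/i-0abc/nginx/access.logs"))
```

Something along these lines could be applied to each input file name inside the conversion job, so the region/grouping/instance/service values end up next to every log line and queries can filter on them without re-reading the raw text.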