On 12 August 2015 at 18:16, Michael Trinkala <mtrink...@mozilla.com> wrote:

> Most looping requirements can be flattened out: i.e., alerting can be
> handled in the output plugins in your example and
> aggregation/sessionization etc can be handled in the inputs. As for
> sharing: things fall into place when you start thinking about it from a
> module level instead of an individual plugin level.
>
> The locations are configurable via 'output_path', so you can put the output
> files anywhere you want.
>

Good, I will set it somewhere on the /tmp/ partition, which is already
stored in RAM.
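
For reference, this is roughly how I would double-check that a given path is
really tmpfs-backed on Linux. It is only a sketch in Python: `filesystem_type`
is a helper name I made up, it expects text in /proc/mounts format, and the
path in the usage note is just an example, not a Hindsight default.

```python
import os


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the deepest mount point containing path.

    mounts_text is expected to look like /proc/mounts:
    "<device> <mount point> <fs type> <options> ..." on each line.
    Returns None if no mount point matches.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        # Compare with a trailing slash so "/tmp" does not match "/tmpfoo".
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same mount point win over an earlier one.
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type
```

On the device I would call it as
`filesystem_type("/tmp/hindsight", open("/proc/mounts").read())` and check
that the result is "tmpfs".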


> I have some plugins (like stdin, simple file, TCP, and pruning (cleans
> up the output files when everyone is done with them) inputs;
>

I'm interested in this pruning plugin: could you share it? Or don't you
think it would make sense for Hindsight to provide an option to
automatically clean up output files?
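
To make clear what I mean by cleaning up, here is a minimal sketch of the
logic, written in Python purely for illustration (a real plugin would be a Lua
sandbox, and this is not your pruning plugin). It assumes the rolled files are
named `<id>.log` as in my output directory, and that the lowest file id still
needed by any reader is already known: `min_id_in_use` is a hypothetical
parameter, I am not reading it from hindsight.cp here.

```python
import os
import re

# Matches rolled output files such as "0.log" or "17.log".
LOG_NAME = re.compile(r"^(\d+)\.log$")


def prune_rolled_logs(directory, min_id_in_use):
    """Delete <id>.log files whose id is below min_id_in_use.

    min_id_in_use is an assumption: the smallest file id that some reader
    still needs. Returns the removed file names, ordered by id.
    """
    candidates = []
    for name in os.listdir(directory):
        match = LOG_NAME.match(name)
        if match and int(match.group(1)) < min_id_in_use:
            candidates.append((int(match.group(1)), name))

    removed = []
    for _, name in sorted(candidates):
        os.remove(os.path.join(directory, name))
        removed.append(name)
    return removed
```

Files that do not match the `<id>.log` pattern (hindsight.cp, hindsight.tsv)
are left alone.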


> Heka protobuf and payload outputs, etc., but they haven't been committed yet).
> As for ProcessInput, the input sandbox has access to os.execute, so there
> won't be a generic version; you can just call what you want and handle its
> output directly. (Hindsight already supports run-once, polling, and
> continuous input plugins.)
>

Can you explain briefly how to configure input plugins for these 3 options
(run once, polling, or continuous)?


> The output files will grow until 'output_size' (defaults to 64MiB) before
> they are rolled (they are not deleted by default).
>

Ok. Would it make sense to introduce an option to delete them by default?


> I would not make it too small unless you need to prune really quickly.
> Generally I run with a config of 1GiB (on some of our systems that rolls
> several times a minute). Space permitting, I would just roll them after
> several minutes of what would be average data flow on your system.
>

I plan to use a RAM disk (tmpfs) to store these files (to avoid writing too
much to the SD card), so I would like them to stay small in order to
avoid eating too much RAM: does 5MB to 10MB seem reasonable to you?
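
To put numbers on it, here is a rough calculation based on your "several
minutes of average data flow" rule of thumb. It is only a sketch: the function
names are mine, the 2 KiB/s rate is a made-up figure rather than a measurement,
and the number of files kept per directory is an assumption since it depends on
how pruning is done.

```python
MIB = 1024 * 1024


def seconds_per_roll(output_size_bytes, avg_bytes_per_sec):
    """Time needed to fill one output file at a steady average data rate."""
    if avg_bytes_per_sec <= 0:
        raise ValueError("avg_bytes_per_sec must be positive")
    return output_size_bytes / avg_bytes_per_sec


def worst_case_ram_bytes(output_size_bytes, files_kept_per_dir, directories=2):
    """Upper bound on tmpfs usage for the rolled output files.

    directories defaults to 2 (input/ and analysis/ in my output directory);
    files_kept_per_dir is an assumption about how many files survive pruning.
    """
    return output_size_bytes * files_kept_per_dir * directories


if __name__ == "__main__":
    assumed_rate = 2 * 1024  # assumed average flow: 2 KiB/s
    for size_mib in (5, 10):
        size = size_mib * MIB
        minutes = seconds_per_roll(size, assumed_rate) / 60
        ram_mib = worst_case_ram_bytes(size, files_kept_per_dir=2) / MIB
        print(f"{size_mib} MiB: rolls every {minutes:.1f} min, up to {ram_mib:.0f} MiB of RAM")
```

At that assumed rate a 5 MiB file would take 2560 seconds (about 43 minutes)
to fill, so on my side the limiting factor is RAM rather than roll frequency.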

Also, do you plan to provide a Debian package for Hindsight to ease
installation?

Thanks for the great help!
Bruno


>
> Trink
>
>
>
>
>
> On Wed, Aug 12, 2015 at 1:55 AM, bruno binet <bruno.bi...@gmail.com>
> wrote:
>
>>
>>
>> On 12 August 2015 at 10:19, bruno binet <bruno.bi...@gmail.com> wrote:
>>
>>> Thanks for all this valuable information.
>>>
>>> On 11 August 2015 at 17:17, Michael Trinkala <mtrink...@mozilla.com>
>>> wrote:
>>>
>>>> There are a few intentional changes between Heka and Hindsight.
>>>> Looping messages in Heka has always been a bad idea so it was removed.
>>>>
>>>
>>> Personally I like the looping messages feature in Heka, as it is very
>>> flexible and could be useful for sharing ready-to-use plugins. It also
>>> supports processing messages at multiple ticker_interval values, which can
>>> be useful (alerting, aggregations).
>>>
>>>
>>>> There are a few API enhancements such as a protobuf stream reader and
>>>> writer.  Checkpoints are all managed by the Hindsight infrastructure (so
>>>> much of the burden is removed from the plugin writer, this also alters the
>>>> plugin API slightly).  The write_message hack for Go has been removed since
>>>> messages are immutable.  read_config now has access to all related sandbox
>>>> config options (standard and user defined). read_next_field is not
>>>> supported (this will also be removed from Heka in 0.11).
>>>>
>>>> In most cases you will find the Hindsight IOPS lower than Heka due to
>>>> the much more efficient checkpointing (btw Heka 0.11 is moving to a disk
>>>> buffer everywhere).
>>>>
>>>
>>> Great, that is good to know.
>>>
>>>
>>>> output_hl/input/* - contains the output from all of the input plugins
>>>> output_hl/analysis/* - contains the output from all of the analysis
>>>> plugins
>>>>
>>>
>> Are the above files always growing?
>> I suppose the output_limit configuration allows us to limit their size: what
>> are the implications if I limit their size to a few KB? Will it reduce
>> Hindsight performance?
>>
>>
>>>> hindsight.cp - is the checkpoint file for all I/O (inputs, analysis, and
>>>> output plugins)
>>>> hindsight.tsv - is the self monitoring performance stats
>>>>
>>>> These files are all mandatory.  They are the reason Hindsight has an at
>>>> least once delivery guarantee and they provide valuable insight on system
>>>> operation and performance.
>>>>
>>>
>>> If I don't need the delivery guarantee, do you think it could make sense to
>>> move these files to a ramdisk (tmpfs) partition in order to preserve the
>>> flash SD/USB card?
>>>
>>>
>>>> decode_message needs to be turned on for analysis and output plugins, I
>>>> will enable it.
>>>>
>>>
>>> Ok, thank you: this is now working as expected.
>>>
>>> Also, do you plan to implement some additional lua modules to help build
>>> input sandboxes similar to Heka input plugins (like the FilePollingInput,
>>> the ProcessInput, or the LogstreamerInput)?
>>>
>>>
>>>>
>>>> Trink
>>>>
>>>>
>>>> On Tue, Aug 11, 2015 at 1:54 AM, bruno binet <bruno.bi...@gmail.com>
>>>> wrote:
>>>>
>>>>> I see, so I need to investigate how I can merge my multiple lua
>>>>> sandbox filters into a single one.
>>>>>
>>>>> This makes me wonder whether there are other differences between
>>>>> Hindsight and Heka.
>>>>> Is the fact that one analysis plugin cannot consume the output of
>>>>> another analysis plugin the only difference between Hindsight analysis
>>>>> plugins and Heka filter sandbox plugins?
>>>>>
>>>>> Also I saw in another thread that Hindsight uses disk buffers at every
>>>>> stage, so there's only ever one message in memory at every step of the
>>>>> pipeline: does it mean Hindsight will write much more frequently to the
>>>>> disk than Heka? This may be an issue for me since we use a Raspberry Pi
>>>>> whose disk is an SD card or USB flash drive.
>>>>>
>>>>> I see that some data is written to the output_path (output_hl/
>>>>> directory in my case): can you explain what all these files are:
>>>>> $ tree output_hl/
>>>>> output_hl/
>>>>> |-- analysis
>>>>> |   `-- 0.log
>>>>> |-- hindsight.cp
>>>>> |-- hindsight.tsv
>>>>> `-- input
>>>>>     `-- 0.log
>>>>>
>>>>> Can we avoid generating all these files?
>>>>>
>>>>> Last question: I am unable to use the "read_next_field" or
>>>>> "decode_message" API functions from the output plugin, are they available?
>>>>>
>>>>> The following error is returned:
>>>>> 1439280624615780495 [error] output_plugins terminated:
>>>>> output/encode_metric.cfg msg: process_message()
>>>>> _hl/output/encode_metric.lua:16: attempt to call global 'read_next_field'
>>>>> (a nil value)
>>>>>
>>>>> or when I change my output plugin to use the decode_message api
>>>>> function:
>>>>> 1439282566555139226 [error] output_plugins terminated:
>>>>> output/encode_metric.cfg msg: process_message()
>>>>> _hl/output/encode_metric.lua:15: attempt to call global 'decode_message'
>>>>> (a nil value)
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Bruno
>>>>>
>>>>>
>>>>> On 10 August 2015 at 19:19, Michael Trinkala <mtrink...@mozilla.com>
>>>>> wrote:
>>>>>
>>>>>> There is no message looping in Hindsight (one analysis plugin cannot
>>>>>> consume the output of another analysis plugin).  In your example the
>>>>>> decoding should happen in the input.  Heka has Inputs, splitters, and
>>>>>> decoders (in Hindsight it is just an Input and common functionality can be
>>>>>> split into modules for code reuse).  This in general simplifies the
>>>>>> configuration, is easier to follow (since everything is in one place) and
>>>>>> has performance benefits as well.
>>>>>>
>>>>>> Trink
>>>>>>
>>>>>> On Mon, Aug 10, 2015 at 9:23 AM, bruno binet <bruno.bi...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Back from vacation, I'm now playing again with Hindsight on a
>>>>>>> Raspberry Pi.
>>>>>>> As reported on github
>>>>>>> https://github.com/trink/hindsight/issues/1#issuecomment-119593775
>>>>>>> the compilation now succeeds.
>>>>>>>
>>>>>>> So getting inspiration from the examples in the benchmarks
>>>>>>> directory, I tried to create a Hindsight configuration to use my own Lua
>>>>>>> sandboxes: I can successfully read data from UDP and use a filter to
>>>>>>> decode data, then I would like to use another filter to handle generated
>>>>>>> messages, but I can't get any message in the second filter. Does Hindsight
>>>>>>> support more than one filter like Heka?
>>>>>>>
>>>>>>> Here is the Hindsight configuration, Lua sandboxes and output
>>>>>>> directory generated by Hindsight:
>>>>>>> https://github.com/bbinet/hindsight_hl_test
>>>>>>>
>>>>>>> Do you see anything wrong? Am I using Hindsight correctly?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Bruno
>>>>>>>
>>>>>>> On 8 July 2015 at 09:44, bruno binet <bruno.bi...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Sure, I will try your branch and report possible new compilation
>>>>>>>> issues in github.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Bruno
>>>>>>>>
>>>>>>>> On 7 July 2015 at 18:26, Michael Trinkala <mtrink...@mozilla.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I changed the checkpoint id to an unsigned long long. Can you test
>>>>>>>>> out the branch and add any other compilation errors to the issue
>>>>>>>>> (closing out this email thread)?  I am also taking
>>>>>>>>> suggestions/recommendations for a CI build system that supports
>>>>>>>>> multiple platforms.  TravisCI adds almost no value since I am already
>>>>>>>>> building on a Debian based box.
>>>>>>>>>
>>>>>>>>> https://github.com/trink/hindsight/tree/issue_1
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Trink
>>>>>>>>>
>>>>>>>>> On Tue, Jul 7, 2015 at 8:21 AM, bruno binet <bruno.bi...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Ok, thanks.
>>>>>>>>>> And sorry, but I don't have a patch (don't know how to fix this
>>>>>>>>>> kind of compilation issue).
>>>>>>>>>>
>>>>>>>>>> On 7 July 2015 at 16:17, Michael Trinkala <mtrink...@mozilla.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yeah, I have only been building on Ubuntu and haven't done any
>>>>>>>>>>> cross platform clean-up.  Thanks for the build output; I will fix
>>>>>>>>>>> those errors (unless you already have a patch).
>>>>>>>>>>>
>>>>>>>>>>> Trink
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 7, 2015 at 5:57 AM, bruno binet <
>>>>>>>>>>> bruno.bi...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I now have some time to do a few tests with Hindsight, so I
>>>>>>>>>>>> tried to compile it on our targeted ARM platform (Raspberry Pi),
>>>>>>>>>>>> but I get the following error:
>>>>>>>>>>>>
>>>>>>>>>>>> root@hl-mc-9999-dev:~/hindsight/release# cmake -DCMAKE_BUILD_TYPE=release ..
>>>>>>>>>>>> -- The C compiler identification is GNU 4.7.2
>>>>>>>>>>>> -- The CXX compiler identification is GNU 4.7.2
>>>>>>>>>>>> -- Check for working C compiler: /usr/bin/gcc
>>>>>>>>>>>> -- Check for working C compiler: /usr/bin/gcc -- works
>>>>>>>>>>>> -- Detecting C compiler ABI info
>>>>>>>>>>>> -- Detecting C compiler ABI info - done
>>>>>>>>>>>> -- Detecting C compile features
>>>>>>>>>>>> -- Detecting C compile features - done
>>>>>>>>>>>> -- Check for working CXX compiler: /usr/bin/g++
>>>>>>>>>>>> -- Check for working CXX compiler: /usr/bin/g++ -- works
>>>>>>>>>>>> -- Detecting CXX compiler ABI info
>>>>>>>>>>>> -- Detecting CXX compiler ABI info - done
>>>>>>>>>>>> -- Detecting CXX compile features
>>>>>>>>>>>> -- Detecting CXX compile features - done
>>>>>>>>>>>> -- Found LUASANDBOX: /usr/local/lib/libluasandbox.so
>>>>>>>>>>>> -- Configuring done
>>>>>>>>>>>> -- Generating done
>>>>>>>>>>>> -- Build files have been written to: /root/hindsight/release
>>>>>>>>>>>>
>>>>>>>>>>>> root@hl-mc-9999-dev:~/hindsight/release# make
>>>>>>>>>>>> Scanning dependencies of target hindsight
>>>>>>>>>>>> [  2%] Building C object src/CMakeFiles/hindsight.dir/hindsight.c.o
>>>>>>>>>>>> [  4%] Building C object src/CMakeFiles/hindsight.dir/hs_analysis_plugins.c.o
>>>>>>>>>>>> [  6%] Building C object src/CMakeFiles/hindsight.dir/hs_checkpoint_reader.c.o
>>>>>>>>>>>> /root/hindsight/src/hs_checkpoint_reader.c: In function 'find_first_id':
>>>>>>>>>>>> /root/hindsight/src/hs_checkpoint_reader.c:46:3: error: large integer implicitly truncated to unsigned type [-Werror=overflow]
>>>>>>>>>>>> /root/hindsight/src/hs_checkpoint_reader.c:55:3: error: comparison is always false due to limited range of data type [-Werror=type-limits]
>>>>>>>>>>>> cc1: all warnings being treated as errors
>>>>>>>>>>>> src/CMakeFiles/hindsight.dir/build.make:100: recipe for target 'src/CMakeFiles/hindsight.dir/hs_checkpoint_reader.c.o' failed
>>>>>>>>>>>> make[2]: *** [src/CMakeFiles/hindsight.dir/hs_checkpoint_reader.c.o] Error 1
>>>>>>>>>>>> CMakeFiles/Makefile2:947: recipe for target 'src/CMakeFiles/hindsight.dir/all' failed
>>>>>>>>>>>> make[1]: *** [src/CMakeFiles/hindsight.dir/all] Error 2
>>>>>>>>>>>> Makefile:146: recipe for target 'all' failed
>>>>>>>>>>>> make: *** [all] Error 2
>>>>>>>>>>>>
>>>>>>>>>>>> Do you know what is going on here? I guess this is an issue
>>>>>>>>>>>> with the arm platform only?
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Bruno
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 10 June 2015 at 18:41, bruno binet <bruno.bi...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks a lot for your answers.
>>>>>>>>>>>>>
>>>>>>>>>>>>> And yes, I'm very interested in bootstrapping a first
>>>>>>>>>>>>> prototype of my own data pipeline based on Hindsight so that I
>>>>>>>>>>>>> can compare the performance on a Raspberry Pi.
>>>>>>>>>>>>> (here is the current state of our Heka-based data pipeline:
>>>>>>>>>>>>> https://bitbucket.org/helioslite/heka-hl-sandboxes)
>>>>>>>>>>>>> So it would be great if you can give me the first instructions
>>>>>>>>>>>>> on how to build and setup Hindsight.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>> Bruno
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10 June 2015 at 18:18, Michael Trinkala <
>>>>>>>>>>>>> mtrink...@mozilla.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> - It is usable and being actively developed with the intent
>>>>>>>>>>>>>> to move it into production later this year.
>>>>>>>>>>>>>> - We are currently running production data through it for
>>>>>>>>>>>>>> testing but it is not deployed in an official capacity.  It has
>>>>>>>>>>>>>> been very stable, but until a more robust set of tests has been
>>>>>>>>>>>>>> built out I will not consider it production ready.
>>>>>>>>>>>>>> - Yes, it can decode/encode Heka protobuf format
>>>>>>>>>>>>>> - Yes, the router/message matcher is complete.  The only
>>>>>>>>>>>>>> difference is that it supports Lua string pattern matching
>>>>>>>>>>>>>> instead of re2 regexp  (Heka 'Hostname =~ /^foo/' vs Hindsight
>>>>>>>>>>>>>> 'Hostname =~ "^foo"')
>>>>>>>>>>>>>> - Yes, but you would need a lua-socket input and output
>>>>>>>>>>>>>> sandbox (see benchmarks/hsr_run for related examples)
>>>>>>>>>>>>>> - No documentation yet, only examples in the benchmarks
>>>>>>>>>>>>>> directory.  I could have you bootstrapped in about 30 minutes
>>>>>>>>>>>>>> (and hopefully turn that into a getting started guide) if you are
>>>>>>>>>>>>>> interested.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Implementation wise the only missing piece is support for
>>>>>>>>>>>>>> dynamically loading plugins.  The actual code to accomplish it
>>>>>>>>>>>>>> is very small (just detecting files in the load directory and
>>>>>>>>>>>>>> moving them to the run directory) but ideally it would be fronted
>>>>>>>>>>>>>> by a web server and a GUI with access control and validation (a
>>>>>>>>>>>>>> much larger effort and actually a separate project).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Trink
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jun 10, 2015 at 8:15 AM, bruno binet <
>>>>>>>>>>>>>> bruno.bi...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I recently discovered the work pushed into the Hindsight
>>>>>>>>>>>>>>> repository (https://github.com/trink/hindsight) which seems
>>>>>>>>>>>>>>> to be a lightweight alternative to Heka, based on the Lua
>>>>>>>>>>>>>>> sandbox.
>>>>>>>>>>>>>>> The Hindsight vs Heka benchmarks are quite impressive.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm currently running Heka on a Raspberry Pi (not so
>>>>>>>>>>>>>>> powerful) and the load average quickly increases and exceeds 1
>>>>>>>>>>>>>>> when Heka is ingesting data, so Hindsight could be a good fit for
>>>>>>>>>>>>>>> us if it can perform better than Heka in terms of CPU cycles.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What is the current status of Hindsight? Is it just a
>>>>>>>>>>>>>>> temporary experiment or will it be maintained and actually used
>>>>>>>>>>>>>>> in production?
>>>>>>>>>>>>>>> Is it currently usable and stable?
>>>>>>>>>>>>>>> Is Hindsight able to decode and encode Heka protobuf format?
>>>>>>>>>>>>>>> Does Hindsight have a complete router implementation to
>>>>>>>>>>>>>>> dispatch messages to sandboxes like in Heka?
>>>>>>>>>>>>>>> My use case is basically to read raw text data from a UDP
>>>>>>>>>>>>>>> socket, parse the text with Lua patterns or lpeg, process the data
>>>>>>>>>>>>>>> through a few Lua sandbox filters, then write output messages both
>>>>>>>>>>>>>>> to a file (Heka protobuf format) and to an HTTP server (JSON
>>>>>>>>>>>>>>> format): can this be easily accomplished with Hindsight?
>>>>>>>>>>>>>>> Is there any documentation somewhere to get started with
>>>>>>>>>>>>>>> Hindsight?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Bruno
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Heka mailing list
>>>>>>>>>>>>>>> Heka@mozilla.org
>>>>>>>>>>>>>>> https://mail.mozilla.org/listinfo/heka
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
