Hi Emily,

I'm not sure what your exact use case is for sending data to Heka.
I usually find it much easier to use JSON or a similar plain-text format
to send messages to Heka, unless you have tight throughput
requirements.
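
As a rough sketch of the JSON approach, a slice of metrics could be
encoded as one JSON document per metric before sending. The Metric struct
and its field names are made up for illustration; Heka does not require
this shape, and the decoder on the Heka side would need to match it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Metric is a hypothetical shape for one custom metric.
// These field names are an assumption, not something Heka mandates.
type Metric struct {
	Name      string  `json:"name"`
	Value     float64 `json:"value"`
	Timestamp int64   `json:"timestamp"` // Unix seconds
}

// encodeMetrics turns a slice of metrics into one JSON document per metric,
// so that each document can be sent to Heka as a separate message.
func encodeMetrics(metrics []Metric) ([][]byte, error) {
	out := make([][]byte, 0, len(metrics))
	for _, m := range metrics {
		b, err := json.Marshal(m)
		if err != nil {
			return nil, err
		}
		out = append(out, b)
	}
	return out, nil
}

func main() {
	metrics := []Metric{
		{Name: "cpu.load", Value: 1.5, Timestamp: 1452211200},
		{Name: "mem.used_mb", Value: 2048, Timestamp: 1452211200},
	}
	docs, err := encodeMetrics(metrics)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	// Print each document; a real client would send these to Heka instead.
	for _, d := range docs {
		fmt.Println(string(d))
	}
}
```

Keeping one metric per message makes the data easy to inspect by eye
while you are still learning how Heka routes things.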

In my tests I've seen throughput of ~1K messages/second (~10 Mbit/s) on
a c4.large instance on AWS using the stock Lua JSON decoder/encoder and
HTTP output/input.
If you expect lower throughput than that, you should probably look into
this approach -- at least until you get used to Heka and how it works.

Best regards,
Timur

On Fri, Jan 8, 2016 at 3:22 AM, Emily Gu <77.e...@gmail.com> wrote:

> This is working. Thanks!
>
> I'm confused about the two-instance part and a few other things.
>
> Yes, I need to send our custom data into Heka. I want to see whether I need
> to write my own custom Heka plugin or can leverage existing Heka plugins. My
> custom data is a slice of metrics that can be sent into Heka through TCP.
>
> Your suggestion is very much appreciated.
>
> Thanks,
> Emily
>
> On Thu, Jan 7, 2016 at 4:10 PM, Rob Miller <rmil...@mozilla.com> wrote:
>
>> From what I can tell (and it's not very clear), it looks like you've got
>> one Heka instance running that has only a TcpInput, nothing else. That will
>> accept data, but it's not going to do anything with that data.
>>
>> Then you've got a separate Heka config that contains no inputs, but only
>> a TcpOutput (pointing at the input that's specified in the other config)
>> and a FileOutput. These outputs might conceivably send data somewhere, but
>> there are no inputs, so it's not clear where that data would come from.
>>
>> Drop the TcpOutput altogether, and combine the TcpInput and the
>> FileOutput into a single config:
>>
>> [hekad]
>> maxprocs = 1
>> share_dir = "/Users/egu/heka/share/heka"
>>
>> [tcp_in:3242]
>> type = "TcpInput"
>> splitter = "HekaFramingSplitter"
>> decoder = "ProtobufDecoder"
>> address = ":3242"
>>
>> [tcp_heka_output_log]
>> type = "FileOutput"
>> message_matcher = "TRUE"
>> path = "/tmp/output.log"
>> perm = "664"
>> encoder = "tcp_heka_output_encoder"
>>
>> [tcp_heka_output_encoder]
>> type = "PayloadEncoder"
>> append_newlines = false
>>
>>
>> Once you've done that, you should be able to use `heka-inject` to send a
>> message into your running Heka:
>>
>> $ heka-inject -heka 127.0.0.1:3242 -payload "1212 this is just a test"
>>
>> If you want to send custom data in through that TcpInput, then you'll
>> have to switch to using a different splitter and a different decoder; the
>> default setup you're using will only know how to handle Heka protobuf
>> streams.
>>
>> -r
>>
>>
>>
>>
>> On 01/07/2016 03:48 PM, Emily Gu wrote:
>>
>>> Thank you both, Rob and David, very much!
>>>
>>> Not sure where I need to define "base_dir"?
>>>
>>> I'm going to write a Heka plugin to pass our metrics data into Heka.
>>>
>>> For now, I'm having a hard time seeing the data I send in
>>> programmatically over TCP through TcpInput show up in the output.log file.
>>> I don't see any output. The configs are:
>>>
>>> tcp_input.toml
>>> ============
>>>
>>> [hekad]
>>> maxprocs = 1
>>> share_dir = "/Users/egu/heka/share/heka"
>>>
>>> [tcp_in:3242]
>>> type = "TcpInput"
>>> splitter = "HekaFramingSplitter"
>>> decoder = "ProtobufDecoder"
>>> address = ":3242"
>>>
>>>
>>> tcp_output.toml
>>> ==============
>>>
>>> [hekad]
>>> maxprocs = 1
>>> share_dir = "/Users/egu/heka/share/heka"
>>>
>>> [tcp_out:3242]
>>> type = "TcpOutput"
>>> message_matcher = "TRUE"
>>> address = "127.0.0.1:3242"
>>>
>>> [tcp_heka_output_log]
>>> type = "FileOutput"
>>> message_matcher = "TRUE"
>>> path = "/tmp/output.log"
>>> perm = "664"
>>> encoder = "tcp_heka_output_encoder"
>>>
>>> [tcp_heka_output_encoder]
>>> type = "PayloadEncoder"
>>> append_newlines = false
>>>
>>>
>>> The client:
>>>
>>> package main
>>>
>>> import (
>>>     "fmt"
>>>
>>>     "github.com/mozilla-services/heka/client"
>>> )
>>>
>>> func main() {
>>>     message_bytes := []byte{100}
>>>
>>>     sender, err := client.NewNetworkSender("tcp", "127.0.0.1:3242")
>>>     if err != nil {
>>>         fmt.Println("Could not connect to", "127.0.0.1:3242")
>>>         return
>>>     }
>>>     fmt.Println("Connected")
>>>
>>>     var i int
>>>     for i = 0; i < 10; i++ {
>>>         fmt.Println("message byte:", string(message_bytes))
>>>         err = sender.SendMessage(message_bytes)
>>>         if err != nil {
>>>             break
>>>         }
>>>     }
>>>     fmt.Println("sent", i, "messages")
>>> }
>>>
>>>
>>>
>>> Please let me know what else I need to change.
>>>
>>> Thanks,
>>>
>>> Emily
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Jan 7, 2016 at 3:28 PM, David Birdsong <david.birds...@gmail.com>
>>> wrote:
>>>
>>>
>>>
>>>     On Thu, Jan 7, 2016 at 3:22 PM, Rob Miller <rmil...@mozilla.com> wrote:
>>>
>>>         On 01/07/2016 03:09 PM, Emily Gu wrote:
>>>
>>>             Thanks David for all the help! I'll give it a try.
>>>
>>>             Please bear with me, as there are some parts I still don't
>>>             understand.
>>>
>>>             1. Why do I have to run two Heka instances, one for input
>>>             and another for output?
>>>
>>>
>>>         Because if you send the output from a Heka instance back into
>>>         itself, then you're likely setting up an infinite loop of
>>>         traffic that will spin out of control.
>>>
>>>             2. Did you mean I need to specify different share_dirs in
>>>             the input and output TOML configs?
>>>
>>>
>>>         If you're running multiple Heka instances on a single machine,
>>>         it *should* be fine for them to use the same share_dir, which is
>>>         read-only. It's very important that each specifies a unique
>>>         base_dir, however, since that's used by Heka for internal
>>>         bookkeeping data. Two Hekas using the same base_dir is asking
>>>         for trouble.
>>>
>>>             3. Do I need both TcpOutput and FileOutput in order to see
>>>             messages inside an output file? What if I didn't specify
>>>             TcpOutput?
>>>
>>>
>>>         Um, TcpOutput sends output data over a TCP connection. It
>>>         expects that there is a listener on the other side which will
>>>         accept that TCP connection, and will know how to correctly
>>>         handle the data that Heka is sending over the TCP connection.
>>>
>>>         FileOutput sends data to a file on the local file system.
>>>
>>>         It's of course fine to specify a FileOutput without specifying a
>>>         TcpOutput.
>>>
>>>         -r
>>>
>>>
>>>     whoops, yes I meant base_dir for where heka writes various internal
>>>     state information to.
>>>
>>>     Emily,
>>>
>>>     Maybe you could share what data you're trying to read into heka and
>>>     what you would like to do with it and we could help get you going.
>>>
>>>     Heka is intended to be a uni-directional pipeline. It can read data in
>>>     from many places into various formats, aggregate into interesting
>>>     new formats, and finally emit data somewhere else.
>>>
>>>
>>>
>
> _______________________________________________
> Heka mailing list
> Heka@mozilla.org
> https://mail.mozilla.org/listinfo/heka
>
>