Yes, this is a custom format. One option is to have fixed-length records, where each record (row) contains a fixed number of fixed-length fields. A field's meaning is determined by its position in the record. E.g.:

  Pos.  Meaning      Type
  ----  -----------  ----
  0     CustomerID   int
  4     OrderID      int
  8     ProductType  int
  12    Quantity     int
  16    ...
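For illustration, here is a minimal Java sketch of writing one such record. Only the field layout comes from the table above; the class name, the 16-byte record size, and the plain FileChannel handling are my assumptions (in the trading app I mentioned, the records went to a memory-mapped file instead):

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    /** Writes fixed-length binary records; every record is RECORD_SIZE bytes. */
    public class FixedLengthRecordWriter implements AutoCloseable {
        private static final int RECORD_SIZE = 16; // 4 ints x 4 bytes; extend for more fields
        private final FileChannel channel;
        private final ByteBuffer buffer = ByteBuffer.allocateDirect(RECORD_SIZE)
                .order(ByteOrder.nativeOrder()); // pick one byte order and stick to it

        public FixedLengthRecordWriter(Path file) throws IOException {
            this.channel = FileChannel.open(file,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                    StandardOpenOption.APPEND);
        }

        /** Field meaning is determined by position: offsets match the table above. */
        public void write(int customerId, int orderId, int productType, int quantity)
                throws IOException {
            buffer.clear();
            buffer.putInt(0, customerId);   // pos 0
            buffer.putInt(4, orderId);      // pos 4
            buffer.putInt(8, productType);  // pos 8
            buffer.putInt(12, quantity);    // pos 12
            channel.write(buffer);          // one sequential write per record
        }

        @Override
        public void close() throws IOException {
            channel.close();
        }
    }

Because every record is the same size, record i starts at byte offset i * RECORD_SIZE, so the file can be random-accessed for queries or bulk-loaded into a database later.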
Another idea (if you have sparse data, where many fields do not occur in many events) is to use a tag-value scheme; market data protocols often use this mechanism. (A quick sketch follows at the bottom of this mail, below the quoted thread.)

Binary formats are great for performance, but you'll find yourself creating additional tools to visualize the data. Usually not a lot of effort, but something to bear in mind. Self-describing, text-based formats like XML and JSON have the advantage of being human-readable, but they are hard to optimize.

Sent from my iPhone

> On 2015/06/04, at 0:41, Gary Gregory <[email protected]> wrote:
>
> Hi Remko:
>
> "length binary records"
>
> Is this a custom format?
>
> Gary
>
>> On Wed, Jun 3, 2015 at 7:35 AM, Remko Popma <[email protected]> wrote:
>> MBs per minute should not be a problem for a modern HDD.
>>
>> See http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html
>>
>> Disk seeks are expensive though, so you'll want to buffer your I/O. Big
>> sequential writes are relatively cheap.
>>
>> Log4j's RandomAccessFileAppender (optionally async to ensure bursts don't
>> slow down your app) should be sufficient.
>>
>> For your data structure you could consider fixed-length binary records. This
>> is blazingly fast.
>> In a low-latency trading app I was logging ~100MB/sec (slightly overdoing
>> it, ahem). This wrote fixed-length binary records to a memory-mapped file.
>> Worked very well.
>>
>> Sent from my iPhone
>>
>>> On 2015/06/03, at 17:21, Gary Gregory <[email protected]> wrote:
>>>
>>> I have a use case where I want an appender to write to local storage (a
>>> file); the file should be structured such that I can query it later (or
>>> load it into a database). Since I will log a lot of data very fast
>>> (possibly MBs a minute), I might or might not use log4j in async mode. But
>>> the bottom line is that I'd like to log to local structured storage without
>>> paying the cost of going through a database layer (NoSQL, JDBC) or a
>>> socket. This makes me wonder if I should create a CSV file appender...
>>> which would be easy enough.
>>>
>>> Does anyone here have experience with a use case like this? I'm not crazy
>>> about running MongoDB on the side just to gather logging, but maybe that's
>>> what I'll need to do...
>>>
>>> Thoughts?
>>>
>>> Gary
>>>
>>> --
>>> E-Mail: [email protected] | [email protected]
>>> Java Persistence with Hibernate, Second Edition
>>> JUnit in Action, Second Edition
>>> Spring Batch in Action
>>> Blog: http://garygregory.wordpress.com
>>> Home: http://garygregory.com/
>>> Tweet! http://twitter.com/GaryGregory
>
>
>
> --
> E-Mail: [email protected] | [email protected]
> Java Persistence with Hibernate, Second Edition
> JUnit in Action, Second Edition
> Spring Batch in Action
> Blog: http://garygregory.wordpress.com
> Home: http://garygregory.com/
> Tweet! http://twitter.com/GaryGregory
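As promised above, a minimal sketch of a tag-value layout for sparse events. Everything in it (the tag numbers, the count-prefixed framing, the class name) is made up for illustration; real market data protocols such as FIX define their own tag dictionaries:

    import java.nio.ByteBuffer;

    /**
     * Tag-value encoding sketch: each record is a field count followed by
     * (tag, value) pairs, and absent fields are simply not written.
     * Tag numbers here are hypothetical, chosen for illustration only.
     */
    public final class TagValueEncoder {
        public static final short TAG_CUSTOMER_ID = 1;
        public static final short TAG_ORDER_ID    = 2;
        public static final short TAG_QUANTITY    = 3;

        /** Appends one (tag, value) pair; call only for fields present in the event. */
        public static void putField(ByteBuffer buf, short tag, int value) {
            buf.putShort(tag);
            buf.putInt(value);
        }

        public static void main(String[] args) {
            ByteBuffer buf = ByteBuffer.allocate(64);
            buf.putShort((short) 2);            // number of fields in this event
            putField(buf, TAG_CUSTOMER_ID, 42); // sparse: only fields that occur
            putField(buf, TAG_QUANTITY, 7);     // OrderID absent, costs nothing
            buf.flip();                         // record is now ready to write
            System.out.println("record bytes: " + buf.remaining()); // 2 + 2*(2+4) = 14
        }
    }

The win over fixed-length records is that absent fields cost zero bytes; the cost is that readers must scan tag by tag instead of seeking to a fixed offset.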
