Steve,

Thank you so much for the information! It is a huge help in understanding the results!
I am going to perform similar tests using the Java API.

Rob

On 2019/11/08 16:36:46, Steve Lawrence <slawre...@apache.org> wrote:
> Hi Rob,
>
> I don't think you are subscribed to the dev list. I'd recommend you
> subscribe so you don't miss any responses if someone forgets to reply all.
>
> Before answering your questions, I'll reiterate that the -N option says
> how many times to repeat the parse, and the -t option says how many
> threads to use. So the performance command will parse the test_file.csv
> file N times. If t is not 1, it will parallelize those N parses across
> t threads.
>
> Total parse time is just the wall clock time from when the first parse
> starts to when the last parse finishes. So in your example it took
> about 2.4 seconds to parse test_file.csv 100 times with 5 threads.
>
> The min rate is determined by finding the parse that took the longest of
> those N parses and calculating how fast you could parse if it always
> took that long. This is just 1 / longest_parse_time.
>
> The max rate is the same, but uses the shortest parse time.
>
> The average rate can really be thought of as throughput. It is
> calculated as number_of_parses / wall_clock_time. Usually, increasing
> the number of threads will slow down the max rate but increase the
> average rate, since more threads generally means more throughput
> (until it doesn't and makes things worse ;).
>
> Note that min and max rates are often very different (usually orders of
> magnitude) because the first bunch of parses take a while as the JVM
> does JIT compilation and optimization. So the max rate is what you are
> more likely to see in production once the JVM is warmed up and a bunch
> of parses have completed.
>
> And you can see this in your results. Parsing 200 files gave you a max
> rate of 107 files/second, but parsing 1000 gave you a max of 300. At
> some point, as you increase the number of files, the max rate will stop
> getting better, which essentially means the JVM is fully warm and
> optimized, and that's the fastest rate you'll get.
>
> As to whether your numbers are good or not, they don't seem particularly
> good. On my laptop, with the csv schema and a small csv file, at about
> -N 100,000 and a single thread I get to around 12000 files/sec, so
> many times faster than yours.
>
> Unfortunately, I don't have any suggestions specific to ARM. I will say
> that Daffodil can be pretty memory hungry, so usually giving more memory
> to the JVM helps performance. But the Daffodil CLI defaults to 1024MB,
> so you might not be able to bump it much more with your limited RAM.
>
> The other suggestion I have is to decrease the number of threads and see
> how things improve. Some libraries we use in Daffodil are not thread
> safe, so we have to use things like ThreadLocals, which likely incur
> some overhead and slow things down.
>
> - Steve
>
>
> On 11/8/19 10:45 AM, Rose, Rob P wrote:
> > All,
> >
> > I am trying to port the Apache Daffodil libraries onto a cross domain
> > guard that runs in a very small form factor.
> >
> > We have cross compiled OpenJDK 12 for the aarch64 (ARM processor) and
> > loaded it into memory.
> >
> > I have built the source using sbt (sbt daffodil-cli/stage) and loaded
> > the necessary jars into memory on the board.
> >
> > Here are some of the specifics of the hardware platform running on
> > this guard:
> >
> > · 2 GB DDR RAM
> >
> >   o Memory Management Unit (MMU) Page Tables used in this system are
> >     one-to-one mapping.
> >
> > · ARM Cortex A53 4 Core Processor
> >
> > Here are some of the specifics for the software components:
> >
> > · SELinux
> >
> > · Busybox
> >
> > Here are some of the performance numbers we are seeing from the
> > performance testing:
> >
> > *NOTE: These tests were run using the attached csv file and the
> > attached schema*
> >
> > # ./daffodil performance -s demo/csv.dfdl.xsd -N 100 -t 5 demo/test_file.csv
> >
> > total parse time (sec): 2.443824
> >
> > · What does the total parse time value mean?
> >
> > · How is it calculated?
> >
> > · Is this poor performance?
> >
> > min rate (files/sec): 1.535568
> >
> > · What is the min rate (files/sec)? What does this mean?
> >
> > max rate (files/sec): 29.460340
> >
> > · What is the max rate (files/sec)? What does this mean?
> >
> > avg rate (files/sec): 40.919485
> >
> > · What is the avg rate (files/sec)? What does this mean?
> >
> > · Do you have any suggestions on how to improve parse/unparse speed on
> >   an ARM processor?
> >
> > · Any suggestions are greatly appreciated!
> >
> > # ./daffodil performance -s demo/csv.dfdl.xsd -N 200 -t 5 demo/test_file.csv
> >
> > total parse time (sec): 3.175893
> >
> > min rate (files/sec): 1.520884
> >
> > max rate (files/sec): 107.223428
> >
> > avg rate (files/sec): 62.974409
> >
> > # ./daffodil performance -s demo/csv.dfdl.xsd -N 300 -t 5 demo/test_file.csv
> >
> > total parse time (sec): 3.656587
> >
> > min rate (files/sec): 1.551273
> >
> > max rate (files/sec): 180.155186
> >
> > avg rate (files/sec): 82.043712
> >
> > # ./daffodil performance -s demo/csv.dfdl.xsd -N 1000 -t 5 demo/test_file.csv
> >
> > total parse time (sec): 5.602554
> >
> > min rate (files/sec): 1.459977
> >
> > max rate (files/sec): 301.144046
> >
> > avg rate (files/sec): 178.490026
> >
> > Sincerely,
> >
> > Rob Rose
> >
> > Sr. Principal Software Engineer
> >
> > General Dynamics Mission Systems
> >
> > Office: 508-880-1866
> >
> > Cell: 508-341-5216
> >
> > /This message and/or attachments may include information subject to GD
> > Corporate Policies 07-103 and 07-105 and is intended to be accessed
> > only by authorized recipients. Use, storage and transmission are
> > governed by General Dynamics and its policies. Contractual restrictions
> > apply to third parties. Recipients should refer to the policies or
> > contract to determine proper handling. Unauthorized review, use,
> > disclosure or distribution is prohibited. If you are not an intended
> > recipient, please contact the sender and destroy all copies of the
> > original message./
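
For anyone reading this thread later, the three rates Steve describes can be sketched in a few lines of Python. This is only an illustration of the formulas as stated in his reply (1 / longest_parse_time, 1 / shortest_parse_time, number_of_parses / wall_clock_time); it is not taken from the Daffodil source. The names `Rates` and `summarize` and the per-parse timings in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Rates:
    min_rate: float  # files/sec if every parse took as long as the slowest one
    max_rate: float  # files/sec if every parse took as long as the fastest one
    avg_rate: float  # overall throughput: parses completed per wall-clock second


def summarize(parse_times_sec: Sequence[float], wall_clock_sec: float) -> Rates:
    """Compute min/max/avg rates from per-parse durations and total wall-clock time."""
    if not parse_times_sec:
        raise ValueError("need at least one parse time")
    if wall_clock_sec <= 0 or min(parse_times_sec) <= 0:
        raise ValueError("durations must be positive")
    return Rates(
        min_rate=1.0 / max(parse_times_sec),
        max_rate=1.0 / min(parse_times_sec),
        avg_rate=len(parse_times_sec) / wall_clock_sec,
    )


# Hypothetical run: 10 parses of 0.1 s each, spread over 5 threads,
# so the whole run finishes in about 0.2 s of wall-clock time.
rates = summarize([0.1] * 10, wall_clock_sec=0.2)
print(rates.min_rate, rates.max_rate, rates.avg_rate)  # 10.0 10.0 50.0

# The avg rate from the first run in this thread (-N 100, 2.443824 s total):
print(round(100 / 2.443824, 2))  # 40.92
```

The hypothetical run also suggests why the first reported result has an avg rate (40.9) above its max rate (29.5): min and max are derived from single parses, while the average counts the work of all five threads against one wall clock.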