Hi Greg,

What I did with my Motion Chart software (
http://www.eshiftlog.com/Silverlight/MotionGraphTestPage.html) to get
better download performance was:
• Move away from small WCF data transfers to transferring a single large
encoded compressed text file
• Only transfer raw data (no JSON/XML structure, which adds a LOT OF FAT)
• Minor use of CSV format, otherwise fixed format
• Define my own number formats to reduce size (remove unneeded decimal
places)
• Use zip file to transfer data
This improved data load time by a factor of roughly 50-100 (sorry, no
hard numbers).
My data ended up being 430KB for ~32K rows, just over 13 bytes/row.
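
A minimal sketch of the "single zipped text file" idea, in Python rather than the Silverlight code I actually used. The function names, the entry name data.txt and the repeated sample block are made up for illustration; only the six data lines come from my file.

```python
import io
import zipfile


def pack_text(text: str, name: str = "data.txt") -> bytes:
    """Return a zip archive (as bytes) holding one deflated text entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, text.encode("utf-8"))
    return buf.getvalue()


def unpack_text(blob: bytes, name: str = "data.txt") -> str:
    """Read the single text entry back out of the archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.read(name).decode("utf-8")


# One country block from the example data, repeated only to get a payload
# big enough to be worth compressing (the repetition is synthetic).
block = (
    "C,007,Australia,Oceania,1820,2007\n"
    "3413340017010\n3413310017070\n3413290017280\n"
    "3413290017530\n3413320017950\n3413330018330\n"
)
sample = block * 200
blob = pack_text(sample)
print(len(sample), "bytes of text ->", len(blob), "bytes zipped")
```

The client makes one request, unzips once, and then parses the text in memory, instead of paying per-call overhead on many small transfers.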

Example data:
C,007,Australia,Oceania,1820,2007
3413340017010
3413310017070
3413290017280
3413290017530
3413320017950
3413330018330

As traditional CSV text, this would look like:
CountryID,Year,LifeExpect,Population,GDP,CountryName,RegionCode,RegionName
007,1820,34.1,0000334000,000701.0,Australia,4S,Oceania
007,1821,34.1,0000331000,000707.0,Australia,4S,Oceania
007,1822,34.1,0000329000,000728.0,Australia,4S,Oceania
007,1823,34.1,0000329000,000753.0,Australia,4S,Oceania
007,1824,34.1,0000332000,000795.0,Australia,4S,Oceania
007,1825,34.1,0000333000,000833.0,Australia,4S,Oceania

There are three row types in the file:
Lines beginning with "C" are CSV country header lines - Like:
  C,007,Australia,Oceania,1820,2007
The values being:
  - C: Header
  - 007: Country number
  - Australia: Country name
  - Oceania: Country region
  - 1820: First year there is data
  - 2007: Last year there is data

Lines starting with 0-9 are data for one individual year for the above
country
  - The year is assumed to increment for every detail line
  - These detail lines are always 13 digits wide, fixed width fields, no
field separator, like:
           341 334001 7010 (spaces added for clarity, not in actual file)
  - Life expectancy (x10), example: 341 = 34.1 years
  - Population (last digit is exponent multiplier) 334001 = 334,000; 334002
= 3,340,000.
    The last digit is effectively the number of zeros to add at the right
hand side.
  - GDP (per person, last digit is exponent multiplier) 7010 = $701; 7011
= $7,010.
     Again, the last digit is effectively the number of zeros to add at the
right hand side.
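
A rough Python sketch of a reader for the two row types described above (not my actual Silverlight code). The names Row, expand and parse are made up for illustration, and it assumes country names contain no commas. Lines that are neither "C" headers nor digit lines are skipped. The result can be checked against the traditional CSV rendering above, where 1820 shows GDP 000701.0.

```python
from typing import Iterable, Iterator, NamedTuple


class Row(NamedTuple):
    country_id: str
    country: str
    region: str
    year: int
    life_expectancy: float
    population: int
    gdp_per_person: int


def expand(field: str) -> int:
    """Mantissa digits followed by one exponent digit: '334001' -> 334000."""
    return int(field[:-1]) * 10 ** int(field[-1])


def parse(lines: Iterable[str]) -> Iterator[Row]:
    header = None
    year = 0
    for raw in lines:
        line = raw.strip()
        if line.startswith("C,"):
            parts = line.split(",")
            if len(parts) != 6:
                raise ValueError(f"bad header line: {line!r}")
            _, country_id, country, region, first_year, _last_year = parts
            header = (country_id, country, region)
            year = int(first_year)  # detail lines count up from here
        elif line[:1].isdigit():
            if header is None or len(line) != 13:
                raise ValueError(f"bad detail line: {line!r}")
            yield Row(
                *header,
                year,
                int(line[0:3]) / 10,   # life expectancy is stored x10
                expand(line[3:9]),     # population
                expand(line[9:13]),    # GDP per person
            )
            year += 1
        # anything else is ignored in this sketch


sample = """C,007,Australia,Oceania,1820,2007
3413340017010
3413310017070
3413290017280
3413290017530
3413320017950
3413330018330
"""
rows = list(parse(sample.splitlines()))
print(rows[0])
```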

You need to be careful with this technique: how much data can you afford to
“lose” to rounding?
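
To make the rounding loss concrete, here is a hedged Python sketch of the writer side of the mantissa-plus-exponent fields. The names encode and decode and the round-half-up rule are assumptions for illustration; my original code may have rounded differently.

```python
def encode(value: int, width: int) -> str:
    """Pack a non-negative int into (width - 1) mantissa digits + 1 exponent digit."""
    if value < 0:
        raise ValueError("no room for a sign in this format")
    digits = width - 1
    exponent = max(0, len(str(value)) - digits)
    scale = 10 ** exponent
    mantissa = (value + scale // 2) // scale  # round half up
    if mantissa >= 10 ** digits:              # rounding carried into an extra digit
        mantissa //= 10
        exponent += 1
    if exponent > 9:
        raise ValueError("value too large for a one-digit exponent")
    return f"{mantissa:0{digits}d}{exponent}"


def decode(field: str) -> int:
    """Inverse of encode, up to the digits that were rounded away."""
    return int(field[:-1]) * 10 ** int(field[-1])


# A GDP of 21,431 does not fit in three mantissa digits, so precision is lost.
original = 21431
packed = encode(original, 4)
restored = decode(packed)
print(original, "->", packed, "->", restored, "(lost", original - restored, ")")
```

With a 4-character field, anything needing more than three significant digits comes back rounded, so the field width is really a decision about acceptable error.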

You were looking for “getting the data across with the least suffering and
complexity”. My complexity was the continual refining towards simpler and
simpler data structures, which looked more and more like something from
a 1960s COBOL program, when storage was expensive and processing was slow.

In hindsight, I feel that I still sent more data down the wire than I needed
to. I could have taken one digit off the life expectancy, two digits off the
population and one digit off the GDP, saving another 4 bytes per row. I also
could have used base 64 numbers, which would have saved another ~4 bytes
per row. But the performance was fine with this structure, so I did no
more to cut it back.
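
One possible reading of "base 64 numbers", sketched in Python: write each field as positional base-64 digits rather than decimal digits. This is not the same thing as Base64 byte encoding, and the alphabet, field widths and function names below are assumptions for illustration. Two base-64 digits cover 0-4,095, three cover 0-262,143 and four cover 0-16,777,215, so the 3+6+4 decimal digits of a detail line fit in 2+4+3 = 9 characters.

```python
import string

# 64 symbols used as digits (the URL-safe Base64 alphabet, chosen arbitrarily).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"


def to_base64_digits(value: int, width: int) -> str:
    """Write value as exactly `width` base-64 digits, most significant first."""
    if value < 0 or value >= 64 ** width:
        raise ValueError(f"{value} does not fit in {width} base-64 digits")
    out = []
    for _ in range(width):
        value, rem = divmod(value, 64)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))


def from_base64_digits(text: str) -> int:
    value = 0
    for ch in text:
        value = value * 64 + ALPHABET.index(ch)
    return value


def pack_row(line: str) -> str:
    """13 decimal digits (3 + 6 + 4) -> 9 base-64 digits (2 + 4 + 3)."""
    return (
        to_base64_digits(int(line[0:3]), 2)
        + to_base64_digits(int(line[3:9]), 4)
        + to_base64_digits(int(line[9:13]), 3)
    )


def unpack_row(packed: str) -> str:
    """Restore the original 13-digit detail line."""
    return (
        f"{from_base64_digits(packed[0:2]):03d}"
        f"{from_base64_digits(packed[2:6]):06d}"
        f"{from_base64_digits(packed[6:9]):04d}"
    )


packed = pack_row("3413340017010")
print(packed, len(packed))
```

The cost is that the file is no longer readable by eye, which made debugging the decimal version much easier.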

WARNING: This worked fine with my specific, smallish, well-known data set. If
I were putting this out into customer land, I would allow for a wider range
of values.  For example, if we needed to express the values in
Indonesian Rupiah rather than US Dollars, the amounts would go up by a
factor of 10,000 and my values would no longer fit.  My values only work
for large positive numbers: there is no room for a negative sign in front of
the number or the exponent.

So you need to design a file format that will work for your specific
situation and data and keep an eye on it to make sure it stays working.

After having done all of this, I am tempted to see what the performance
would be like with just simple raw CSV. If I were going to re-code this
today, that is what I would start with.

Regards
Greg #2 Harris


On Tue, Aug 6, 2013 at 6:00 PM, Greg Keogh <g...@mira.net> wrote:

> Folks, I have to send several thousand database entities of different
> types to both a Silverlight 5 and WPF app for display in a grid. I can't
> "page" the data because it's all got to be loaded to allow a snappy
> response to filtering it. I'm fishing for ways of getting the data across
> with the least suffering and complexity ... don't forget that Silverlight
> is involved.
>
> Does a WCF service with http binding allow streaming? That would be the
> ideal technique if it comes out of the box and isn't too complex.
>
> I ran an experiment to convert ~6000 entities into XML and the size is a
> hefty 6MB (no surprise!), however Ionic.Zlib deflates it down to a 500KB
> buffer which transmits acceptably fast. I'm unhappy with my code to round
> trip the entities-to-XML as it's a bit messy and has special case logic to
> skip association properties.
>
> Then I thought of JSON, which I haven't needed to use before. Would the
> JSON libraries make round-tripping easier? Are the built-in Framework
> classes good enough, or would I need to use something like Newtonsoft? Can
> I control which properties are processed? Any general ideas would be
> welcome.
>
> Greg K
>
>
>
