IAP Tools for D

2015-12-16 Thread Jakob Jenkov via Digitalmars-d-announce

Hi D Community,

I am currently working on a cloud project where we intend to 
reinvent a lot of the old, less-than-optimal technologies. Among 
the technologies we are working on is a new general purpose 
network protocol called IAP.


IAP comes with a general purpose binary data format called ION 
(IAP Object Notation). ION is similar to MessagePack and CBOR, 
but with a few additions. ION has a table mode which can be used 
to model tables (like CSV files) efficiently, and which can also 
be used in larger object graphs. Our early benchmarks of serialized 
size and performance look promising (tables can be down to 1/5 the 
size of JSON, and parsing can be up to 2x the speed of parsing CBOR).


ION can be used inside IAP, but also separately with HTTP 
and in data and log files.


We already have a working toolkit in Java (we have Java 
backgrounds), but since we really find D interesting, we would 
like to make a D toolkit too.


Since we are rather new to D, would anyone be interested in 
helping us out a bit with making such a library? We can probably do 
the coding ourselves, but might need some tips about how to package 
it nicely as a D library which can be used with Dub etc.


Re: IAP Tools for D

2015-12-16 Thread Jakob Jenkov via Digitalmars-d-announce
If you hop onto IRC #d Freenode, there may be somebody from time 
to time that can give you a hand. Or at worst help solve some 
of your problems.


Thanks!

Oh, I forgot to mention that the IAP Tools for D library will be 
open source, under the Apache 2 License.




Re: IAP Tools for D

2015-12-16 Thread Jakob Jenkov via Digitalmars-d-announce

Sounds like an interesting thing. I will lend a hand.


Great! We probably won't get started until January, as we have 
some documentation work to do on the Java library still, and some 
more systematic benchmarks to run etc. We will announce it here 
again when we get there.


A GitHub repo would suffice, right?



Re: IAP Tools for D

2015-12-19 Thread Jakob Jenkov via Digitalmars-d-announce
How does the performance of ION compare with Protocol Buffers 
(https://developers.google.com/protocol-buffers/?hl=en) and 
Apache Thrift ( https://thrift.apache.org/)?


That depends on what API you use, and how much "meta data" (e.g. 
class names and property names) you write in the serialized ION 
data. ION is quite flexible about how much meta you want to 
include.


If you remove property names and rely only on the sequence of 
fields, ION can write faster than Google Protocol Buffers. When 
reading, if you rely only on the sequence of fields, ION is a bit 
slower than Google Protocol Buffers. All in all I believe 
performance will be on-par with Google Protocol Buffers.
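
To illustrate the trade-off between writing property names and relying only on the sequence of fields, here is a minimal Python sketch. It is not ION's actual encoding: the record, the field names, and the byte layout (1-byte key length, key, 32-bit value) are all made up for illustration.

```python
import struct

# Hypothetical record: (property name, value) pairs, all 32-bit integers.
record = [("id", 42), ("qty", 3), ("stock", 17)]


def encode_named(fields):
    """Write each field as: 1-byte key length, key bytes, 4-byte value."""
    out = bytearray()
    for key, value in fields:
        key_bytes = key.encode("utf-8")
        out += bytes([len(key_bytes)]) + key_bytes + struct.pack("<i", value)
    return bytes(out)


def encode_positional(fields):
    """Write only the values; reader and writer agree on the field order."""
    return b"".join(struct.pack("<i", value) for _, value in fields)


def decode_positional(buf, names):
    """Rebuild the record from the agreed field order."""
    values = struct.unpack("<%di" % len(names), buf)
    return list(zip(names, values))


named = encode_named(record)
positional = encode_positional(record)
restored = decode_positional(positional, ["id", "qty", "stock"])
```

The positional form is smaller and has less to write, but the reader must already know the field order, which is the same flexibility/metadata trade-off described above.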


We have some benchmarks here:

http://tutorials.jenkov.com/iap/ion-performance-benchmarks.html

We still have a few minor optimizations to make and more 
benchmarks to run, but possibly also some validations to add, 
so the benchmarks on this page (for Java) are probably not too 
far off from the final numbers.


Regarding Apache Avro and Thrift, I looked at them today. It 
seems that Avro's encoding is similar to ION (and MessagePack and 
CBOR), although without e.g. tables. According to Thrift's own 
docs their binary encoding is not compact. For compact encoding 
it seems they refer to Protobuf.


ION has several advantages over Protobuf as a general purpose 
data format. ION is self describing, so you can iterate it 
without a schema. This means that you can do pretty fast 
arbitrary hierarchical navigation of an ION "file/message".
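
As a rough illustration of what "self describing" means here, the following Python sketch walks a toy type-length-value buffer without any schema. The tag numbers, the 1-byte length, and the helper names are assumptions made for this example; they are not the real ION field layout.

```python
# Toy type tags (hypothetical, not ION's real type codes).
INT, UTF8, OBJECT = 1, 2, 3


def field(tag, payload):
    """Build one field: 1-byte tag, 1-byte length, then the payload."""
    if len(payload) > 255:
        raise ValueError("toy format only supports payloads up to 255 bytes")
    return bytes([tag, len(payload)]) + payload


def walk(buf, start=0, end=None, depth=0):
    """Yield (depth, type name, value) for every field, using only the buffer."""
    end = len(buf) if end is None else end
    pos = start
    while pos < end:
        tag, length = buf[pos], buf[pos + 1]
        body_start = pos + 2
        body_end = body_start + length
        if tag == OBJECT:
            yield depth, "object", None
            # Nested fields live inside the object's payload.
            yield from walk(buf, body_start, body_end, depth + 1)
        elif tag == INT:
            yield depth, "int", int.from_bytes(buf[body_start:body_end], "big", signed=True)
        elif tag == UTF8:
            yield depth, "utf8", buf[body_start:body_end].decode("utf-8")
        else:
            raise ValueError("unknown tag %d" % tag)
        pos = body_end


person = field(UTF8, b"Jane") + field(INT, (30).to_bytes(1, "big", signed=True))
message = field(OBJECT, person) + field(INT, (7).to_bytes(1, "big", signed=True))
fields = list(walk(message))
```

Because every field carries its own type and length, the reader can discover the structure while iterating, which a schema-based format does not allow without the schema at hand.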


Protobuf's own docs say that Protobuf is not good for large 
amounts of raw bytes (e.g. files). ION is capable of modeling 
both raw binary data (e.g. files), JSON, XML and CSV efficiently. 
You could even convert ION to a restricted XML format, edit it in 
a text editor, and convert it back to ION (we have not 
implemented this yet, but we have planned it). We also believe 
that ION can support cyclic object graphs, but this is also not 
fully implemented and tested yet.


ION has a very compact encoding of arrays of objects in "Tables" 
which are similar to CSV files with only 1 header row, and N 
value rows. It is very common to transport arrays of objects over 
the network, e.g. N search results from a service. Thus ION 
tables are a major advantage. Tables can also be used inside 
object graphs where an object has 0..N children (in an array).
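
The idea of one header row plus N value rows can be sketched in Python as follows. JSON is used here only to make the size difference visible; this is not the ION table encoding, and the search-result fields and example URLs are invented for the illustration.

```python
import json

# Hypothetical search results: an array of objects with identical keys.
results = [
    {"title": "Result %d" % i, "url": "http://example.com/%d" % i, "score": i}
    for i in range(100)
]

# Array-of-objects form: the property names are repeated in every element.
as_objects = json.dumps(results, separators=(",", ":"))

# Table form: property names appear once in a header, followed by value rows.
columns = list(results[0])
table = {
    "columns": columns,
    "rows": [[row[name] for name in columns] for row in results],
}
as_table = json.dumps(table, separators=(",", ":"))

# The table form is lossless: the original objects can be rebuilt from it.
restored = [dict(zip(table["columns"], row)) for row in table["rows"]]
```

The saving grows with the number of rows and with the length of the property names, since each name is written once instead of N times.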


We have a comparison of ION to other data formats here:

http://tutorials.jenkov.com/iap/ion-vs-other-formats.html


Re: IAP Tools for D

2015-12-19 Thread Jakob Jenkov via Digitalmars-d-announce
How does the performance of ION compare with Protocol Buffers 
(https://developers.google.com/protocol-buffers/?hl=en) and 
Apache Thrift ( https://thrift.apache.org/)?


Oh - one final thing:

If you *really* want speed you should not parse ION into objects 
before using the data. Since ION is self describing, you can just 
navigate through it and find the data you need, and ignore the 
rest.


This should be faster than parsing the data into objects 
first. Especially if you parse an array of objects which may end 
up scattered all over the heap, and thus lead to cache misses. 
Accessing these objects directly in the message buffer might save 
you both the ION-to-object parse time, plus it might play better 
with the L1, L2 and L3 caches.
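
A minimal Python sketch of this "navigate, don't parse" approach is shown below. It assumes a toy layout of 1-byte tag, 1-byte length, and payload, which is an invention for this example and not the ION layout; the function names are hypothetical as well.

```python
# Toy type tags (hypothetical).
INT, UTF8 = 1, 2


def field(tag, payload):
    """Build one field: 1-byte tag, 1-byte length, then the payload."""
    if len(payload) > 255:
        raise ValueError("toy format only supports payloads up to 255 bytes")
    return bytes([tag, len(payload)]) + payload


def read_field_at(buf, index):
    """Return (tag, payload view) of the field at `index`.

    Earlier fields are skipped using only their length byte; their
    payloads are never decoded and no objects are created for them.
    """
    view = memoryview(buf)
    pos = 0
    for _ in range(index):
        pos += 2 + view[pos + 1]  # jump over header and payload
    tag, length = view[pos], view[pos + 1]
    return tag, view[pos + 2 : pos + 2 + length]


message = (
    field(UTF8, b"first")
    + field(UTF8, b"second")
    + field(INT, (1234).to_bytes(2, "big"))
)

tag, raw = read_field_at(message, 2)
value = int.from_bytes(raw, "big")
```

Only the one field that is needed gets decoded, and it is read straight out of the message buffer. Whether this actually beats full parsing depends on the access pattern, so it would need to be benchmarked.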


We have not yet benchmarked this, but we will before long. In 
this mode I expect the read+use time to be faster than Google 
Protocol Buffers.


Re: IAP Tools for D

2015-12-20 Thread Jakob Jenkov via Digitalmars-d-announce

I suggest to compare also against this [1].
The author, Kenton Varda, was the primary author of Protocol 
Buffers version 2, which is the version that Google released 
open source.


[1] https://capnproto.org


Will do - at some point. Writing proper benchmarks against other 
frameworks / encodings takes time though. That's why we have 
started with MessagePack, CBOR and Google Protocol Buffers.




Re: IAP Tools for D

2015-12-20 Thread Jakob Jenkov via Digitalmars-d-announce

I suggest to compare also against this [1].
The author, Kenton Varda, was the primary author of Protocol 
Buffers version 2, which is the version that Google released 
open source.


[1] https://capnproto.org



I just had a look at Cap'n Proto. From what I can see in the 
encoding spec, performance of ION will be comparable.


Cap'n Proto claims to be "infinitely faster" than Google Protocol 
Buffers, but that is only if you do not pack the CP data - in 
which case it will transfer slower over the network. CP solves 
that using packing - but then you are back to serialization / 
deserialization, and the original promise of being "infinitely 
faster" is gone.


Cap'n Proto also has the "problem" that its messages require an 
external schema. To iterate through a Cap'n Proto file / message 
you must already know what data is in it (the schema).


Some see this as an advantage, because it forces you to write a 
schema for your data structure, and you get slightly faster 
encoding / decoding time.


Others see this as a disadvantage because you now have to 
import schemas, or generate code, in order to read a serialized 
message. You cannot just step through it like you can with e.g. 
XML or JSON. I tend to be in this camp - although I am not blind 
to the arguments in favor of external schemas. Speed matters, but 
so does ease-of-use.


On a network protocol level I tend to disagree with the 
"distributed object" model. I know Cap'n Proto tries to explain 
why this model is not a problem with CP. However, fine grained 
communication between fine grained distributed objects *is* a 
performance killer in the long run, regardless of whether you 
"pipeline" requests.


ION is intended to be the message format for our IAP network 
protocol. IAP will be message oriented, so you can do one-way 
messaging, request-response, subscriptions (e.g. to a stream), 
pipelining, routing of messages via intermediate nodes etc.


Anyway, if you really want to use Cap'n Proto (or something 
else) over IAP (+ION) you can just nest a binary message inside 
an IAP message, and then parse it any way you like when it comes 
out.


Re: IAP Tools for D

2015-12-20 Thread Jakob Jenkov via Digitalmars-d-announce
On Sunday, 20 December 2015 at 19:16:19 UTC, David Nadlinger 
wrote:

On Sunday, 20 December 2015 at 01:16:46 UTC, Jakob Jenkov wrote:
According to Thrift's own docs their binary encoding is not 
compact. For compact encoding it seems they refer to Protobuf.


There seems to be a confusion of terminology here. Thrift has a 
"Binary" protocol, which is not compact in the sense that it 
consists of the data fields more or less blitted into a 
message. There is also a "Compact" protocol, which is also a 
binary format, but employs things like variable-length integers 
to reduce size –  similar to Protobuf.


 — David


Thanks for the clarification! I couldn't really make out from the 
Thrift website if they had their own compact protocol, or 
switched to Protobuf. But now you say that they do have their own 
compact protocol. Now I know that.
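
For reference, the variable-length integer idea mentioned above can be sketched in Python like this. It shows the common base-128 scheme (7 payload bits per byte, high bit set when more bytes follow) for unsigned integers; the function names are made up, and the exact wire details of Thrift's Compact protocol are not reproduced here.

```python
def encode_varint(n):
    """Encode a non-negative integer using 7 bits per byte, low bits first."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)  # last byte: high bit clear
            return bytes(out)


def decode_varint(buf, pos=0):
    """Decode one varint starting at `pos`; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


small = encode_varint(1)
medium = encode_varint(300)
decoded, next_pos = decode_varint(medium)
```

Small values take a single byte instead of a fixed 4 or 8, which is where the size reduction of a compact encoding comes from.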


Re: IAP Tools for D

2015-12-20 Thread Jakob Jenkov via Digitalmars-d-announce
The designers of HTTP would strongly argue that is a major 
thing HTTP got right, and is the feature primarily responsible 
for its huge success.


Then why is HTTP 2 moving away from it? And Web Sockets?
Clearly, having the choice between keeping state and not keeping
state is preferable to HTTP taking that choice away from you.

Lots of apps also spend quite an effort to mimic stateful 
communication on top of HTTP. Sessions? Authentication tokens? 
Cookies? Caching in the browser? HTML5 Local Storage?

No, HTTP did not get "stateless" right.


Your "fix-the-network" problem is definitely valid.

At this point we have mostly focused on ION - the binary object / 
message format for IAP. However, we have a pretty good idea about 
how IAP will work on a conceptual level.

IAP will have a set of "semantic protocols". Each semantic 
protocol can address its own area of concern. File exchange, time, 
RPC, distributed transactions, P2P, streaming etc.

You can also define your own semantic protocol to address exactly 
your specific situation (e.g. the Byzantine Generals Problem - 
distributed consensus).


Everything is not yet in place - but we will get there step by 
step.


Re: Three Cool Things about D

2015-12-23 Thread Jakob Jenkov via Digitalmars-d-announce
On Monday, 21 December 2015 at 17:28:51 UTC, Andrei Alexandrescu 
wrote:

https://www.reddit.com/r/programming/comments/3xq2ul/codedive_2015_talk_three_cool_things_about_d/

https://www.facebook.com/dlang.org/posts/1192267587453587

https://twitter.com/D_Programming/status/678989872367988741


Andrei



Interesting talk :-) Watched it while cooking.


Re: So You Want To Write Your Own Language

2015-12-24 Thread Jakob Jenkov via Digitalmars-d-announce
On Thursday, 24 December 2015 at 01:08:38 UTC, Walter Bright 
wrote:

This has resurfaced on Reddit:

https://www.reddit.com/r/programming/comments/3xya5v/so_you_want_to_write_your_own_language/



Hi Walter, interesting article. I guess it's like with 
entrepreneurship in general. It's a lot of work and lots of 
people will tell you that you don't have what it takes, it won't 
work, you can't beat the big guys etc. But, as you progress and 
they see the results, more and more of them change their "no" to 
"maybe", "hmm..." and "yes".


I am working on a cloud project where we will also need to 
implement a little language that can run inside our cloud. The 
constraints are quite different from a general purpose language 
in terms of compilation / interpretation time, memory usage etc. 
so the design will probably be different from that of e.g. D.


I am looking forward to this project. Yes, it's geeky, and yes, 
it will probably "suck" in the first versions - but eventually we 
will get there, and it will work just fine.


Re: So You Want To Write Your Own Language

2015-12-24 Thread Jakob Jenkov via Digitalmars-d-announce
On Thursday, 24 December 2015 at 16:37:29 UTC, Jacob Carlborg 
wrote:

On 24/12/15 02:08, Walter Bright wrote:

This has resurfaced on Reddit:

https://www.reddit.com/r/programming/comments/3xya5v/so_you_want_to_write_your_own_language/


In the comments, about the cluttered syntax. For the 
attributes, due to legacy reasons, it seems like D got all the 
defaults wrong. System instead of safe, mutable instead of 
immutable, not pure instead of pure and so on. We might not be 
able to get rid of any attributes but if some of these defaults 
were different perhaps it would not be necessary to use so many 
attributes all the time.


I know that many here don't agree but personally I think the 
language could have less syntax if it had AST macros. Some syntax 
that is built-in now could be moved to library code in the form 
of macros.



I think it depends a lot on your personal preference. For 
instance, I am always annoyed about immutable types being forced 
upon me (okay, they wouldn't be forced, but I'd have to work to 
get rid of them). I like mutable types.


Regarding the AST macros - I simply don't know enough about how 
that works in practice to have an opinion. Java doesn't have that 
stuff, so I don't know what I am missing :-)


Re: DLanguage IntelliJ plugin released

2015-12-25 Thread Jakob Jenkov via Digitalmars-d-announce

On Friday, 25 December 2015 at 10:41:26 UTC, Kingsley wrote:

Hi

I have released an initial attempt at an IntelliJ plugin for D 
to the jetbrains plugin repository.


It's DLanguage version 1.2

It has basic syntax highlighting, autocompletion with DCD, 
compile checking and linting with Dscanner, code formatting 
with Dfmt and navigation jump to classes and functions and dub 
support - recommend using dub


See the GitHub page screenshots for an idea

Enjoy :)

--Kingsley



Cool - I am a long time IntelliJ IDEA user :-) I will give it a 
try!