(July 2013)
For the TL;DR crowd: If your programs are sending and receiving data structures, you are probably using high-level encapsulations - e.g. XML or JSON. There are, however, far more efficient solutions in terms of memory, CPU and network requirements; ASN.1 is one of them. Below you will find a short introduction to the why and how of serialization with ASN.1, as well as hands-on sessions with an open-source ASN.1 compiler implemented under the auspices of the European Space Agency.
(Do you already know about ASN.1 and/or XML/XSD/JSON? Then feel free to skip this section. Otherwise keep reading).
If you code for a living, you will inevitably end up in a position where two processes are communicating over a link. The link-layer technology itself is not important in this discussion: it can be sockets, or pipes, or whatever else you fancy. What is important is how you handle the problem of sending your data 'across the wire'.
In the simplest of cases, you are writing the code at both ends - coding both the server and the client, in the same language.
In that case, things appear easy enough - for example, assuming you write in C/C++ and send data over sockets, you can just send a memory dump of the message itself:
(how we would handle messages with pointers or references inside them is left as an exercise for the reader - and yes, ignore packet fragmentation for now).
Looks easy and safe enough - block until the data of the structure are read, and work with them.
Until it becomes clear that..
- the two machines may use different byte orders, so we have to byte-swap msgId to match the target CPU's endianness.
- the two sides may disagree on the sizeof of the structure - which would mean overwriting the stack in the recipient..
That's when the real 'fun' begins.. Unpredictable behaviour, based on whether our compilers emit code that detects overflows at run-time by checking buffer canaries ('magic' signatures at both ends of each buffer), or, far worse, emit debugging information in the executable that 'hides' the bug.. because we overwrite that information instead! - etc, etc..
Some of you may be thinking: 'Yeah, C/C++ causes this kind of mayhem; use a higher level language'. It could be argued, however, that there are a lot of markets where using low-level languages like C/C++ is mandatory. Embedded systems are a particularly good example - weak CPUs, little memory available..
But OK, I'll follow along - let's use Python instead:
(if you are thinking about it, for now ignore the fact that the recv() call may fetch only part of the serialized message because of packet fragmentation. For the purposes of this example, assume that we get the exact packet of data sent by the sender in that single recv() call).
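A minimal sketch of the idea, under a few assumptions of mine: socket.socketpair() stands in for a real client/server connection, the contents of myvariable are made up for illustration, and (as stated above) a single recv() is assumed to return the whole message.

```python
import pickle
import socket

# Two already-connected sockets in one process, standing in for
# the sender and the receiver at either end of the link.
sender, receiver = socket.socketpair()

# Made-up payload; any picklable Python object would do.
myvariable = {"msgId": 5, "flag": True, "values": [1, 2.5, "three"]}

# Sender side: serialize to bytes and push them down the link.
sender.sendall(pickle.dumps(myvariable))
sender.close()

# Receiver side: assume one recv() call delivers the complete message.
data = receiver.recv(4096)
received = pickle.loads(data)
receiver.close()

print(received == myvariable)
```

Neither side has to know anything about the layout of the bytes on the wire; both simply trust the pickle module to agree with itself.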
This is a definite improvement: Python comes pre-loaded with a generic serializer. We no longer have to care about what exists inside myvariable: integers, floats, strings, lists, dictionaries - they will all safely migrate across different platforms, different CPUs' endianness, different CPU word sizes, etc:
We have no clue about how pickle.dumps encodes things, but we don't care; as long as they decode fine on the other end, why should we?
Great.
But does this mean that we will code all our software in Python from now on?
I wish :‑)
As a developer, you will inevitably come to the position where you need to work with other people, who don't care for your preferred language.
Think about it..
How can you send your data structures across, to a program that is written in a completely different language?
Well, if you are really patient and have lots of resources to waste, you can design your own handmade encoding, and manually create encoders and decoders in all the languages you use - taking great pains to make sure the data are transferred correctly in all possible combinations of CPUs, platforms, etc.
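To get a feel for what 'handmade' means, here is a small sketch using Python's struct module. The three-field layout and the names encode_message / decode_message are my own, purely for illustration. The point is that byte order and field widths must be pinned down explicitly, and then re-implemented identically in every other language involved.

```python
import struct

# Hypothetical wire layout, fixed by hand:
#   msgId : unsigned 16-bit integer
#   flag  : 1 byte (0 or 1)
#   value : 64-bit IEEE 754 double
# ">" forces big-endian byte order and disables padding.
WIRE_FORMAT = ">H?d"

def encode_message(msg_id, flag, value):
    """Pack the three fields into the agreed-upon byte layout."""
    return struct.pack(WIRE_FORMAT, msg_id, flag, value)

def decode_message(data):
    """Unpack bytes produced by encode_message back into a tuple."""
    return struct.unpack(WIRE_FORMAT, data)

encoded = encode_message(258, True, 1.5)
print(len(encoded))                   # 11 bytes: 2 + 1 + 8
print(encoded[:2].hex())              # 0102 (big-endian 258)
print(struct.pack("<H", 258).hex())   # 0201 (same number, little-endian)
print(decode_message(encoded))        # (258, True, 1.5)
```

The last two hex dumps show the same number in the two byte orders - exactly the kind of mismatch a handmade decoder on the other side has to get right.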
Or you could use XML, JSON, or similar generic message representations. But that option too, comes at a price:
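One part of that price is easy to measure: size. The sketch below encodes the same made-up three-field message once as JSON text and once as a hand-packed binary record (the layout is my own assumption, not something any standard mandates), and compares the byte counts.

```python
import json
import struct

# Made-up message, for illustration only.
message = {"msgId": 258, "flag": True, "value": 1.5}

# Generic text representation: field names travel with every message.
as_json = json.dumps(message).encode("utf-8")

# Compact binary representation: 16-bit int, 1-byte bool, 64-bit double,
# big-endian, no padding. Both sides must already agree on this layout.
as_binary = struct.pack(">H?d", message["msgId"], message["flag"], message["value"])

print(len(as_json), len(as_binary))
print(json.loads(as_json) == message)
```

The text form repeats the field names in every message and has to be parsed character by character; the binary form carries only the values, at the cost of both sides having to know the layout in advance.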
Or..
You saw above that Python did a marvelous job of hiding the message encoding details by automating the handling of different types in the pickle module. Wouldn't it be fabulous if we had this kind of machinery across different languages?
Guess what - we do. Since the 1980's, in fact - it's called ASN.1.
The idea behind it is very simple: specify your exchanged message data types in a data description language:
The language uses simple constructs to describe data types [1]. SEQUENCEs are what you would call structs or records in other languages - and as you can see, they contain descriptions of their fields. The usual basic types are there: BOOLEAN, INTEGER, REAL, ENUMERATED, OCTET STRING, etc - and SEQUENCEs can contain not only them, but also other SEQUENCEs, or arrays (SEQUENCE OFs).
Once we have written our ASN.1 grammar, we then feed it to an ASN.1 compiler - a tool that reads the specification, and emits, in our desired target language(s), (a) the language-specific type declarations, and (b) two functions per type: an encoder and a decoder, that encode and decode type instances to/from bitstreams.
First, download the Data Modelling Tools; a tool suite that contains a free, open-source ASN.1 compiler developed under the supervision of the European Space Agency.
Installing it is easy:
The install script will tell you if you are missing any dependencies, and suggest installing them. It will also indicate what to do next:
I am using bash, so I follow the second path:
The ASN.1 compiler is up and running [2].
Here's what the ASN.1 compiler creates when it is invoked on our simple ASN.1 grammar:
So, a number of .c and .h files were generated, which GCC then successfully compiled.
The 'gateway' - the only file you need to care about - is sample.h. Remember in our description above, we said:
.. an ASN.1 compiler reads this specification, and emits, in our desired target languages, (a) the language-specific type declaration, and (b) two functions per type: an encoder and a decoder, that encode and decode type instances to/from bitstreams.
This is the type declaration that ASN1SCC generated for our message:
asn1SccSint is a typedef inside the Run-time library (asn1crt.h) - and is defined as a 64bit int. Similarly, flag is typedef-ed to bool. So, the ASN.1 compiler generated a semantically-equivalent transformation of the ASN.1 grammar, into our target language's declaration of the corresponding types.
We also said the ASN.1 compiler generates two functions - an encoder, and a decoder. And indeed:
pickle.dumps function)... then the ASN.1 compiler generates an error code definition:
..and it is this error code that will be stored inside pErrCode, if we violate the constraint. That is, if we call Encode with an invalid value inside the .msgId field of the val argument, the encoder will report this error code.
In case you missed it, or it wasn't clear enough:
In ASN.1, we can specify not only the field types, but also limits on their values - and have them automatically checked!
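To make this concrete, here is a toy Python analogue - emphatically not ASN1SCC's generated code. The range (0..10000), the error code value and the function names are all hypothetical. It also shows why constraints help with size: in unaligned PER (UPER), a range-constrained integer is sent as its offset from the lower bound, using only as many bits as the range requires.

```python
# Hypothetical constraint, standing in for something like
#   msgId INTEGER (0..10000)
MSGID_MIN, MSGID_MAX = 0, 10000

# Hypothetical error code, in the spirit of the generated definition.
ERR_MSGID_OUT_OF_RANGE = 1

def bits_needed(lo, hi):
    """Number of bits required to distinguish every value in lo..hi."""
    return (hi - lo).bit_length()

def encode_msg_id(value):
    """Return (ok, error_code, bit_string) for a constrained msgId."""
    if not (MSGID_MIN <= value <= MSGID_MAX):
        # Constraint violated: refuse to encode, report the error code.
        return False, ERR_MSGID_OUT_OF_RANGE, ""
    width = bits_needed(MSGID_MIN, MSGID_MAX)
    # Encode the offset from the lower bound in exactly `width` bits.
    return True, 0, format(value - MSGID_MIN, "0{}b".format(width))

def decode_msg_id(bit_string):
    """Inverse of encode_msg_id for a valid bit string."""
    return int(bit_string, 2) + MSGID_MIN

ok, err, bits = encode_msg_id(5)
print(ok, err, bits, len(bits))      # True 0 00000000000101 14
print(decode_msg_id(bits))           # 5
print(encode_msg_id(20000))          # (False, 1, '')
```

With this (assumed) range, the field occupies 14 bits on the wire instead of the 64 bits of the in-memory integer - and an out-of-range value never makes it into the stream.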
And that's the main idea. You can use this generated C code in your projects - it will just work. There are no external dependencies, no libraries to speak of, the code is there, open, for you to use as you please. Note that the encoders will properly handle all manner of potential mischief: endianness of the platform you compile it on, word sizes, etc. You can be sure that by using ASN.1, your encoded messages (that is, the representations inside the ByteStreams) can be sent to whatever platform you fancy, and they will decode fine, into the receiving platform's variables.
(Note: ASN1SCC is made specifically for embedded, safety-critical systems, so it only addresses ASN.1 grammars containing bounded (in size) messages. ASN.1 itself has no such limitation - e.g. you can model open-ended SEQUENCE OFs with it).
Actually, that's what ASN.1 was built for: to allow easy specification of all the messages that will be exchanged between your apps, regardless of their complexity. Here's a more advanced example, showing ENUMERATED types, nesting inside SEQUENCEs, etc:
The language is fairly simple, so you should be able to figure out what is going on. If not, you can study Olivier Dubuisson's freely available book for an extensive treatment of ASN.1 - all the way up to advanced features.
Processing this grammar with any ASN.1 compiler (including ASN1SCC), you would be all set to use the ready-made message definitions for AComplexMessage, TypeEnumerated, .. and their corresponding encoders/decoders.
In plain words - the complexity and the number of messages don't matter when you use ASN.1.
Yes, there are ASN.1 compilers for almost any language you can think of. ASN1SCC in particular, has been developed under the supervision of the European Space Agency, and it targets C and Ada, with specific emphasis on embedded, safety-critical systems - for which it does some pretty amazing things:
malloc is ever called; what would you do in space when your satellite runs out of heap? Blue screen? :‑)
Suffice to say, if you are involved with embedded development, it's worth taking a look.
Well, it depends on your definition of 'better'.
If you value optimal encoding/decoding performance, minimal encoded message size, guarantees of code safety, and minimal power requirements for encoding/decoding messages, then no, XML is most definitely NOT better. That's why your mobile phone has used ASN.1 encoding while you were reading this article. I am not kidding - almost every single signalling message that your phone sends to the local cell tower, is encoded via ASN.1!
If on the other hand..
.. then yes, XML/JSON may be a better match for you.
Let me repeat: If you care about optimal encoding/decoding performance, optimal memory use, ..
Remember, when we are speaking about ASN.1, we are looking at technology that was built by the Ancients. Being optimal wasn't a choice, back then - it was mandatory. You didn't have resources to waste. When you use ASN.1, you simply automate the parts of message marshalling that can be automated, without losing any performance or wasting any memory.
Do a low-level comparison of ASN.1 with any other technology that involves marshalling, and I guarantee you will be paying something at runtime: memory use, performance, or both.
ASN.1 comes with a set of predefined rules, that specify how encoding is done. You choose one when you invoke the ASN.1 compiler on your ASN.1 grammar - ASN1SCC, for example, currently supports four:
Note that choosing an encoding has zero impact on your type declarations - you can switch between encodings without changing anything in your user code, except the name of the encoding/decoding function you call: e.g. instead of the default encoding (UPER), where you call Message_Decode, you'd call Message_BER_Decode - etc.
ASN.1 basically allows your programs to communicate with implementations coded in other languages, by establishing common ground - through a simple [1] data definition language. If you use it, arbitrarily complex messages are handled easily (think of arrays containing unions containing structs - or, in ASN.1 parlance, SEQUENCE OFs containing CHOICEs containing SEQUENCEs). You don't ever have to implement any serializers/deserializers of the messages, and provided you use a compiler like ASN1SCC, you also get guarantees of correctness, type safety and performance - for free.
Equally important - and this is a matter for another blog post, but consider this a teaser - by using ASN.1 to define your messages, you can then automatically create many things that depend on the message definitions: in the case of the work I've done for the European Space Agency, we've built automatic translators of ASN.1 messages towards..
Code generated by modelling tools: building automatic 'translators' of code generated by modelling tools to/from ASN.1, we can then use ASN.1 as the center of a star-formation, and have code generated by one modelling tool automatically 'speak' to code generated by another, at runtime:
SQL databases: we can automatically store and retrieve arbitrarily complex ASN.1 messages inside automatically constructed databases (no, not via BLOBs - with tables that perfectly mirror the relationships between types using foreign keys):
Automatically generated graphical user interfaces receiving and sending TM/TCs (Telemetry and Telecommands) to and from our satellites.
Automatically generated graphical message tracers to allow for graphical monitoring of what happens to a running system - and why.
Automatically generated Interface Control Documents (ICDs) that describe the binary representation of the stream - for those people, that for whatever reason, choose to write encoders/decoders on their own:
..and loads more. Visit this for a 5 min showcase, or the official site for more info. You can also peruse the manual and see how ASN.1 (and AADL) allowed us to describe complex real-time embedded systems and their messages, in succinct ways that generate optimal code.
The following is a hands-on example of how our SWIG-based Python mapper - bundled in our DMT tools - wraps around the UPER C functions generated by ASN1SCC, and allows Python code to speak with ASN.1 UPER encoded data:
.. which then gives you a full API over your ASN.1 types, through Python classes:
The code shows a full round-trip that passes from structure, to byte buffer, and back to structure.
Similarly, here's our SQL mapper, automatically generating the SQL schema for storing/retrieving our messages:
As you can see, the transformation is also converting ASN.1 constraints to SQL constraints - and more importantly, works regardless of the complexity of the message.
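The constraint-mapping idea can be tried out with Python's built-in sqlite3 module. To be clear, this is my own hand-written illustration, not the output of the SQL mapper: the table, the column names and the (0..10000) range are assumptions. The ASN.1 range constraint simply becomes a CHECK constraint, so the database itself rejects invalid messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table for a message whose msgId is constrained to 0..10000;
# the ASN.1 range is mirrored as an SQL CHECK constraint.
conn.execute(
    """
    CREATE TABLE Message (
        id    INTEGER PRIMARY KEY,
        msgId INTEGER NOT NULL CHECK (msgId BETWEEN 0 AND 10000)
    )
    """
)

# A value inside the range is stored normally.
conn.execute("INSERT INTO Message (msgId) VALUES (?)", (5,))

# A value outside the range is refused by the database engine.
try:
    conn.execute("INSERT INTO Message (msgId) VALUES (?)", (20000,))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT COUNT(*) FROM Message").fetchone()[0]
print(rejected, stored)
```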
Putting it simply: ASN.1 is another technology that is optimal for certain problem domains - and yet people will ignore it and pay the penalty in performance, memory and robustness.
By modelling your system's messages, ASN.1 also allows for lots of automatic code generation. In our case, we identified tremendous opportunities for automation, and have made a number of ASN.1-based code generators, that, among other things, automatically..
Fellow developers, have a look. You may find out ASN.1 can make your work easier, simpler, and more efficient.
If you are unfortunate enough to come into contact with lots of Telecom standards, you will often see how a simple and useful idea can become ridiculously complex as feature creep takes hold. To whatever extent you can, resist this - in the case of ASN.1, I humbly suggest that you use only the basic principles: type specification and constraints.
The compiler is an F# application, so it can run (via Mono) under Linux and OS X, or natively under Windows (to put it simply: it works on all major platforms). Why it is written in F# is a matter for another blog post - suffice to say, OCaml (the mother of F#) is a language with a strong type system that prevents many potential issues, detecting them at compile-time. It is very important for any code generator (and an ASN.1 compiler is exactly that!) to detect as many errors as possible at compile-time.
Since ASN1SCC targets embedded platforms, memory is an issue (we don't want to allocate stuff from the heap, since the heap may run out - when you're in space, what can you do if you run out of heap?..). The compiler therefore emits #defines that allow us to reserve the necessary memory during compile-time:
Last update on: Sun Feb 21 12:56:49 2016