As rustpkg is still in its infancy, most Rust code tends to be built with make, other tools, or by hand. I've been working on updating Servo's build system to something faster and more reliable, and so I've been giving a lot of thought to build tooling for Rust.
In this post, I want to cover the current issues with building Rust code, especially with regard to external tooling. I'll also describe some recent work I did to address these issues. In the future, I want to cover specific ways to integrate Rust with a few different build tools.

Current Issues
Building Rust with existing build tools is a little difficult at the moment. The main issues are related to Rust's attempt to be a better systems language than the existing options.
For example, Rust uses a larger compilation unit than C and C++ compilers, while existing build tools are designed around single-file compilation. Rust libraries are output with unpredictable names. And dependency information must be gathered by hand.

Compilation Unit
Many programming languages compile one source file to one output file and then collect the results into some final product. In C, you compile .c files to .o files, then archive or link them into .lib, .a, .dylib, and so on depending on the platform and whether you are building an executable, static library, or shared library. Even Java compiles .java inputs to one or more .class outputs, which are then normally packaged into a .jar.
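For example, a typical C build (sketched here with made-up file names) looks like this:

```sh
# One .c file in, one .o file out...
cc -c point.c -o point.o
cc -c rect.c -o rect.o

# ...then the per-file results are combined into a final product.
cc point.o rect.o -o demo
```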
In Rust, the unit of compilation is the crate, which is a collection of modules and items. A crate may consist of a single source file or an arbitrary number of them in some directory hierarchy, but its output is a single executable or library.
Using crates as the compilation unit makes sense from a compiler point of view, as the compiler has more knowledge to work from during compilation. It also makes sense from a versioning point of view, as all of the crate's contents go together. Compiling whole crates allows cyclic dependencies between modules in the same crate, which is useful for expressing some things. It also means that separate declaration and implementation pieces are not needed, such as the header files of C and C++.
Most build tools assume a model similar to that of a typical C compiler. For example, make has pattern rules that can map an input to an output based on filename transformations. These work great if one input produces one output, but they don't work well in other cases.
Rust still has a main input file, the one you pass to the compiler, so this difference doesn't have a lot of ramifications when using existing build tools.

Output Names
Compilers generally have an option for what to name their output files, or else they derive the output name with some simple formula. C compilers use the -o option to name the output; Java just names the files after the classes they contain. Rust also has a -o option, which works like you expect, except in the case of libraries where it is ignored.
Libraries in Rust are special in order to avoid naming collisions. Since libraries often end up stored centrally, only one library can have a given name. If I create a library called libgeom it will conflict with someone else's libgeom. Operating systems and distributions end up resolving these conflicts by changing the names slightly, but it's a huge annoyance. To avoid collisions, Rust includes a unique identifier called the crate hash in the name. Now my Rust library libgeom-f32ab99 doesn't conflict with libgeom-00a9edc.
Unfortunately, the current Rust compiler computes the crate hash by hashing the link metadata, such as name and version, along with the link metadata of its dependencies. This results in a crate hash that only the Rust compiler is realistically able to compute, making it seem pseudo-random. This causes a huge problem for build tooling, as the output filename for libraries is unknown.
To work around this problem when using make, the Rust and Servo build systems use a dummy target called libfoo.dummy for a library called foo: after running rustc to build the library, the rule touches libfoo.dummy so that make has a well-known output to reason about. This workaround is a bit messy and pollutes the build files.
Here's an example of what a Makefile looks like with this .dummy workaround:

```make
RUSTC ?= rustc
SOURCES = $(shell find . -name '*.rs')

all: librust-geom.dummy

librust-geom.dummy: lib.rs $(SOURCES)
	@$(RUSTC) --lib $<
	@touch $@

clean:
	@rm -f *.dummy *.so *.dylib *.dll
```
While this works, it also has some drawbacks. For example, if you edit a file during a long compile, the libfoo.dummy file will get updated after the compile is finished, and rerunning the build won't detect any changes: the timestamp of the input file will be older than the final output file that the build tool is checking. If the build system knew the real output file name, it could compare the correct timestamps, but that information has been locked inside the Rust compiler.

Dependency Information
Build systems need to be reliable. When you edit a file, it should trigger the correct things to get rebuilt. If nothing changes, nothing should get rebuilt. It's extremely frustrating if you edit a file, rebuild the library, and find that your code changes aren't reflected in the new output for some reason or that the library is not rebuilt at all. Reliable builds need accurate dependency information in order to accomplish this.
There's currently no way for external build tools to get dependency information about Rust crates. This means that developers tend to list dependencies by hand, which is fragile.
One quick way to approximate dependency info is to recursively find every *.rs file in the crate's source directory. This can be wrong for multiple reasons: the include! or include_str! macros may pull in files that aren't named *.rs, and conditional compilation may omit several files.
This is similar to dealing with header dependencies by hand when working with C and C++ code. C compilers have options to generate dependency info to deal with this, which are used by tools like CMake.
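For instance, gcc can emit that information as a side effect of compilation (a minimal illustration with made-up file names):

```sh
# -MMD writes point.d next to the object file, listing point.c plus
# every non-system header it includes, in make-compatible syntax.
gcc -MMD -c point.c -o point.o
```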
The price of inaccurate or missing dependency info is an unreliable build and a frustrated developer. If you find yourself reaching for make clean, you're probably suffering from this.

Making It Better
It's possible to solve these problems without sacrificing the things we want or falling back to doing exactly what C compilers do. By making the output file knowable and by handling dependencies automatically, we make build tool integration easy and the resulting builds reliable. This is exactly what I've been working on for the last few weeks.

Stable and Computable Hashes
The first thing we need is to make the crate hash stable and easily computable by external tools. Internally, the Rust compiler uses SipHash to compute the crate hash, and takes into account arbitrary link metadata as well as the link metadata of its dependencies. SipHash is not something easily computed from a Makefile and the link metadata is not so easy to slurp and normalize from some dependency graph.
I've just landed a pull request that replaces the link metadata with a package identifier, which is a crate level attribute called pkgid. You declare it like #[pkgid="github.com/mozilla-servo/rust-geom#0.1"]; at the top of your lib.rs. The first part, github.com/mozilla-servo, is a path, which serves as both a namespace for your crate and a location hint as to where it can be obtained (for use by rustpkg for example). Then comes the crate's name, rust-geom. Following that is the version identifier 0.1. If no pkgid attribute is provided, one is inferred with an empty path, a 0.0 version, and a name based on the name of the input file.
To generate a crate hash, we take the SHA256 digest of the pkgid attribute. SHA256 is readily available in most languages or on the command line, and the pkgid attribute is very easy to find by running a regular expression over the main input file. The first eight digits of this hash are used for the filename, but the full hash is stored in the crate metadata and used as part of the symbol hashes.
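For example, a shell script could recover the filename hash like this (a rough sketch: it assumes the pkgid attribute sits on one line in the simple form shown earlier, and that shasum is available):

```sh
# Pull the pkgid string out of the crate's main source file,
# e.g. github.com/mozilla-servo/rust-geom#0.1
pkgid=$(sed -n 's/^#\[pkgid="\(.*\)"\];.*/\1/p' lib.rs)

# The crate hash is the SHA-256 of the pkgid; the first eight
# hex digits of it appear in the library filename.
hash=$(printf '%s' "$pkgid" | shasum -a 256 | cut -c1-8)
echo "expecting librust-geom-$hash-0.1.dylib"
```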
Since the crate hash no longer depends on the crate's dependencies, it is stable so long as the pkgid attribute doesn't change. This should happen very infrequently, for instance when the library changes versions.
This makes the crate hash computable by pretty much any build tool you can find, and means rustc generates predictable output filenames for libraries.

Dependency Management
I've also got a pull request, which should land soon, that enables rustc to output make-compatible dependency information, similar to gcc's -MMD flag. To use it, you pass rustc the --dep-info option; for an input file of lib.rs, it will create a lib.d file which make or other tools can use to learn the true dependencies.
The lib.d file will look something like this:

```make
librust-geom-da91df73-0.0.dylib: lib.rs matrix.rs matrix2d.rs point.rs rect.rs side_offsets.rs size.rs
```
Note that this list of dependencies will include code pulled in via the include! and include_str! macros as well.
Here's an example of a handwritten Makefile using dependency info. Note that it uses a hard-coded output file name, which works because the crate hash is stable unless the pkgid attribute changes:

```make
RUSTC ?= rustc

all: librust-geom-851fed20-0.1.dylib

librust-geom-851fed20-0.1.dylib: lib.rs
	@$(RUSTC) --dep-info --lib $<

-include lib.d
```
Now make will notice when you change any of the .rs files without you needing to list them explicitly, and the dependency information is updated automatically as your code changes. A little Makefile abstraction on top of this can make it quite nice and portable.

Next Up

In future posts, I'll cover specific ways to integrate Rust with a few different build tools.
The third XMPP/Realtime UK Meetup will be happening on the 2nd of December, 2013.
A summary of the event can be found at iea.sust.se.
Some of our customers asked for NuGet packages. The roadmap of our development is heavily shaped by customer requests, and we are always receptive to your criticism, ideas, and praise.
You can find the MatriX NuGet package here:
In just 34 days the first full test run of ubiquitous security on the XMPP network will be attempted by many service operators.
Like the IPv6 test days, on the 4th of January XMPP server operators are turning on TLS encryption for s2s and c2s connections and testing to see what doesn't work and what needs more work.
The participants in this effort would like you to join others in the XMPP community and help secure users' private communications.
They are inviting you to join other operators and secure XMPP.
Answers to common questions:
Q: How do I test my site's security?
A: Use http://xmpp.net to run a test against your domain. For help enabling full TLS encryption, check out the Securing XMPP wiki page or contact your XMPP server vendor.
Q: But what if things break?
A: This is just a test. The changes will be rolled back on the 5th of January until the next test the following month.
Q: Can’t you test this all before and then switch?
A: In theory everything should work. In reality it's better to test, roll back, fix, and re-test.
Q: I heard that Google doesn't do encrypted connections to non-GTalk servers.
A: True: server-to-server connections on the Google network are insecure. The XMPP Board has reached out to Google at different levels and will continue to work with Google to find a way to keep XMPP interoperability with Google servers.
Q: Where do I discuss this?
A: Join the operators mailing list: http://mail.jabber.org/mailman/listinfo/operators
Q: I heard there is a manifesto?
A: Indeed – if you are a server operator and want to publicly show your support for secure user communications, sign up (with a pull request) at https://github.com/stpeter/manifesto
The operators are all looking forward to the go-live date of May 19, 2014 and are excited about this huge step. Thanks for playing your part.
edit: Changed the tone and voice of the article as it had previously implied that the XSF itself was running this Test Day rather than it being the community generated event that it is.
In order to enable SASL EXTERNAL, add the following line to the init.properties file:

```
c2s/clientCertCA=/path/to/cacert.pem
```
The cacert.pem file contains the Certificate Authority certificate used to sign client certificates.
The client certificate must include the user's Jabber ID as an XmppAddr in subjectAltName:
As specified in RFC 3920 and updated in RFC 6120, during the stream negotiation process an XMPP client can present a certificate (a “client certificate”). If a JabberID is included in a client certificate, it is encapsulated as an id-on-xmppAddr Object Identifier (“xmppAddr”), i.e., a subjectAltName entry of type otherName with an ASN.1 Object Identifier of “id-on-xmppAddr”, as specified in Section 13.7.1.4 of RFC 6120.
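As a rough sketch, a suitable client certificate could be produced with OpenSSL along these lines (the file names and JID are made up; the ASN.1 Object Identifier for id-on-xmppAddr is 1.3.6.1.5.5.7.8.5):

```sh
# Extension section placing the JID into subjectAltName as id-on-xmppAddr.
cat > xmppaddr.cnf <<'EOF'
[ xmpp_client ]
subjectAltName = otherName:1.3.6.1.5.5.7.8.5;UTF8:juliet@example.com
EOF

# Sign the client's CSR with the CA configured in c2s/clientCertCA.
openssl x509 -req -in client.csr -CA cacert.pem -CAkey cakey.pem \
    -CAcreateserial -extfile xmppaddr.cnf -extensions xmpp_client \
    -days 365 -out client.crt

# Check that the xmppAddr entry made it into the certificate.
openssl x509 -in client.crt -noout -text | grep -A1 'Subject Alternative Name'
```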
The first Release Candidate of Tigase XMPP Server 5.2.0 has been published. Binaries are available for download right now in the files section of the project tracking system. Sources are available in our code repository under the tag 5.2.0-rc1 - tags/tigase-server-5.2.0-rc1. Maven artifacts have been deployed to our Maven repository.
At the meetup we will have sensors and actuators connected to Raspberry Pis, and we will create an interoperable interchange of data with the help of XMPP.
If you are interested in how XMPP could be used in the Internet of Things arena, contact the organizers, Joachim Lindborg (XSF) or Peter Waher (XSF).
Simon posted on the jdev mailing list a great reminder about the upcoming test day, quoted below:
“We owe it to our users to provide secure communications” – St Peter.
To achieve ubiquitous encryption between clients and servers on XMPP, we’re doing a phased approach similar to the IPv6 test days.
– January 4, 2014 – first test day of full XMPP encryption. The aim: see what breaks when all XMPP sites enable secure communication.
– February 22, 2014 – second test day
– March 22, 2014 – third test day
– April 19, 2014 – fourth test day
– May 19, 2014 – permanent upgrade
This means that you should:
– Sysadmins: check your server’s security at http://XMPP.net
– XMPP developers: provide a bare minimum configuration to be able to pass the tests at http://wiki.xmpp.org/web/Securing_XMPP
– Everyone: Sign the manifesto (via pull request):
I’m really excited to achieve this. It’s been too long in the making, but we have no better reason than now to get it done.
Please also spread the word on how XMPP takes user privacy seriously.
A couple of days ago, my friend Tom asked me, via GMail's Google Talk widget, why one bash command worked while another didn't. The commands looked the same, but to make sure no UTF-8 silliness was going on, I checked Adium's debug window. There, I noticed both messages contained an XML element I didn't recognize, google-mail-signature:

```xml
<message type='chat' to='firstname.lastname@example.org' from='...@gmail.com/...' id='...' iconset='classic'>
  <body>...</body>
  <google-mail-signature xmlns='google:metadata'>JZRvRiVt_pz4h4l-VIms2ufrvbQ</google-mail-signature>
  <active xmlns='http://jabber.org/protocol/chatstates'/>
  <x value='disabled' xmlns='google:nosave'/>
  <record otr='false' xmlns='http://jabber.org/protocol/archive'/>
</message>
```
This does not appear to belong to any of the officially documented Google XMPP extensions, so I tried to figure out what it is.
Zash pointed out that substituting _ with /, - with +, and adding a = to fix the padding turns the data into valid base64. Florob pointed out that this is base64url encoding, with the padding left out. Adding back one = of padding and base64-decoding it turned it into 20 bytes of random-looking data. What sort of signature could that be?
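That transformation is easy to verify from a shell (assuming a coreutils-style base64 that accepts -d):

```sh
# base64url -> base64: swap the alphabet back, restore the missing
# '=' of padding, decode, and count the resulting bytes.
printf '%s=' 'JZRvRiVt_pz4h4l-VIms2ufrvbQ' | tr '_-' '/+' | base64 -d | wc -c
# prints 20
```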
RSA or DSA?

Cryptographic signatures for email often use RSA or DSA. It is unlikely this is an RSA signature: an RSA signature is always the same length as the key, and 160-bit RSA would be undeniably insecure (although Google has in the past signed their emails with a key not fully up to best practices). DSA signatures must contain a pair of two values, typically using ASN.1 encoding. That means the data would always start with 0x30, which is not the case.
20 bytes is, however, exactly the length of a SHA-1 hash. Could it be the hash of a signature? That would make little sense, as nobody would be able to verify the signature from just the hash. It is likely either a plain SHA-1 hash computed over the message data in some form, or a keyed MAC such as HMAC-SHA-1.
I observed that the signatures are deterministic: sending the same message again produces the same value.
I tried to reproduce the hash by testing 2.4 billion permutations of:

```
SHA1(sender + delimiter + recipient + delimiter + message)
SHA1(sender + delimiter + message + delimiter + recipient)
...
```
for (almost) all 1-4 character delimiters, all possible orderings of sender, recipient and message and a couple of different formats of sender and message I could think of. None of them matched the hash on the message.
Of course, it could be a different delimiter, or the message could be encoded in a different way. But unless someone else manages to find out how these hashes are calculated, it looks like only Google can generate them, using a deterministic procedure.

HMAC-SHA-1
From the fact that the data is called a “signature”, I think it is likely these are HMAC-SHA-1 MACs of the message, generated using a key only Google has. This would make it impossible for anyone to generate a fake signature, unless they brute-forced the key used by Google. The difference between a MAC and an RSA or DSA signature is roughly the difference between symmetric and asymmetric encryption: to verify a MAC, you need the key that was used to generate it. To verify an RSA or DSA signature, you need only the public key; the secret key is needed only to create the signature.
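For illustration only, a deterministic value in the observed format could be produced like this (the key and the exact input are made up; only Google knows the real ones):

```sh
# HMAC-SHA-1 over the message body, keyed with a secret, then
# base64url-encoded with the trailing '=' padding stripped.
printf '%s' 'the message body' \
    | openssl dgst -sha1 -hmac 'key-only-google-has' -binary \
    | base64 | tr '+/' '-_' | tr -d '='
```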
Why is this dangerous?

Cryptographic signing isn’t automatically bad. Many emails are signed using DKIM, and a number of people choose to sign their emails using GPG. But these are emails – most people correctly assume they are stored forever and that the signatures can be verified forever.
IMs are different. Not only does this feature appear to be undocumented, but people generally assume IMs are more ephemeral than emails. Signing them should not happen without people being aware of it.
Even though nobody but Google is able to verify or fake these signatures, there is one signing oracle that can be used: Google. The fact that the signatures are deterministic means they can be later verified by sending the same message again.
If you ever use GTalk on GMail to send a message describing some illegal activities, that message will receive a signature from Google. If the recipient stores that message and signature, they have cryptographically verified blackmail material: they could later turn both message and signature over to law enforcement. Law enforcement could then take control over your account and resend the message to the recipient to verify the signature is correct, or they could try to force Google to verify the signature. If it succeeds, it proves cryptographically that your account sent that message. Of course they can’t prove the order or context of your messages, but one message in itself might be enough to get you into trouble.
I think Google should either document this feature, so that people are aware their messages are being signed, or stop attaching these signatures (or at least make them non-deterministic).
With great pleasure we want to announce a new public service brought to you by yaxim.org - the yax.im (read: Yaks’ IM) public XMPP server.
To register with it, open your XMPP client (e.g. yaxim), choose a JID of your liking that ends with @yax.im, like yourname@yax.im, and activate “Register new account”.

Server Details
The service is running the recent Prosody 0.9 XMPP server written in Lua. Many thanks go to the dedicated Prosody developers.
yax.im is reachable via IPv4 and IPv6 (if your client does not support SRV resolution, use yaxim.boerde.de as the server name, port 5222) and is hosted in Berlin, Germany. It features several extensions important for mobile clients:
Certain JIDs are barred from registration (you need to specify at least two letters, and test and admin, among some others, are disallowed).

Transports and Services
So far, yax.im offers a built-in XEP-0045 (MUC) component at chat.yax.im, with a support chat room for the yax.im service and the yaxim app at firstname.lastname@example.org. Feel free to bother us with any issues you might have with either the app or the server.
The server is not offering any transports. This might change if a transport implementation for XMPP or any proprietary service appears that has a credible security audit.
In the meantime, you are free to use whatever external transport you are used to.
Timothée and Vincent will be at the Cité des sciences in Paris to give a talk about the project. Titled “Movim, réseau social et décentralisation” (“Movim, social networking and decentralization”), it will show the latest version of the software and take a look at the advantages of decentralization.
The podcast will be published as soon as possible.
Joachim is in the US (San Francisco specifically) and would love to get together with anyone from the XMPP and/or IoT community!
His original goal was to attend the Doing IoT with XMPP meetup. Either way you should poke him on the members@ mailing list to at least find a pub and raise a pint or three!
The XSF held its annual meeting on the 29th of October, and we voted for the Council and the Board.
I’m happy for two reasons: first, that my friend and coworker at &yet was elected to the Council, and second, that I was again elected to the Board.
With all of the activity in the web space, and all of the people now realizing that the internet needs security and that XMPP is already a federated, secure suite of protocols, I think it will be a very busy year.
Peter Saint-Andre has created a Manifesto for others to join, debate, and discuss, laying out a plan for upgrading the XMPP network to always-on, mandatory, ubiquitous encryption.
To quote Peter:
In short: we owe it to those who use XMPP technologies to improve the security of the network (and thanks to Thijs Alkemade, we now have better ways to test such security, using the newly-launched “IM Observatory” at xmpp.net). Although we know that channel encryption is not the complete answer, it’s the right thing to do because it will help to protect people’s communications from prying eyes.
Every year the members of the XSF get together to vote on the current quarter's new and renewing members, and also to elect who will become members of the Technical Council and who will serve on the Board of Directors.
This year that meeting was held on the 29th of October, 2013, and Alexander has recorded the details in a post on the XSF site.
The members of the 13th XSF Technical Council for the 2013/2014 term are:
The 2013/2014 Board will consist of:
Please congratulate them if you run across any of them, but also please help us make this another great year for the XSF.
Earlier this year we announced the open source Pushpin project, a server component that makes it easy to scale out realtime HTTP APIs. Just what kind of scale are we talking about though? To demonstrate, we put together some code that pushes a truckload of data through a cluster of Pushpin instances. Here's the output of its dashboard after a successful run:
Before getting into the details of how we did this, let's first establish some goals:
TechNet is an annual event organised by AFCEA Europe (Armed Forces Communications & Electronics Association) to highlight IT developments in the military sector. TechNet 2013 was held on 23/24 October at the Lisbon Congress Centre. This year the focus was on connecting coalition forces and bringing high-end IT capabilities to deployed forces using mobile devices.
Normally we attend these events to listen to the talks and network with other attendees. This year we were invited by our friends at HP Autonomy to share their stand for a joint HP/Isode demonstration showing Autonomy analysis of M-Link chat files.
M-Link can provide chat log archives, which can be configured to audit all messages sent either by user, by MUC room, or both. The archives are stored in an XML-like syntax on disk.
The TechNet demonstration showed how a link between M-Link’s chat log files and Autonomy could enable the analysis of real-time chat data to search for developing patterns, sentiment and intention.
Both Isode and HP Autonomy will be showing this demo at future events.