Usually I go by Floris, but Flub is
also fine, I don't mind.
Just go with Floris, it's okay.
Floris, okay.
I only have the Delta Chat.
Maybe your email was your name, I don't
remember.
No, that's probably Flub.
Okay.
Okay, Floris, do you want to add a
last name or would you rather not?
No, it's fine.
I wouldn't be able to pronounce it anyway.
That's probably true.
I remember seeing your last name on your
presentation at FOSDEM.
Yeah, on the slides.
Yeah.
I was like, that's a long last name.
Welcome to the first episode of SolarCast.
I'm so happy to have you with us
here today.
And this may not be the first episode
that we're or I'm recording, but it is
the first episode to be released to the
public.
And I'm a little bit nervous.
But I'm sure it's going to be fantastic
because today's guests are wonderful.
It may take some figuring out to understand
exactly why, especially for those who are new
to these realms of the peer-to-peer
internets.
But we will get that plentifully explained.
Today's guest is Iroh.
That's spelled I-R-O-H.
To describe a bit about the project, explain
to us how it works.
And before that, we'll be looking a little
bit at the history of the internet, just
to get a bit of a context for
why hole punching has become so essential.
So with no further ado, let's dive into
the history of the internets.
Iroh evolved like in concept what it was
as well.
The very first version of Iroh was aiming
to be the second implementation of IPFS.
But then it kind of evolved into how
do we set up reliable connections.
And that is what Iroh is today.
It's like reliable direct connections.
The reason to do that part is because
we think it's a great enabler for creating
user agency.
That's something that we think is important in
building the internet.
And direct connections are a difficult thing to
do on the internet today.
And therefore, we decided if we can solve
that problem, then we can enable other application
developers to work on top of that and
to leverage that and hopefully build very cool
projects.
As Floris describes, connecting peer-to-peer is
nowadays quite difficult.
Unless you have solutions like Iroh, of course.
But it has not always been that way.
And as a little bit of an intro
and a precursor to the episode today, we're
going to go through a little bit of
the history of how we ended up here
and some of the terminology that will be
reoccurring in this episode.
So to start out, this is the first
in the series on peer-for-peer.
So we'll establish some of the base concepts
already now.
The first concept that might be good to
have a grasp on is peer.
What the hell is a peer?
So a peer would be a device, but
it can also mean a person.
In more like actor network theory, one might
call it an actor.
And in peer-to-peer systems, it would
mean that one device is connecting directly to
another device or one peer is connecting directly
to another peer.
In the origins of the Internet, back in
1969, there were actually four of these devices
or peers or computers connected to one another
like this.
By 1977, there were 23 of these nodes
in the network.
And by 1994, there were close to four
million, a form of exponential growth.
So what happened?
The whole issue that occurred was that we
ran out of IP addresses.
And the way this was solved back in
the 90s was to make changeable addresses.
Every computer had their own address, their own
IP.
And post-90s, there were these central middle
servers called domain name servers.
And these domain name servers would then hold
an address book of all the IPs to
the different computers in their regional networks.
But these IPs were no longer static or
fixed.
These IPs were instead variable and would change
every now and then.
This means that there was suddenly no direct
connection possible device-to-device unless you already
knew the IP address and the IP address
was not going to change.
Known static IP addresses right now are what's
often referred to as the dark web.
We won't get into that very much at
this moment.
But there's nothing inherently dark about them.
They're simply just not listed.
That said, the large issue that we're now
to this day trying to solve is connecting
directly peer-to-peer.
And connecting directly peer-to-peer becomes important
when you don't necessarily know that you will
forever and ever be able to trust whoever
is in the middle.
So the issue we're having in a lot
of different nations and countries is where government
controls the domain name servers and thereby also
controls who can communicate with who, what people
can see, what people can do online.
So essentially this system of dynamic IPs and
network address translation is enforcing a hierarchy on
the internet, making sure that there are servers
and clients rather than peers connecting to peers.
So we're diving into one of the core
aspects of peer-for-peer networks, which is
offline-first, local-first, and peer-to-peer
communication networks.
And this episode will entirely focus on connecting
peer-to-peer.
So something that Floris brings up almost immediately
off the bat is IPFS.
And IPFS is one of these types of
peer-for-peer networks, trying to establish means
of communication that can be long-distance, can
route via alternative paths than just what we
know as the internet right now, but maybe
also Bluetooth and radio.
It stands for InterPlanetary File System and
is probably the most well-known out of
all the peer-for-peer protocols until this
point, at least in time.
A new approach that is emerging is something
that Iroh is also spearheading, which is the
ecosystem approach of networks, where different modules come
together.
And this focus will be on Iroh.
Throughout the episode, whenever I hear terminology that
might be new, that might be confusing, I'll
do as best as I can, both for
myself and for the listeners, to give context
while also trying to give some form of
technical depth.
So bear with me if I don't go
deep enough.
Bear with me if I don't explain enough.
It's difficult to strike that gentle balance.
So no more from me and let's dive
in until next time something needs to be
explained.
Thank you.
But I guess Iroh sort of refined, as
I said, it was an IPFS thing, but
it refined to basically the core thing that
we call Iroh now is just a peer-to-peer
connection between two endpoints.
It is not even trying to make anything
bigger than that, just concentrating on two endpoints
and you want a direct connection between them.
And you want that direct connection to be
100% reliable.
You want it to be able to send
bytes on it from the very start without
having to wait.
And the trade-offs it makes are derived
from there.
And then it's up to the user to
build more complex things on top of that.
Like we do have some, I mean, basically
what Iroh has been calling custom protocols.
And we do provide some of them because
partially because we already had them and partially
because they're useful.
The main ones, the two main ones, I
think are Blobs and Gossip, which is like
Blobs is like a verified streaming of any
kind of amount of data, I guess.
And then Gossip is like a broadcast network
to a larger number of nodes in your
network.
And then whatever else people want to build
on top.
But yeah, the core is really just two
endpoints, direct connectivity between that.
And that's already plenty of things.
And yeah.
Just so I can orient it among terminology
that I know, are you aiming for NAT
punching then?
Is that what you're solving?
Yeah, totally.
So direct connections is like hole punching, NAT
hole punching, whichever terminology. NAT traversal, I
think, is the more IETF term, I guess.
It's a huge thing you're solving.
Like it's been a big issue in the
peer-to-peer community for quite a while.
So we decided to just make that the
main focus.
And so that it also doesn't compete with
the higher stacks that you can build on
top, kind of.
Yeah.
You mentioned also here that users would build
more complex things on top.
So I think in a lot of contexts,
when you say users, people are thinking of
the end users, right?
Yeah, end users.
Yeah, nice.
Could you talk more into who your users
are?
Yeah, so our users are basically people who
want to build peer-to-peer applications themselves,
not end users.
We make a base, yeah, a base technology,
a base protocol, but we need other people
to build on top of it because we
have a few demos that are sort of
a little bit end-user-y facing.
They're not our focus, and you can see
that in how much more you could do
with them and how much more user-friendly
you could make them, et cetera.
Actually, I think this weekend, someone turned one
of our demos into something way more polished
with the UI and everything.
So that's quite cool to see.
That's really cool.
It's always exciting when people pick up your
projects and then make cool stuff with it,
right?
Yeah.
So what you're focusing on polishing is not
necessarily that whatever interface people will be interacting
with.
It's the piece of the puzzle that other
peer-to-peer developers can utilize.
Yeah.
So establishing the direct connections or doing the
hole punching and giving you a reliable connection,
that is what we...
There's plenty of work to improve that, even
though it already works quite well, what we
have.
How did you end up there?
How did we end up there?
With that as a focus.
Because it's pretty rare to have a group
have the focus of end users being other
developers.
Unfortunately.
I don't know.
I guess it depends.
I think in the world I have been,
I don't know.
In what I'm used to, there are a
lot of places like that, I guess, where
all the developers are the focus.
But then I come from a background in
tools, et cetera.
So I guess maybe I'm very biased in
that respect.
But how did we end up there?
I mean, Iroh started as a...
Initially, it was going to be a reimplementation
of IPFS.
Why?
Why is mostly social reasons, I would say.
Because the people who created Iroh, the two
founders, came from that kind of background.
And IPFS was, at the time, probably still,
I don't know, looking for alternative implementations.
They wanted to be more of a spec
instead of a single implementation.
And at the same time, the reference implementation
was not performing enough for some use cases.
Or so I was told.
I don't claim to have any knowledge over
IPFS.
Or its history, really.
But this is sort of what I learned,
I guess, from that.
But the performance part is important because that
is basically why Iroh eventually, after a while,
decided not to chase becoming another IPFS implementation.
And then Iroh spent a while trying to
figure out what Iroh actually is.
I think that journey started as something like,
how do we do content-addressed transfers that
are fast and better?
And that's kind of, I say better because
it's very vague.
Because that's where the content-addressed kind of
data thing was one of the main things
from IPFS, I guess, that people want from
IPFS.
So that was the first thing you sort
of started randomly experimenting what could that look
like in a world where you didn't have
to carry a lot of data around.
And that's where the BLAKE3 streaming kind
of came from.
Just implement one hash function and it gives
us a very much simpler world for transferring
content-addressed data between two peers.
And what was this called?
The hash function is like BLAKE3.
BLAKE3.
Yeah.
Have you heard of MD5 or SHA-1 or
that kind of hash functions?
Honestly, no.
All right.
Don't worry.
Where are they used?
Do you know hash functions at all?
Yeah.
You take whatever kind of file and you
can reduce it to, I don't know, in
BLAKE3, I think it's 32 bytes.
How many bytes?
Anyway, some sort of short string or whatever.
And whenever you change any single thing in
the original data, the short string is supposed
to change, right?
Yeah.
And there's this cryptography, so there are a
whole bunch of algorithms for this.
BLAKE3 is a more modern version.
MD5 is probably the oldest, very well-known
version, but by now also insecure because they
know how to make collisions.
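To make the idea concrete, here is a minimal sketch in Python. BLAKE3 is not part of Python's standard library, so this uses hashlib.blake2s, an older relative that also produces a 32-byte digest, as a stand-in. The helper name and the sample inputs are made up for illustration.

```python
import hashlib


def digest(data: bytes) -> bytes:
    """Reduce data of any length to a fixed 32-byte digest.

    Uses BLAKE2s from the standard library as a stand-in for BLAKE3.
    """
    return hashlib.blake2s(data).digest()


original = b"hello peer-to-peer world"
changed = b"hello peer-to-peer world!"  # one extra byte at the end

# The digest has the same length no matter how long the input is.
print(len(digest(original)), len(digest(b"x" * 1_000_000)))

# Changing any single thing in the input changes the short string.
print(digest(original).hex())
print(digest(changed).hex())
```

The same input always gives the same digest, which is what lets a digest serve as a name for the data it was computed from.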
So a quick note here.
Hash functions are used for storing and specifically
retrieving data.
And a hash collision is when a certain
set of data has the same hash as
another certain set of data.
And that causes a collision because if you
then fetch for this hash 02, you would
potentially then receive two different data sets.
And note over.
BLAKE3 is a modern version and it
has some very nice properties that allow us
to basically, as we receive a file, we
can continuously know that the peer is actually
still sending the right data, which is sometimes
a problem in a naive implementation from like
a decade ago, say, where you could be
transferring a gigabyte and then at the end
discover actually no, they sent the wrong thing.
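The difference between checking at the end and checking as you go can be sketched roughly like this. This is a simplified illustration, not how Iroh or BLAKE3 actually do it: BLAKE3 is built around a hash tree, while this sketch uses a flat list of per-chunk hashes and hashlib.blake2s as a stand-in. The chunk size and all function names are assumptions made for the example.

```python
import hashlib
from typing import Iterable, Iterator, List

CHUNK_SIZE = 1024  # hypothetical chunk size for this sketch


def h(data: bytes) -> bytes:
    """32-byte digest (BLAKE2s stand-in for BLAKE3)."""
    return hashlib.blake2s(data).digest()


def split(data: bytes) -> List[bytes]:
    """Cut the data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def make_manifest(data: bytes) -> List[bytes]:
    """Sender side: one hash per chunk."""
    return [h(chunk) for chunk in split(data)]


def root_hash(manifest: List[bytes]) -> bytes:
    """The single short hash the receiver asks for."""
    return h(b"".join(manifest))


def receive(root: bytes, manifest: List[bytes],
            chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield chunks as they arrive, failing at the first wrong one."""
    if root_hash(manifest) != root:
        raise ValueError("manifest does not match the requested hash")
    received = 0
    for chunk in chunks:
        if received >= len(manifest) or h(chunk) != manifest[received]:
            raise ValueError(f"chunk {received} is wrong, stopping early")
        received += 1
        yield chunk
    if received != len(manifest):
        raise ValueError("transfer ended before all chunks arrived")


data = bytes(range(256)) * 20          # 5120 bytes, five chunks
manifest = make_manifest(data)
root = root_hash(manifest)
good = b"".join(receive(root, manifest, split(data)))
print(good == data)
```

With a scheme like this, a peer sending the wrong thing is caught at the first bad chunk rather than after the whole gigabyte has been transferred.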
So that was the first kind of experiment
of Iroh, I guess, of like, where does
it go if we abandon IPFS?
And then the second part was like, how
do we do hole punching if we don't
have to do any backwards compatibility?
And there we ended up basically going like,
well, let's converge towards, let's build everything on
top of QUIC because QUIC is sort of
the transport protocol that's been built for
HTTP/3.
And it has some very nice properties for
hole punching, very studied, very standardized protocol.
I mean, it's quite young as well, I
guess, but it's widely, it's getting widely adopted
and has some nice properties.
A lot of the, like, Google, Cloudflare, Meta,
et cetera, are starting to use this and
at least in the, like, the connections to
end user devices, not so much in, like,
connections inside data centers, et cetera, it's not
that interesting.
But for us, it is also very good
because it's built on top of UDP and
that is a lot nicer to hole punch
than TCP.
It makes lives a bit simpler.
Those were the two main things that we
started with.
And for a while, we had, like, Iroh
was like this combination of the hole punching
stuff and the blob transfer and then I
think some other, probably gossip, et cetera, was
added at some point.
I don't know exactly the ordering anymore.
But that was also confusing to people.
People didn't know what Iroh wanted to be,
what, whether they would, they could rely on
just the connectivity part or whether we would
just take that away again because we had
our own things higher up the stack competing
with that.
And that's sort of where eventually we ended
up on, well, maybe we should just make
Iroh just the base, the connectivity, because there
are so many great P2P projects already
that can be adapted to work with good
connectivity and, like, solving the connectivity is a
huge thing on its own.
And we didn't, like, yeah, this way we
can be a base for everyone who wants
to build, I don't know, different gossips, different
synchronization, different anything.
And yeah, that's sort of where we are.
Okay, now I'm mostly asking out of curiosity
because I've been having this conversation with, like,
among other people like Alyosha.
How did you arrive at that conclusion?
To just call Iroh the base connectivity?
Yeah.
From speaking to people, yeah, mostly speaking
to other people who are, like, trying to
use Iroh.
We saw partially the adoption that we saw
was like a lot of it was interested
in parts and almost always the hole punching
part was part of that, but not necessarily
all the other parts.
And then also we talked to people that
didn't use us, that we thought, you know,
could be kind of interesting users.
And the main feedback there was that they
were just confused and worried about what Iroh
wanted to be.
And the worry basically being that, well, can
we actually rely on this thing staying at
the base connectivity there?
So it sounds like you've been in pretty
close contact and communication with other end users.
I'll stick to the term users now, even
though in my mind, I just get confused
with end users.
Developers if you want.
What term do you like?
Developers?
No, developers is great because then I won't
mix it up.
I talk to a lot of UX testers
and stuff as well.
Which is completely different.
Exactly.
But OK, developers.
So you were talking a lot to the
developers, other developers who are using Iroh.
And you realized this was the most important.
And I was going somewhere with this.
So moving forward, like number one, how are
you organized?
Because like that's always a question when it
comes to peer-to-peer projects.
Like in Scuttlebutt, we were not organized, actively
not organized in a structured way.
So it was very much like social networks,
navigating, rotating roles, dissolving our structures, stuff like
that.
But then other people work with companies, other
people have non-profit foundation, open collectives.
How do you do it?
Iroh is a project run by a company
called Number 0.
And it was set up essentially
by two people that came from the IPFS
world.
There was funding from IPFS at some point
or still.
I'm not entirely sure.
Don't ask me for details on that.
But the founders basically founded Number 0.
And so it is a company and it
is almost entirely like developed inside the company.
But Iroh and all the protocols are fully
open source.
So it's a very central structure, I guess.
But it's a small company.
There's nine people, I think.
Yeah, there's nine of us.
Two questions.
What's the license and what's the revenue model?
The license is, I believe, MIT or Apache
2.
And that's the license we use on everything.
The revenue model is something we're still looking
for.
So far, we have done essentially custom consulting
for people that want advice on how do
I make connections on the Internet and how
these things work.
But we're also trying to figure out how
to sort of make that less consultancy.
So we have more time for actually working
on Iroh.
And as part of that, we're basically looking
towards a trend we have noticed that is
a lot of our developers want to end
up running.
You can say users.
I get it.
I also get that it's confusing for you
because then developers become your own team, right?
Yeah.
But it's fine.
A lot of your developers or users?
Yeah, a lot of our users, I guess,
end up running.
OK, so this runs into how Iroh technically
works partially as well.
So we want to OK, I'll go a
slight sidetrack on that.
Technically, like I said, like we want 100%
reliable connections, hole-punched connections.
And that's not possible, right?
Anyone in peer-to-peer knows that.
So one of the key things Iroh did
was like kind of be pragmatic about this
and to establish direct connections, you already need
some information from outside, usually provided by servers.
This is traditionally like STUN services.
So the approach Iroh takes is basically you
have relay servers, as we call them.
They do slightly more.
So basically the idea is when you are
waiting, when you're ready and waiting to accept
connections, you always register with a relay server
and you somehow advertise this as your home
relay server.
So you can have many relay servers that
don't know about each other and it all
still works together.
So you just advertise like I'm on this
home relay.
If you want to reach me, you can
find me via this home relay server.
As soon as you start a connection to
this, we will initially send the traffic via
the relay server, which is obviously not ideal.
At the same time, the relay server will
do all the work required to help you
hole punch and will also coordinate the hole
punching.
And then as soon as you manage to
successfully hole punch, the traffic is moved to
the direct connection and the relay server does
nothing.
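The flow just described, relay first and direct once hole punching succeeds, can be modelled with a toy sketch like the one below. It does no real networking and is not Iroh's API: the class, method names, and the relay hostname are all hypothetical, chosen only to illustrate the order of events.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Path(Enum):
    RELAY = "relay"
    DIRECT = "direct"


@dataclass
class Connection:
    """Toy model of one connection between two endpoints."""
    home_relay: str                      # the peer's advertised home relay
    path: Path = Path.RELAY              # traffic starts on the relay
    log: List[str] = field(default_factory=list)

    def send(self, payload: bytes) -> None:
        # Bytes can flow from the very start, whichever path is active.
        via = self.home_relay if self.path is Path.RELAY else "direct"
        self.log.append(f"{len(payload)} bytes via {via}")

    def hole_punch_result(self, succeeded: bool) -> None:
        # On success traffic moves to the direct path;
        # on failure it simply stays on the relay.
        if succeeded:
            self.path = Path.DIRECT


conn = Connection(home_relay="relay.example")
conn.send(b"hello")            # sent before hole punching has finished
conn.hole_punch_result(True)   # the relay coordinated a successful hole punch
conn.send(b"world")            # now goes over the direct connection
print(conn.log)
```

The point of the design is that the application never waits: the relay path carries data immediately, and the switch to a direct path is an optimisation that happens underneath.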
So that's sort of required to go back
to how we hope to make money.
Like a lot of our users are deploying
their own relay servers.
Currently we basically work on offering that as
a service, essentially.
So people can have their own private relay
servers, but we will operate them and hopefully
that helps everyone a little bit.
Cool.
So you'll do, so it's a multi-pronged
like self-sustaining economic model.
Both like supporting people with hosting the relay
servers and supporting people with like through consultancy.
But as you were talking, I realized that
you were really practically starting to answer the
what question from the podcast and it was
so good.
So I was wondering if we can continue
a little bit on that note.
And I'll do a follow-up question in
relation to when I worked on Scuttlebutt,
we did a project called Rooms 2, which
was an extension and that was also a
hole-punching project, which of course, as you
said, was not possible, so then we had
to do these relay servers, which we called
Rooms.
And one of the things that we were
very careful with while doing this was that
the relay server wouldn't be able to store
any of the data, so it doesn't become
like an accidental middleman.
How do you solve that?
Yeah.
Very good question.
So this goes, I guess, into well, yeah.
So the first part towards that is that
every endpoint, and an endpoint for us
is a node, one that
either accepts or creates a connection.
Every endpoint has a node ID, as
we call it, which is essentially a public-private
key pair.
And the public part is the node ID.
In order to