Daniel: 00:00:00 Hello, welcome to the REPL, a podcast diving into Clojure programs and libraries. This week, I’m talking about Clojure with Zach Tellman, the creator of Manifold and Elements of Clojure, a recent book about Clojure. Welcome to the show, Zach.
Zach: 00:00:13 Thanks for having me.
Daniel: 00:00:14 Yeah, it’s great to have you on. So, I think that in preparation for this interview, I was thinking a little bit about people in the Clojure community and your impact on Clojure. And I would say that you’d probably be in the top 10 at least, if not the top five, of people who’ve had an impact on Clojure programmers. And I guess most Clojure programs running today would have some of your code somewhere in them. Do you think that would be a fair assessment?
Zach: 00:00:42 I think so. I’ve never been sure if people use my libraries because they’re the right libraries to use, or because they think they’re kind of neat, or because the name is just more memorable than the other library. Yeah, I think that they’ve propagated pretty far into the community at this point.
Daniel: 00:01:00 Yeah. And so, you’ve been working with Clojure for 10 years, maybe more by now?
Zach: 00:01:06 Yeah, that sounds about right. I started in … It would’ve been late 2008, early 2009 I think.
Daniel: 00:01:14 Right, cool. And so, you’ve kind of covered off your intro to Clojure pretty well in talks and other interviews, so I don’t want to kind of rehash your whole Clojure origin story. But I guess maybe you would be able to just sort of give us a brief overview of why Clojure, how you got here, and then maybe start with some of the original libraries that you worked on?
Zach: 00:01:36 I started using Clojure during my first job, where I was working with C# to write desktop software for Windows. I realized a few years in that this was not something I wanted to be doing for the rest of my career. And so, I started looking at alternative languages. I was looking at Ruby because I was in SoMa in San Francisco. GitHub had just been founded like down the street, it was very much in the air. I looked at OCaml, I looked at Erlang. I looked at Clojure, and Clojure was because I was working at the time with Tom Faulhaber, who wrote clojure.pprint among other things. And he was a fan of Lisp from back in the day with Common Lisp. And he really thought Clojure was something worth looking at.
Zach: 00:02:24 And the absolutely absurd sort of test I did for each of these languages was, I tried to write something with OpenGL. Because in school I had focused on graphics and computational geometry, and I felt like I kind of missed that and wanted to get back into that. And so, I played around with Ruby. I played around with OCaml, both of those bindings weren’t very good. Erlang didn’t even have them, so, that was a non-starter. But, Clojure, when I started playing around with it, I was able to just take LWJGL which is the Lightweight Java Game Library, which provides extremely literal bindings over the OpenGL spec. Literally it just has a bunch of static classes which correspond to the different OpenGL versions. So, there’s GL-01, GL-11, GL-12 and you have to import the static methods from the correct one. It’s actually really tedious.
Zach: 00:03:21 But as I was learning Clojure and trying to go and learn how this library worked at the same time, I found that there is this really interesting semantic compression I was getting. I could go and I could say, “Actually, I don’t care what class this is in. I’ll just use macro-time reflection to figure out which of the classes this should be. Because I know there’s something named this somewhere, so just find it.” And also OpenGL has a lot of scoped operators where you have to explicitly enter and exit some sort of scope. And of course with macros it’s very easy to go and just say, “Enter this at the top, exit this at the bottom within some sort of try-finally.” And so, it was actually weirdly a very good way to get familiar with the benefits of using Clojure. At least as a way of interfacing with the Java library ecosystem.
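The two tricks described here, finding a function by name across several version classes and wrapping a scoped operation in a try-finally, can be sketched roughly as follows. This is a Python illustration rather than Clojure, so a context manager stands in for the macro and the name lookup happens at run time instead of macro-expansion time. `GLv1`, `GLv2`, `resolve`, and `drawing` are hypothetical stand-ins, not LWJGL or Penumbra names, and the "graphics calls" only append to a list.

```python
from contextlib import contextmanager
from types import SimpleNamespace

calls = []  # records every stand-in graphics call, in order

# Hypothetical stand-ins for per-version classes of static methods.
GLv1 = SimpleNamespace(
    begin=lambda mode: calls.append(("begin", mode)),
    end=lambda: calls.append(("end",)),
)
GLv2 = SimpleNamespace(
    vertex=lambda x, y: calls.append(("vertex", x, y)),
)


def resolve(name, namespaces=(GLv1, GLv2)):
    """Return the first function called `name` found in any namespace."""
    for ns in namespaces:
        if hasattr(ns, name):
            return getattr(ns, name)
    raise LookupError(f"no graphics function named {name!r}")


@contextmanager
def drawing(mode):
    """Enter a drawing scope and always exit it, even if the body raises."""
    resolve("begin")(mode)
    try:
        yield
    finally:
        resolve("end")()


# The caller never says which namespace a function lives in,
# and never has to remember to close the scope.
with drawing("triangles"):
    for x, y in [(0, 0), (1, 0), (0, 1)]:
        resolve("vertex")(x, y)
```

The caller's code shrinks to the part that matters (the vertices), which is the "semantic compression" being described.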
Zach: 00:04:11 And so, based on that I created my very first open source library in any language, which was called Penumbra, which was a wrapper for OpenGL. And it was actually a wrapper for an older version of OpenGL which is called Immediate Mode, where you go and you ... for each frame, you make a call for each vertex that you want to draw. And this is very inefficient and not used by any serious game engine anywhere. But it is a very easy, fun way to go and experiment with it. That’s what I did. I just played around with it and came up with little graphical demos and figured out how that should work with Clojure.
Zach: 00:04:48 And towards the end of it I was actually trying to create something that transpiled Clojure into GLSL, the GL Shading Language, which is the code that executes on the GPU. To do this you had to do type inference and stuff because GLSL is effectively like C99 with like a few extra operators. It worked-ish, but basically I was the only one who understood what sorts of programs would properly transpile and which ones wouldn’t. And so, that also became my first introduction to the fact that, if you write enough Clojure and you build enough macros and enough compile-time logic, it becomes an opaque tool for anyone but yourself. So, that was a fairly fully featured introduction to the good and bad parts of Lisps and Clojure more specifically.
Daniel: 00:05:41 Nice, and I think probably a feature of your work would be your macros. You’ve written a lot of macros, your code is macro-heavy, which sounds like it’s a negative thing, which I’m not saying it is, but-
Zach: 00:05:52 I mean it might be. I’m willing to accept that if that’s how you’re going to put it, so.
Daniel: 00:05:59 No, I just think it’s definitely a feature of your work and you’ve probably written a lot more open source macros than many people have, I would guess. And probably have a mature take on macros by now, I would imagine.
Zach: 00:06:11 I don’t know. I think that the way that I try to approach things, the way I try to approach learning things specifically, is to try to figure out where things break down. What’s the boundary of this thing? Where does it become this absurd thing as opposed to a useful application of some concept? A lot of my open source libraries like ones that I’ve actually released and ones that just never really quite made it off the ground, were me trying to understand like, “Where is this sensible and where is this me doing this for the sake of doing it?”
Zach: 00:06:48 And a great example of that is my catch-all utility library called Potemkin, which is actually I think the second library I built or rather released. Because I had this idea for how namespaces should work in Clojure. Because, again, OpenGL has this huge surface area to cover and so I wanted to be able to have a lot of these operators exposed in some places for my own use. And then I wanted to take a subset of those and lift them up into a different namespace for public consumption. And so, I created this macro called import-vars, which I thought at the time was just an insanely clever idea. I was very proud of it.
Zach: 00:07:30 But I think that it also spoke to a problem I was seeing which was that, Clojure doesn’t really have an out-of-the-box opinion as to how you should structure namespaces. The only limitation of what goes in the namespace is, you can’t have two vars that have the same name. And if you take that to its logical conclusion, you basically get clojure.core, which is thousands of vars, none of which collide with each other. But there’s no relationship between them other than the fact that they are just built-ins to the language. And if it were something that were actively developed, I would think that most developers, maybe not Rich, but most developers would find that very ungainly and very hard to navigate. And then I’d want to go and put, like, seq-related functions in their own namespace and special-form-related stuff into its own namespace and everything.
Zach: 00:08:17 And then just be able to say, “Actually, all of these should get imported and surfaced into this clojure.core thing.” And so, it’s sort of decoupling how your code is organized for your purposes, and how your code is exposed to the consumers of your code. Having those be separate seemed good to me at the moment, and I still think it actually seems pretty valid as I explain it right now. But the rest of the community did not agree. In fact, I think that this was like the first time that someone just like expressed a general, “Ew,” sort of like, “That’s a gross thing you just made there,” sort of reaction, which is not the last time that that’s happened certainly. But, it was the first time that someone just had a very strong negative aesthetic reaction to this idea that I had.
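The decoupling described here, code organized one way internally and surfaced another way publicly, can be sketched in a few lines. This is a Python analogy, not Potemkin's implementation: the `import_vars` function below and the `mylib.*` module names are hypothetical, and it only copies attributes between module objects.

```python
import types


def import_vars(target, source, names):
    """Re-export the named attributes of `source` from `target`."""
    for name in names:
        setattr(target, name, getattr(source, name))


# Hypothetical internal module: organized for the author's convenience.
internal = types.ModuleType("mylib.internal.seqs")
internal.first = lambda xs: xs[0]
internal.rest = lambda xs: xs[1:]
internal.scratch_helper = lambda xs: xs  # deliberately not re-exported

# Hypothetical public module: the surface that consumers see.
public = types.ModuleType("mylib.core")
import_vars(public, internal, ["first", "rest"])
```

Consumers call `public.first(...)` without knowing where it is defined, while the author is free to reorganize the internal modules.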
Zach: 00:09:01 In honesty that was actually really interesting to me and motivated me to do this further because I had this question in my head of like, what is good design? Software design has always been something that has really interested me, because it seems like there is a difference between something that is good and something that is bad. Certainly in day to day conversation when we’re collaborating on code, people have these aesthetic reactions. But to really understand, to predict how people will respond to this thing is hard. And so, being able to have this test bed, which was the Clojure community and be able to put something out there and say, “What do you think? Do you hate it? Do you love it?” And just see how people respond to it, what parts people sort of take and run with, what parts they’re just confused by, was actually genuinely exciting to me. I felt this was a way to answer these questions much more directly than just writing some code at work.
Zach: 00:09:54 That was really what drove me to go and build more open source libraries, was the fact that I could get feedback, sometimes explicit and sometimes implicit through just people choosing to use or not use the thing that I have built. Of course, it’s not objective, because once you become established in the community, people use it not because they’ve carefully considered all the alternatives or something like that. They use it because there’s a brand associated with that or whatever. It’s more complex than I think I’m making it out to be, but still I think that it was an opportunity for me to learn about software design much more quickly than I would just if I were heads down coding through the work day and letting it go at the end of the day.
Daniel: 00:10:34 Nice. I’ve heard lots of people talk about why they contribute to open source, and why they create open source libraries. But I’ve never heard anyone talk about that aspect of understanding good design, at least not as clearly as you have.
Zach: 00:10:46 Well, I think that people are motivated to do open source for a wide variety of reasons. I mean this will, I assume come up later in the conversation, but this is something that I don’t think that if you would have asked me when I started doing this like, “Why are you doing this?” I would have had as articulate an answer to that question. I think that at the time it was just weirdly compelling for reasons I couldn’t quite say. In the same way that Clojure as a language is weirdly compelling to me for reasons I couldn’t quite say. The answer I would give when I was just starting out and I was telling people about this cool, new language that I was using, they would say, “Well, cool, pitch me on it.” The best I could come up with was just, “It fits my brain, it fits the way that I think and maybe it will fit yours too if you check it out.” It’s far from the most winning elevator pitch I think, but it’s hard. It’s hard to, I think, be really clear about, “Why am I having an aesthetic reaction to this thing?”
Zach: 00:11:42 It’s undeniable that I was and that other people have had this reaction to Clojure. To really break it down I think is a much more complex process, I’m not even quite sure that I’ve fully done it at this point.
Daniel: 00:11:55 Yeah, maybe … diving back into your timeline there, that after Potemkin the other long running Clojure library, that I think many people will be familiar with is Aleph which is a … Would you still call it a Netty wrapper? Wrapper sounds quite diminutive.
Zach: 00:12:12 It’s interesting, so, to start from the beginning, Aleph started in … I want to say 2010.
Daniel: 00:12:19 Yeah.
Zach: 00:12:20 I believe around July. I remember because I wrote it over a long like 4th of July weekend, that was when that happened. The impetus for that was that, I had gone to a Clojure meetup and people were talking idly about what would an async Ring look like? And it’s important to remember that in 2010, the new hot news was Node.js. This had just come out I think less than a year prior. It was taking the world by storm, everyone was really excited to async all the things.
Zach: 00:12:54 And I think that there was a sense that Clojure as another newcomer on the stage needed to have an answer to Node.js. What was our community’s thing that was going to be able to tap into the same excitement and be able to use it to grow our community as well? I didn’t have a very good answer to it, and it’s worth remembering at this point, I was doing front end … or really I guess desktop development, and my background was in graphics. I hadn’t done systems development of any real sort before. But it seemed like an interesting problem, and other people that I was talking to there who were more experienced with this problem than I was, felt it was difficult and hard to navigate. And so, I thought I’d just play around with it a little bit.
Zach: 00:13:41 And so, I found Netty, which was like the Java async option. And I wrote just enough code in Clojure to expose enough Netty that you could stand up an HTTP server. That was over the course of a couple of days, and I tested. I curled it once to make sure that it would return “Hello World!”, that was the extent of my testing. And then I just was like, “Hey, here’s the thing.” I think I posted it on the Clojure mailing list. I haven’t looked at this announcement for a while, but I think I was pretty clear. This is just like me playing around with what does async Clojure look like?
Zach: 00:14:16 And someone posted on Hacker News that David Nolen ran a benchmark, which I had not bothered to do up until that point, and said, “It’s faster than Node.js,” which is totally not an apples-to-apples comparison for tons of reasons. For instance Node.js is single threaded, and he was on an eight core machine or something like that. It was an absurd comparison. But both the announcement and the benchmark made it to the top of Hacker News for a day. I had my little moment in the sun, and it was absurd on some level because literally I had just written a “Hello, World!” demo of how one could interact with Netty and Clojure.
Zach: 00:14:57 But what that did prove to me is that there was an interest in that space, and a much more avid interest than there was in OpenGL, which is where I’d been putting all of my time and effort up until that point. And so, I thought, “Well, if people like this and people are interested in this, maybe I ought to think about this some more.” And so, I started to tinker with it and think about, “Okay, what are the right ways to deal with asynchrony?” and other sorts of things like that. And at some point all of those questions, which were largely orthogonal to Netty specifically, got pulled out into a library called Lamina, which was dealing with kind of data flow streams. Now, it wasn’t a queuing library because none of the things there had back pressure or like any of the things you’d assume that a queuing library ought to have. It left that as a, “At your end you should be paying attention to when things come out the other end.” And, “If there’s too much then stop sending stuff in.”
Zach: 00:15:58 And that just was reflective of, again, my lack of experience there. These are not things that I realized were important to have. On the basis of that, on the basis of me just really brazenly trying to solve problems I had no business or experience trying to solve, people gave me a lot of attention. And I got a job offer out of that to go work on Clojure full time. And so, that was very beneficial to me and I think that like also that is very reflective of how I’ve treated a lot of open source libraries, which is a chance for me to go and learn about something I didn’t know very much about before with the idea that if I’m doing it in public, it’s going to be extremely embarrassing if I get it wrong, so, I better not get it wrong. So, I better think about it pretty hard and put the time in to make sure that it’s not at least embarrassingly wrong, which again sometimes it is. That’s a lot of the motivation for me, is that I feel like I learn better in public, I guess, or learn more quickly at least.
Daniel: 00:16:59 You’re working in public, but also what are your thoughts, how do you feel about working with other people? Not just showing your work, but also contributing with others or having others contribute?
Zach: 00:17:10 I mean I’ve done a little bit of it, but I have to confess that a lot of what I’ve worked on … I mean, certainly Aleph by now is a collaborative project. At this point I’m getting a lot of contributions from Alexei [inaudible 00:17:25] and he is, at this point, basically maintainer in all but name of that library. And I’ve been talking to him a little bit about whether or not he would like to make that a little bit more formal.
Zach: 00:17:34 For some of the other ones that I’ve worked on, I think that occasionally someone will just come in with a PR where it’s clear that they just must have spent days digging into the innards of something and come up with the exact two line change that needs to go and fix the problem. I’m always incredibly surprised and impressed when somebody does that. But whether it’s just kind of like the code seems a little bit weirder than what people are used to or they just don’t feel like they’re up to the challenge of understanding it … I don’t know what. I have not successfully created many projects that people are comfortable going in and contributing to. I think that Aleph is the one that is the exception to that rule, basically.
Daniel: 00:18:16 Right, I guess maybe following on from Aleph and Lamina, another asynchronous streaming library would be Manifold, which looks like some of the things you learned from Lamina.
Zach: 00:18:28 Yeah, basically. It was a chance to do a clean sweep. So, the exact order of operations here is I wrote Lamina, Lamina was a kitchen sink for all the ideas I had about asynchronous everything. Had a ton of Macros. Had a ton of really complex stuff in there. Then core.async came out, and core.async was a different overall approach. But the thing that it really had over Lamina is that it was incredibly simple, which is not to say the implementation was simple, but the API that it had come up with was very direct. It had a handful of operators you had to learn. It had a couple of very big caveats in terms of the way it did [inaudible 00:19:06] writing in terms like not being able to enter into functions inside of a go routine which I think is still the case. But other than that I think it just was a smaller conceptual surface area for someone to have to learn.
Zach: 00:19:21 And I was impressed by that and I certainly wasn’t upset that someone had not taken Lamina and just been like, “This is clearly the way to go.” Because it’s just a ridiculously big sprawling mess. But I had concerns when I looked at it that they were thinking like, “Oh, well, this is just how Clojure’s going to do asynchronous stuff from now on.” Because it had a very tight coupling between the way that it dealt with an event that hadn’t occurred yet or data that we haven’t received over a channel yet and the execution model. Like when does the code that consumes those things run? Notably it had a fixed size thread pool that all that stuff had to run on. And that seemed like a reasonable decision you could make if you were writing an application, but I think a very limiting choice to make if you’re writing a library. Because a library doesn’t get to dictate what execution model the code that is consuming that library ought to run on. I think that that’s not the right sort of separation of concerns there.
Zach: 00:20:21 And beyond that, I think that core.async is an entirely separate way of thinking about lazy or eventual consumption of data which doesn’t play nicely necessarily with, for instance, [inaudible 00:20:32] or with Java Queues or with a bunch of other sorts of things that are all playing in the same space, all are mutually incompatible with each other. And so, my thought was, “Let’s go and take the intersection at the center of this Venn diagram of all these things and be something that can go and bridge the gaps between all of them, can convey data between them.” And also provide something that is a reasonable, unopinionated set of abstractions that you could use in a library because it’s very easy to go and turn that from the Manifold representation. And a manifold is just a thing that goes … like sits between a bunch of pipes or conduits or something like this and connects them to each other. It’s just the neutral party there. It’s Switzerland in the asynchronous territories.
Zach: 00:21:18 And that was the motivating factor. It was also just that I felt like Lamina was something where I’d made so many mistakes that I needed to go and just start over. But that was the idea. And so, I wrote that and then I rewrote Aleph on top of that. I think that core.async still is a much more widely used library in terms of the Clojure ecosystem. But Manifold, I think, has a smaller group of fairly avid fans. And I think that people will occasionally reach out to let me know that they’ve used it in one way or another often on a fairly central piece of their infrastructure. And that’s always really gratifying to hear.
Daniel: 00:21:54 Yeah. I remember when core.async came out and for a few years afterwards many libraries would provide, if there was an asynchronous API it would be a core.async API. And that’s maybe … I’m not sure if maybe I’m just paying less attention or it no longer surprises me anymore. But I don’t feel like I see that so much anymore that people are doing less asynchronous stuff maybe, just because it’s already been written or they delegate. It just seems to be less common that core.async is the API for new libraries.
Zach: 00:22:30 I think that’s true. And you could ascribe a lot of reasons to that. I think one of which is just that asynchronous is less cool than it used to be. And so, having that be a necessary component of your API is no longer seen as a requirement. I also think that core.async just hasn’t seen a lot of uptake on the server side of things. There are absolutely counterexamples to that. But I think where core.async has seen a lot of use and I think provides the most value is in ClojureScript. Like in the front end.
Daniel: 00:23:29 Yeah. That’s an interesting point, and I hadn’t considered the ClojureScript side of it so much. I’ve done a lot of work with re-frame, which doesn’t tend to use core.async so much, it has its own queuing model and asynchronous execution. But I know certainly many other ClojureScript applications that don’t use re-frame and probably some that do use re-frame, use quite a bit of core.async.
Zach: 00:23:51 And I think most of the wrappers for making an HTTP call or doing WebSocket communication, whatever, they all use core.async because that is a reasonable way to go and expose that in that ecosystem I think.
Daniel: 00:24:04 Yeah. So, there’s other smaller libraries you’ve written. One that I’ve come back to … I’ve used it over the years and still use it today is byte-streams, which is just a very useful thing. Especially when you don’t necessarily … I know it’s fast enough that it’s not a core performance tool by any means, but especially when you don’t really care about the transformations and you just want it … so, for people who are not aware, byte-streams is a utility knife for byte representations. Is that the tagline?
Zach: 00:24:39 I called it a Rosetta Stone for byte representation. The idea is that … there are many things in Java or in Clojure that represent a collection of ordered bytes. So, a byte array is the most obvious, but a byte buffer is one that got introduced in Java 1.5 and is weirdly incompatible in some ways, or some APIs won’t accept one versus the other. And then you have strings and character sequences which are clearly bytes with some additional metadata atop them. But you want to be able to convert from one to the other. And then when you start getting into Clojure specifically, you have things where it’s like, “Well, what if it’s a sequence of byte containers? What if it’s a sequence of strings? What if it’s a core.async channel of strings? Or a Manifold stream of strings or byte arrays or what have you?”
Zach: 00:25:35 And all of these are isomorphic to each other in that they contain the same core information but all of the APIs expect them to look like a very particular type of representation. There is nothing that will go and just take whatever you give it and find a way to go and make it into what it needs.
Zach: 00:25:56 And in fairness, that’s not what you really want in an API. An API should be strict in terms of what it accepts. Because otherwise the performance characteristics there are unknowable. But you as the application writer, as the person who’s gluing together these strict APIs, you don’t want to think overly much about how to convert this. So, the idea was that I would come up with a bunch of these little piecewise conversions. Like, how do you turn a sequence of byte buffers into a byte buffer? How do you turn a byte buffer into an array? How do you turn an array into a string?
Zach: 00:26:27 And so, if you go and give it something which is a sequence of byte buffers and say, “I’d like this to be a string with a UTF-8 encoding,” it’ll go and just compose together the stepwise transformations and poof, you have a string. And because it’s a graph of type conversions and each of them has a cost associated with it. Like, “How much copying of memory are we doing here?” It can find the minimal path.
Zach: 00:26:50 So, and then once it finds the minimal path between point A and B it’ll [inaudible 00:26:55] that so that it’s not having to go and do that search repeatedly. And so, there are some constants here. Certainly, there’s overhead of the initial graph traversal. There’s the overhead of the [inaudible 00:27:07] functionals, all that sort of stuff. And so, if you just really care about performance this is not what you should be using. But if all you really want to do is just take data that’s in some shape and turn it to data that’s some other shape without thinking about it too much, then it’s a very useful tool.
Zach: 00:27:21 And yeah, I think that that’s a very helpful piece and is used extensively inside Aleph to turn from Netty’s own peculiar byte containers into other sorts of things. And the nice thing about this is that it is an extensible graph. You can go and create an edge between the existing graph and some other representation you might come up with and now you get that transitive transformability into all these other things for free.
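The mechanism described over the last few turns (piecewise conversions as edges in a graph, each with a cost, a search for the cheapest path, caching of that path, and the ability to add new edges) can be sketched as below. This is a minimal Python illustration under stated assumptions, not byte-streams' code or API: `def_conversion`, `cheapest_path`, `convert`, the type labels, and the cost numbers are all made up for the example, and the search is plain Dijkstra.

```python
import heapq
from functools import lru_cache

# (source, target) -> (cost, converter). Costs are made-up numbers standing
# in for "how much copying of memory does this step do?".
EDGES = {}


@lru_cache(maxsize=None)
def cheapest_path(src, dst):
    """Dijkstra over the conversion graph; returns a tuple of type labels."""
    frontier = [(0, src, (src,))]
    seen = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for (a, b), (step_cost, _) in EDGES.items():
            if a == node and b not in seen:
                heapq.heappush(frontier, (cost + step_cost, b, path + (b,)))
    raise ValueError(f"no conversion from {src} to {dst}")


def def_conversion(src, dst, cost, fn):
    """Add an edge to the graph and drop cached paths, which may be stale."""
    EDGES[(src, dst)] = (cost, fn)
    cheapest_path.cache_clear()


def convert(value, src, dst):
    """Compose the stepwise conversions along the cheapest path."""
    path = cheapest_path(src, dst)
    for a, b in zip(path, path[1:]):
        value = EDGES[(a, b)][1](value)
    return value


# A few piecewise conversions ("chunks" means a list of bytes objects).
def_conversion("chunks", "bytes", 2, b"".join)
def_conversion("bytes", "str", 1, lambda b: b.decode("utf-8"))
# A deliberately expensive direct route, to show the search avoiding it.
def_conversion("chunks", "str", 10,
               lambda cs: "".join(c.decode("utf-8") for c in cs))

# Extending the graph: one new edge makes every existing target reachable.
def_conversion("hex", "bytes", 1, bytes.fromhex)
```

With these edges, `convert([b"hel", b"lo"], "chunks", "str")` goes through `"bytes"` rather than the expensive direct edge, and a `"hex"` value can reach `"str"` even though no hex-to-string conversion was ever written. The search and cache add overhead of their own, which matches the caveat above that this is a convenience tool rather than a performance tool.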
Daniel: 00:27:44 Nice. Like you said, a more conventional way that this might have been written in Clojure lang would be to use perhaps multimethods or some other implementation writing that [inaudible 00:27:57] and this to this is this transformation but that wouldn’t have been quite so extensible as what you’ve come up with with the graph.
Zach: 00:28:05 Right. And in fairness, I wrote little util namespaces that would do piecewise transformations like you described, a number of times before I finally broke down and tried to generalize this. Because I try not to turn to the most absurd way to go and solve the problem immediately. I try to keep myself a little bit honest there. But it is, I think if you’re doing systems programming in Clojure, it just keeps on coming up. It just keeps on coming up that you have to go and do this because you’re getting bytes over the wire but they’re actually like [inaudible 00:28:37] so, you have to go and do all these other sorts of things.
Zach: 00:28:39 And either you just create this memory palace that has all of the conversions just sitting in it or you create this ever increasingly large utility namespace or you just try to create something which is an extensible version of that utility namespace. And so yeah, I think that that’s a library that I still get a lot of use out of. And so, I think that that’s probably one of my more successful open source experiments.
Daniel: 00:29:06 Right. And the other thing that you’re pretty well known for is your work on data structures, high performance, functional, data structures. And you’ve worked on quite a few of them over the years and most recently with the …
Zach: 00:29:22 It’s bifurcan, I think is what you’re searching for.
Daniel: 00:29:24 Yes. Yes, yes. That’s the word. I didn’t know that was the pronunciation.
Zach: 00:29:28 Yeah, so it’s actually … so, circling back to Aleph, I have two different libraries that are named for a Jorge Luis Borges story. He was this Argentine writer from the first half of the 20th century who was a librarian. But he wrote a lot of these little short stories and other essays about infinities. How things become absurd once they’ve hit their limit of infinity. And so, the Aleph is a story about a guy who discovers that if he walks into his wine cellar and stares just beneath the 12th step into the cellar, he sees a point from which he can see all points, which he calls the Aleph, because the Aleph is the notation for infinity.
Zach: 00:30:14 And obviously it’s a completely ridiculous premise, but he plays around with it. And he has a very playful tone in a lot of his stories. And the idea of a networking library being the point from which you can see all points seemed apropos at the time. And so, that’s where that came from. And bifurcan is from another story of his called The Garden of Forking Paths. Bifurcan means broadly, “It forks,” I guess, in Spanish. It bifurcates. That’s one about this branching narrative where there are many paths through the story that are being explored. Some people actually call it the first narrative or literary example of hypertext.
Zach: 00:31:00 I think there were actually a couple of people who have tried to go and rewrite the story as a hypertext navigable narrative. And the reason that I called it that was … so, Clojure, of course, was I think very much at the forefront of so-called persistent or immutable or functional data structures. Have your pick as to what you call them. I’ve settled on functional because immutable implies nothing can change. And persistent implies that it’s persisted to disk to a fairly large portion of the software community. So, I think functional is maybe the best thing, which is that I take a function, I return a new function. There’s a functional semantics associated with the API.
Zach: 00:31:43 And so, Clojure uses the terms persistent and transient to talk about data structures which do allow for this pure functional semantics versus this mutable functional semantics. Like you give it a data structure, it still returns a new data structure, but it reserves the right to go and mutate that data structure in the process. And the use of transient is a little bit of a weird one. Because if you go and look at the literature around data structures, they actually prefer ephemeral. Like persistent and ephemeral are antonyms to each other. I don’t know. I am extremely fussy about nomenclature. As people who have read my book may be aware.
Zach: 00:32:23 And neither of … the idea persistent versus ephemeral, these feel like things that you talk about, again, with storage devices. Like main memory is ephemeral memory. And it’s the sort of thing where it feels like the wrong analogy to me, basically. And so, the one that I settled on was this idea of thinking about the data flow. So, if we’re going and we have a data structure … typically where you use transient data structures is we have an empty data structure and we want to fill it with a bunch of stuff. So, we go and we take it and we take this empty data structure and we [inaudible 00:33:00] a value and then we [inaudible 00:33:01] a 1,000 more values.
Zach: 00:33:02 And each time we’re going and effectively discarding the previous version of that data structure. We don’t care about it anymore. We only care about the most recent. And in that case we have this linear data flow where each time you’re not holding on to the previous reference. You only care about the new one. And that previous value only exists to go and feed into these accumulated data structures that we’re building. In my mind that’s a linear chain of that data structure flowing through those method calls. In the cases where we actually want it to be, “Persistent,” where we want it to have true immutable semantics there, is where that chain, that linear chain, forks, where it bifurcates. Where now two people need to be able to own this data structure. And we don’t know what each of them is going to do with it.
Zach: 00:33:47 And so, my terminology, which is entirely of my own invention and I think this is a bad habit to not go and honor what the industry calls it or [inaudible 00:33:56] calls it, but in all those cases I think it is sufficiently niche and confusing that I could justify this, is I called it a linear data structure, which is one where we assume a linear data flow and so allow for mutation, and a forked data structure, which is one where there are multiple owners or presumed to be multiple owners. And therefore we need the more classic structural sharing and partial copying and all that other stuff.
Zach: 00:34:23 And so, bifurcan means forked or, “It forks.” And so, it seemed an appropriate name. Also, of course, all these data structures under the covers are trees. And so, it felt like it had a slight dual meaning, at least that I found amusing. And that’s really the ultimate measure whether I like a name is, “Does it amuse me?” So, that’s what I went with.
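The linear/forked distinction described above can be sketched in a few lines of Java. This is only a toy under stated assumptions: `ToyList`, `linear()`, `forked()` and `add()` are hypothetical names chosen for illustration, and the forked path copies the whole list naively, whereas a real implementation would rely on structural sharing and partial copying.

```java
import java.util.ArrayList;
import java.util.List;

public class LinearForkedSketch {

    /** Toy append-only list. Hypothetical; not bifurcan's implementation. */
    static final class ToyList {
        private final ArrayList<Integer> items;
        private final boolean linear;

        private ToyList(ArrayList<Integer> items, boolean linear) {
            this.items = items;
            this.linear = linear;
        }

        static ToyList empty() {
            return new ToyList(new ArrayList<>(), false);
        }

        /** Single presumed owner: later updates may mutate in place. */
        ToyList linear() {
            return linear ? this : new ToyList(new ArrayList<>(items), true);
        }

        /** Multiple presumed owners: later updates must not disturb old versions. */
        ToyList forked() {
            // Naive full copy so the old linear reference cannot affect the fork.
            return linear ? new ToyList(new ArrayList<>(items), false) : this;
        }

        /** Functional API in both modes: always returns the collection to use next. */
        ToyList add(int x) {
            if (linear) {
                items.add(x);   // mutate in place and hand back the same object
                return this;
            }
            ArrayList<Integer> copy = new ArrayList<>(items);
            copy.add(x);        // leave the previous version untouched
            return new ToyList(copy, false);
        }

        List<Integer> toList() {
            return List.copyOf(items);
        }
    }

    public static void main(String[] args) {
        // Fill an empty collection along a linear chain of calls.
        ToyList acc = ToyList.empty().linear();
        for (int i = 0; i < 3; i++) {
            acc = acc.add(i);
        }
        // The chain forks: two owners, so updates no longer mutate.
        ToyList a = acc.forked();
        ToyList b = a.add(99);
        System.out.println(a.toList()); // [0, 1, 2]
        System.out.println(b.toList()); // [0, 1, 2, 99]
    }
}
```

The calling code looks the same in both modes, since it always keeps the returned value; only the presumed number of owners changes.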
Daniel: 00:34:46 Yeah. I had to think a little bit about that, the linear name, was not immediately obvious to me. But yeah.
Zach: 00:34:53 And I think that you could very rightly quibble with that. But it’s something where I was writing it as … it is a Java library. It is aimed at Java programmers because I feel like Clojure has a lot of really interesting ideas and even though its core library is largely written in Java, those APIs were never meant for public consumption. And there are a few people who have gone and taken that and cleaned it up and changed the hashing and equality semantics back to the standard Java variants and then just exposed it as a library. There’s one called Paguro, I think … P-A-G-U-R-O, that does this.
Zach: 00:35:30 It’s fine but it’s a little weird. And it will seem weird to anyone who doesn’t understand the lineage of that code and understand like, “Oh, well that’s what it’s called in Clojure.” And so, the idea was, if we just wipe the slate clean, don’t worry about the conducts, because we’re trying to go and sell this to people who do not have this built-in communal understanding of, “Here’s why Clojure’s data structures are great, here’s what persistent means, here’s what transient means.” If you assume none of that, then I think that you can be a little more free with the terminology and hopefully linear and forked make a certain amount of sense. But it’s entirely possible that they don’t or a more standard term would’ve been better in that case.
Daniel: 00:36:11 And so, this maybe isn’t necessarily the best measure of a data structure that’s very important, but the performance of these data structures is extremely competitive with mutable Java, often, and it’s going to be a lot faster in many cases than Clojure’s built-in collections.
Zach: 00:36:28 Yeah. So, Clojure, unfortunately, suffers from equality semantics, which I’m not going to go and litigate whether those were the right choices. But the fact that it, for instance, says longs and bignums that represent the same value are equal. Which is not true in Java. You can’t go and check that a 1 and 1N are the same thing. So, if you call .equals() on that, then that will return false. And in Clojure that was seen as something that was worth preserving. In part, I think, because Clojure having auto overflow was seen as a huge value add for the language, which I think has not necessarily proven out. But, again, these are things that you expect to find in a Lisp, is a rich numeric stack.
Zach: 00:37:17 So, for that reason though, creating hashes and checking equality is significantly more expensive. And largely because it’s just large enough that it can’t be easily inlined. And so, doing simple things in Clojure, like adding a key to a map, which invokes all those equality semantics, just costs more. So, you compare it to Java, you compare it to Scala, you compare it to any of those sorts of things and Clojure is just marginally but measurably slower.
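The equality point can be made concrete with a short Java sketch. The first check uses only standard `Long` and `BigInteger` behaviour. The `looseEquals` helper is a hypothetical illustration of the extra type checks a cross-type numeric comparison needs; it is an assumption for the sake of the example and not Clojure's actual implementation.

```java
import java.math.BigInteger;

public class NumericEqualitySketch {

    // Hypothetical helper: treats integer values of different Java types as equal.
    static boolean looseEquals(Object a, Object b) {
        if (isInteger(a) && isInteger(b)) {
            return toBig(a).equals(toBig(b));
        }
        return a.equals(b);
    }

    static boolean isInteger(Object o) {
        return o instanceof Long || o instanceof Integer || o instanceof BigInteger;
    }

    static BigInteger toBig(Object o) {
        return (o instanceof BigInteger)
                ? (BigInteger) o
                : BigInteger.valueOf(((Number) o).longValue());
    }

    public static void main(String[] args) {
        Long one = 1L;
        BigInteger bigOne = BigInteger.ONE;
        // Plain Java: different classes, so equals() is false.
        System.out.println(one.equals(bigOne));       // false
        // Cross-type comparison needs extra checks before it can answer.
        System.out.println(looseEquals(one, bigOne)); // true
    }
}
```

Every map insertion or lookup pays for whatever equality and hashing the collection uses, which is why a heavier comparison shows up in simple operations.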
Zach: 00:37:44 The additional thing that I do in bifurcan though that is probably cheating in the eyes of anyone else whose libraries I’m comparing this with, is I say, “Well, we want to be able to switch between this mutable and immutable representation,” like this linear and forked. But there are cases where we never care about forking the data structure. And this is actually where a mutable data structure is fine, where using a Java hashmap is fine, is it’s just local to some scope, you’re using it as a little accumulator. No one else will ever see it, no one else will ever write to it. Therefore, “Why are we bothering with immutability in the first place?” And so, I said, “I’m going to write variants of my data structures which share the same API but are just permanently mutable, or rather if you want to make them immutable it’ll create a little wrapper around it that will make it so you can’t write to it directly anymore and we’ll just keep track of which keys have been added and removed atop this base data structure.”
Zach: 00:38:41 And this is legitimately cheating if you’re going and just saying, “Who’s written the highest performance tree based data structures?” But I think if you’re speaking to, “What are the actual workflows that people are using their data structures for?” there are a great deal that don’t require this kind of behavior at all. But we want to have the opportunity to, if we need to go and now take this data structure and pass it off to somebody else, to make it something which has that functionality. And so, by saying I can instantiate this map, this IMap, this generic map, with either something which is permanently mutable or flexibly mutable and having that not change all the downstream code, not change the implications, not change the semantics in a meaningful way, I think is useful. Or hopefully is useful.
Zach: 00:39:32 So, in that case, it’s pretty easy to be competitive with Java, because I’m just writing another mutable data structure. But it has the key difference here is that it has a functional API, one where you go and you pass in a collection and the thing you want to do to that collection, and it passes back a new collection. Or at least passes back a collection which might be the same thing. And so, you don’t have to think super hard about what are the semantics of this thing, except in the sort of, “Is it in this moment a mutable or immutable data structure?”
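The wrapper idea described above, an immutable view that tracks added and removed keys on top of a mutable base, might look roughly like the following. This is a minimal sketch under assumptions: the `Overlay` class and its methods are hypothetical names, it assumes the base map is no longer written to once wrapped, and it does not reflect how bifurcan actually implements this.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class OverlayMapSketch {

    /** Hypothetical immutable view over a base map plus a small diff. */
    static final class Overlay<K, V> {
        private final Map<K, V> base;    // assumed frozen once wrapped
        private final Map<K, V> added;   // keys written since wrapping
        private final Set<K> removed;    // keys deleted since wrapping

        Overlay(Map<K, V> base) {
            this(base, new HashMap<>(), new HashSet<>());
        }

        private Overlay(Map<K, V> base, Map<K, V> added, Set<K> removed) {
            this.base = base;
            this.added = added;
            this.removed = removed;
        }

        V get(K key) {
            if (added.containsKey(key)) return added.get(key);
            if (removed.contains(key)) return null;
            return base.get(key);
        }

        /** Returns a new view; only the small diff is copied, never the base. */
        Overlay<K, V> put(K key, V value) {
            Map<K, V> a = new HashMap<>(added);
            Set<K> r = new HashSet<>(removed);
            a.put(key, value);
            r.remove(key);
            return new Overlay<>(base, a, r);
        }

        Overlay<K, V> remove(K key) {
            Map<K, V> a = new HashMap<>(added);
            Set<K> r = new HashSet<>(removed);
            a.remove(key);
            r.add(key);
            return new Overlay<>(base, a, r);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> local = new HashMap<>();
        local.put("a", 1);
        local.put("b", 2);

        Overlay<String, Integer> v1 = new Overlay<>(local);
        Overlay<String, Integer> v2 = v1.put("c", 3).remove("a");

        System.out.println(v1.get("a")); // 1
        System.out.println(v2.get("a")); // null
        System.out.println(v2.get("c")); // 3
    }
}
```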
Daniel: 00:40:06 Right. Another data structure improvement change you worked on was the unrolled tuples. Both a library and in a patched Clojure, if I’m remembering correctly. So, that ultimately didn’t make it into Clojure, I wondered if you had any thoughts on that. Anything you wanted to talk about in relation to that?
Zach: 00:40:28 Yeah. This actually came out of some work I was doing on byte-streams. Because in byte-streams you’re going and you’re saying, “Hey, I want to look up what is the fastest path between this type and this type for conversions.” And it turns out that in Clojure, doing that lookup is in some cases as expensive as simply doing the conversion. Because some of that stuff is very optimized, like going and turning a string into an array of bytes or something like that. That takes about 100 nanoseconds. And the lookup to find out how to do that also took 100 nanoseconds.
Zach: 00:41:01 And so, I started looking at why that was and the reason was that I was going and I was doing a lookup where I was instantiating a vector of the from type and the to type. And then doing the lookup. And that was just slow because the tuples had to be instantiated in the way where it’s like, “I’m taking an arbitrarily sized vector and then adding two things.” And then going and calculating a hash on that was a little bit slower. And so, there’s a few things that were just small little losses of performance that were adding up to enough that now byte streams was a measurably slower way to go and do this conversion.
Zach: 00:41:35 And so, my bright idea for how to fix this was, “Well, if we know that it’s just going to be a two vector, this is going to only ever contain two things, why not create a special two vector? And for that matter, why not make a special one vector and zero vector and two and three and so on.” In my case, up to six, which was a fairly arbitrarily chosen thing but I just got sick of going and trying to deal with that stuff.
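As a rough illustration of what a special two-vector could look like, here is a minimal Java sketch of a fixed-size pair used as a lookup key. `Tuple2`, `nth` and the map contents are hypothetical, and the hash shown is an ordinary Java-style hash rather than whatever Clojure's vectors compute.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class Tuple2Sketch {

    /** Hypothetical unrolled pair: two fields, no backing array or tree. */
    static final class Tuple2 {
        final Object a;
        final Object b;

        Tuple2(Object a, Object b) {
            this.a = a;
            this.b = b;
        }

        Object nth(int i) {
            switch (i) {
                case 0: return a;
                case 1: return b;
                default: throw new IndexOutOfBoundsException(String.valueOf(i));
            }
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Tuple2)) return false;
            Tuple2 t = (Tuple2) o;
            return Objects.equals(a, t.a) && Objects.equals(b, t.b);
        }

        @Override
        public int hashCode() {
            // Two field reads and a multiply; no loop over a generic sequence.
            return 31 * Objects.hashCode(a) + Objects.hashCode(b);
        }
    }

    public static void main(String[] args) {
        // A (from type, to type) key, in the spirit of the conversion lookup.
        Map<Tuple2, String> conversions = new HashMap<>();
        conversions.put(new Tuple2(String.class, byte[].class), "string->bytes");

        String found = conversions.get(new Tuple2(String.class, byte[].class));
        System.out.println(found); // string->bytes
    }
}
```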
Zach: 00:41:59 And so, I first wrote this as a macro generated thing. And it worked pretty well. At least worked well for the use case I was coming up with, like that two tuple, or two vector lookup became measurably faster. And so, I talked about this at a conference and Rich was there and I was talking to him over lunch and I said, “Would you have any interest in putting this into Clojure?” And he said, “Yeah, sure. As long as you write it in Java.” Because Clojure data structures are written in Java, that’s just how it is. And I was like, “Okay. Well, I wrote this whole thing using macros, I’ll get back to you on that. I don’t know if I … ” because even if I were to just go and take an agonizingly long day and type a bunch, I would probably make lots of little mistakes, copy paste errors, all the other stuff.
Zach: 00:42:48 And so, I let that hang there for a while. Probably eight or nine months. Until for a hackathon when I was working for Factual, I decided, “Let’s give this a shot.” And the way that I decided to do that was, “I’m going to write Java that generate … ” I’m sorry, “I’m going to write Clojure,” rather, “That generates the Java for this.” The way I did that is I basically took some code from Eclipse that did Java indentation. I used that as basically a syntax check. So, “I’m going to go and create a big blob of Java that has no new lines in it. And then I’m going to go and pass it into this formatter and if that’s correct, then I’m going to assume it’s reasonably well formed.” Maybe not semantically correct, but that’s something that you can test generatively. So, that’s pretty straightforward to go and do once you have the Java all written out and compiled.
Zach: 00:43:36 And so, it was a total hack. A hack that I find really kind of pleasing in a perverse way. And so, I had that and then I circled back and I said, “Hey, I’ve got this. I’ve got thousands of lines of Java that I’ve generated. Do you want this in Clojure? Yes? No?” And the response was tentatively positive. Because anytime someone comes to you with a PR which is just enormous, you want to go and say, “Yeah. Okay. Well, maybe.” Certainly you don’t want to just give a, “Yes.”
Zach: 00:44:07 And I said, “Look, I just want to make sure that I’m spending time towards some productive end, so just let me know. I could also do the same thing for maps if you like.” Because we could have a map of one, map of two. And in fact, Clojure has two different types of maps. It has the hash map and the array map, where the array map is just a flat list that you linearly scan. There’s no actual attempt to go and hash locate anything. And for any map smaller than eight elements it will go and use that approach, because that’s deemed to be a more efficient approach overall.
Zach: 00:44:43 And so, this had some prior art to it and so they said, “Yeah, sure. Go ahead. That’ll be helpful just for comparative purposes.” So, I wrote that. And this all took place over about 18 months. I was chipping away at this just whenever the mood took me. There was no one who was willing to commit on the other side to like, “Yes. Let’s go and test this.” But I did it, I wrote some benchmarks. And I’d been pushing on this for a little bit. And then at that point finally Rich entered the conversation. Because the contribution process is that there are some gatekeepers, it’s Stu or Alex or whomever, are going and making sure that the PR is sufficiently vetted. At which point Rich will come in and consider it. In this case mostly for the first time.
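The array map idea, a flat list that is scanned linearly with no hashing, can be sketched as follows. `TinyArrayMap` is a hypothetical toy for illustration only and is not Clojure's array map class.

```java
import java.util.Objects;

public class ArrayMapSketch {

    /** Hypothetical flat map: keys and values interleaved in one array. */
    static final class TinyArrayMap {
        private final Object[] kvs; // [k0, v0, k1, v1, ...]

        TinyArrayMap(Object... kvs) {
            if (kvs.length % 2 != 0) {
                throw new IllegalArgumentException("expected key/value pairs");
            }
            this.kvs = kvs.clone();
        }

        /** Linear scan: compares keys one by one, never computes a hash. */
        Object get(Object key) {
            for (int i = 0; i < kvs.length; i += 2) {
                if (Objects.equals(kvs[i], key)) {
                    return kvs[i + 1];
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        TinyArrayMap m = new TinyArrayMap("a", 1, "b", 2, "c", 3);
        System.out.println(m.get("b")); // 2
        System.out.println(m.get("z")); // null
    }
}
```

For a handful of entries, a few equality checks over a contiguous array can be cheaper than hashing the key and walking a tree, which is the trade-off the small-map threshold is meant to capture.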
Zach: 00:45:31 And so, he looked at it and he said, “That’s an awful lot of code. I think I can do this … I can go and get the same effect by doing less.” And so, he wrote up a much smaller thing that unrolled in a much less aggressive way and said, “If we’re going to do this, this is what I’m going to use.” And I was, I think at the time, pretty upset about that. Because it felt to me like if all I was doing was writing a proof of concept, why all of the attempts to go and polish this and make this a very complete and production ready PR … like if all it was was just a way of saying, “Here’s a thing that Rich might want to write someday.”
Zach: 00:46:10 And I think that I still think that that was a reasonable reason to be upset about this. And I think that this is something that people [inaudible 00:46:18] before. The reason that I can’t go and really hold a grudge about it is because once Rich ran it, and he put it into Clojure proper, which I had not done yet. I’d only used it for a couple of cases like this two tuple lookup. And then a couple of other tests that I had run on some code. But I hadn’t gone and taken a version of Clojure with these new data structures jammed in and seen what happened. He found that it wasn’t actually faster on the whole, because having seven different classes that implement vector makes that actually less efficient in terms of dispatch. It’s what’s called megamorphic dispatch where Java can no longer do clever things in terms of being able to figure out which implementation it should go and route to when you go and call conj, for instance.
Zach: 00:47:00 And this is not something that I had tested in any way. And, to be fair, I’ve still not seen Rich’s benchmarks for any of that stuff.
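To show the shape of the problem being described, here is a small Java sketch in which one call site sees seven different implementations of the same interface. The `Vec` interface and the `V0` to `V6` classes are hypothetical stand-ins for the unrolled vector classes. The sketch makes no timing claims; it only shows a call site where the receiver's class varies from call to call, which is the situation in which a JIT compiler generally has a harder time inlining the call.

```java
public class MegamorphicSketch {

    /** Hypothetical stand-in for a vector interface. */
    interface Vec {
        int count();
    }

    // Seven concrete classes, one per size, as with unrolled tuples 0..6.
    static final class V0 implements Vec { public int count() { return 0; } }
    static final class V1 implements Vec { public int count() { return 1; } }
    static final class V2 implements Vec { public int count() { return 2; } }
    static final class V3 implements Vec { public int count() { return 3; } }
    static final class V4 implements Vec { public int count() { return 4; } }
    static final class V5 implements Vec { public int count() { return 5; } }
    static final class V6 implements Vec { public int count() { return 6; } }

    static int totalCount(Vec[] vs) {
        int total = 0;
        for (Vec v : vs) {
            // Single call site, many possible receiver classes.
            total += v.count();
        }
        return total;
    }

    public static void main(String[] args) {
        Vec[] mixed = {
            new V0(), new V1(), new V2(), new V3(), new V4(), new V5(), new V6()
        };
        System.out.println(totalCount(mixed)); // 21
    }
}
```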
Daniel: 00:47:09 Yeah, I was going to ask about that.
Zach: 00:47:11 Yeah. I have not. He just said, “This is slow. I think it’s because of megamorphic dispatch.” And that parses, that is a thing that I think is quite possible. I have no idea what he was testing on, I have no idea what his methodology was. It is genuinely something that I had no thought of [inaudible 00:47:27]. My enthusiasm had pushed me, I had not stopped to consider that side of things. And so, I’m happy to go and say that that was my bad. It was a less good idea than I thought it was. If it had turned out to be exactly as good an idea as I thought it was and then my implementation not made it into Clojure proper, I think I would’ve probably held a little bit of frustration there still because it’s a little bit weird to be trying to contribute and then finding that actually you’re just providing a general sketch of what will at one point be in the code. Because I think that there’s a pride that you derive from saying, “I like Clojure, I use Clojure.” Clojure is in part code that I’ve written.
Zach: 00:48:09 That last part is, at least to a certain person in the open source community, a really key part of what motivates them and makes them feel like Clojure’s ongoing success is something that they are very invested in. At the very least I feel like I’m one of those people and I’ve known other people who I think get frustrated with Clojure for the same reason. But I will say, very explicitly, in this case for the unrolled tuples, that is not something that I harbor any great frustration or resentment about. Because, turns out, it was not nearly as good of an idea as I thought it was, or at the very least, there’s a plausible reason for why it wasn’t.
Daniel: 00:48:45 Right. Yeah, that’s good. I don’t think I had all of that context all in one place and one conversation, I’d picked up different bits and pieces.
Zach: 00:48:54 Yeah. I don’t think … I never did a writeup of it. Possibly that was a bad idea because I’ve seen people use that whole experience as an example of being like, “Here’s why the Cognitect contribution process is no good.” And again, there is an alternate version of this thing where that actually, I think, is a legitimate point to make. But in this case where it turns out that my idea was not well suited to be in the core language, [inaudible 00:49:20] only be good as a standalone library because if you’re using it for something where you don’t have many different sizes of vectors or something like this, it is legitimately faster. But it is not well suited for the general purpose Clojure implementation because of that.
Zach: 00:49:35 So, I think that it’s not something that people should use as the shining example of why people are getting frustrated with Clojure’s contribution process.
Daniel: 00:49:45 Sure. But one thing in that that … there’s probably a few things we could take from that process, one was I guess the expectations or communication about expectations where it seems like there was perhaps a mismatch of what you thought you were doing and what the likely outcome or response was going to be and then what it actually turned out to be. Those didn’t seem to be aligned.
Zach: 00:50:09 Yeah. Well … So, I’d seen … I’d been working with Kyle Kingsbury, who’s better known as Aphyr, around that same time. And he had gone through a similar process where keyword interning in Clojure was fairly slow for reasons that were not intrinsic to how keywords worked. But if you went and you tried to convert a string into a keyword it would take a long time to the point where converting or parsing, rather, JSON, where you wanted all these keys to be keyword-ized, the major computational cost there was just turning strings into keywords.
Zach: 00:50:45 And so, he went through a similar exercise where he came up with a big … or not even a very large PR, but a 20 change or something like that. Went through all the hoops in terms of demonstrating that this is indeed faster, there are no regressions, etc. And then in the end Rich took his PR and rewrote it. So, then Rich was like, “Well, thanks for the recommendations as to how I could go and I could fix this.” I, having seen that play out, I thought I was being very clever by checking in periodically saying, “You still want me to do this, right? This is still a thing that you want?”
Zach: 00:51:16 And I was assured along the way like, “Yes, yes. This is good. This is great.” What I assumed, I guess, was that when someone who was at Cognitect told me that that was on the basis of some sort of conversation they were having. That was a collective assurance as opposed to a personal assurance from Stu Holloway or something like that. And it turns out that it wasn’t. And looking back I can’t point to anything that made me reliably infer that this was Cognitect as an entity giving me this assurance. But when in fact, basically what it was is that someone was saying, “Yeah. I’m pretty sure Rich will like this when he takes the time to look at it.” And then Rich took the time to look at it and didn’t like it.
Zach: 00:51:55 And so, I think that the assumptions that I had going in were wrong. And I think that it’s interesting because there was a little bit more recent drama with Clojure, which we can talk about if you would like to. But basically I was going and voicing some of my frustrations. Which again, are not because my data structures didn’t make it into Clojure but because I see people who want to make Clojure something that they feel some degree of ownership over are being turned away, basically. And from that they lose a lot of their motivation to continue to invest in the community and end up going elsewhere. Some of them more loudly than others.
Zach: 00:52:37 So, Chas Emerick, for instance, has largely vanished. He’s writing Haskell these days. And he wrote a book. He contributed a ton to the Clojure community and then one day he just stopped showing up. And I can’t speak for him and all of his reasons but I think that he has articulated to me that he’s definitely seen a shift in terms of how people were encouraged to go and help shape Clojure as a collaborative process versus this very top down autocratic process. And it’s undeniable that that has changed. The ns macro in Clojure was not created by Rich. It was created by Steve Gilardi.
Zach: 00:53:18 Originally you were just encouraged to go and put a bunch of imports and requires and whatever as the prelude to your thing. There wasn’t a single ns macro that did all of those things. And try to imagine someone today coming up with a different way to do namespace declarations in Clojure. Try to imagine someone going and saying, “I’ve got a great new idea for the ergonomics of Clojure.” It wouldn’t even make it off of the initial post. People would just be like, “Yeah. Sorry. This is never going to happen.” And, in fairness, there’s this concept in neurophysiology called plasticity, which is basically how quickly does your brain reshape itself in response to incoming stimuli? And children have extremely plastic brains. Adults have much less plasticity in their brain, which is probably good. Because when you’re a child you’re changing a lot. You’re going through all these things. You want to reach this level of maturity and stability. You don’t want to go and shake things up all the time just because you can.
Zach: 00:54:21 And so, I’m not saying, “Why aren’t we able to go and rewrite Clojure from release to release,” or something like that. But I think it’s fair to say that there has been a change and that there was a time when Clojure was a more collaborative process. And to pretend that it has never been that, which I think is sometimes a talking point that comes up, is false. To say that it shouldn’t be that is fair. And I think that is a defensible stance, though not necessarily one that I agree with. But some people say like, “It’s always just been Rich’s thing. And there’s never been external input. There’s never been meaningful changes to how the language is written by people who are not working for Cognitect or not Rich himself,” isn’t true. It’s just that that time where that was a reasonable expectation about how the language was maintained has passed.
Daniel: 00:55:11 Yeah. And I think either of those approaches are valid ones to take. But probably my frustration or my feelings about it was that the issue was that this new model or this new intention wasn’t explained particularly clearly. And maybe it wasn’t even consciously understood by Cognitect as they were doing it, it just was a natural shift. But it’s frustrating to see people new to Clojure get excited, come up with some ideas, see some possible improvements and then to hit the brick wall and just not necessarily understand why, what’s going on. They come to Clojure with, “Clojure’s an open source project.” And they have a bunch of assumptions about how that works. And there were no documents, until recently, being extremely clear about, “No. This is a very different project. And we don’t work the same as other projects.”
Daniel: 00:56:08 And that’s, again, as we’ve [inaudible 00:56:10] in Rich’s most recent post, he had no obligation to explain himself, but it certainly would’ve saved a lot of time and energy and frustration on a lot of [inaudible 00:56:21].
Zach: 00:56:21 Certainly. And I should say, like you say, it hasn’t been written up anywhere, the only written record of Clojure’s contribution process, which approaches an honest, straightforward articulation, is in a gist on GitHub. And there’s a followup conversation in the comments of that gist. It’s not on Clojure.org, it’s not like … this is not something where I think it is discoverable by people who are coming to Clojure. So, I think that there’s still work that could be done there, unless I’m wrong and there has been some change in Clojure.org without me noticing.
Zach: 00:56:54 But I think that talking about it in terms of incompatible, unspoken assumptions is exactly right. And something that came up repeatedly was Evan Czaplicki, who’s the creator of the Elm language, gave a really great talk at Strange Loop last year called The Hard Parts of Open Source. And in it he talked about, what’s hard about open source is not the technology, it’s not the technological decisions, it’s the people and navigating those conversations. And in that he brought up the, “By whose authority?” Or, better known as the Clojure Post, which is the first sentence in that post. And it doesn’t get much better from there.
Zach: 00:57:36 And people talk about entitlement in open source, and I think it is undeniably a deeply entitled post. And it’s not one that I like, and it’s not one that I’m very happy with. Because I think it poisoned the well for having a more constructive and meaningful conversation where what’s being said by the community isn’t very easily dismissed as just more Clojure-y, basically. And that’s very frustrating to me. But I think that there is … a real point was being made in Evan’s talk which is not, “People shouldn’t be mean to open source creators.” I mean, that is a point that he makes, and there’s a point that people are doing. But it’s not just like, “You should shut up and be grateful.” What he’s saying is that people don’t state their assumptions when they go make an assertion that something is true or ought to be true.
Zach: 00:58:28 People are going and predicating what they’re saying. And they have very strong opinions. But what is left unstated is the assumption that goes and gives birth to that very strongly held view. And I actually talk about this a little bit in my book, not about open source stewardship, but like I say, if you say that software is over engineered, that’s not an intrinsic property of the software, it’s a property of where you expect that software to be used. Something which needs to … a piece of hardware that needs to go and survive cosmic rays, if it’s not going into space or some other place where that’s a problem, then yeah, it’s over engineered, it probably has more complexity or more costs than it needs to. But again, that’s not an intrinsic property of the thing, it’s a property of where we put the thing.
Zach: 00:59:11 And so, similarly, when we’re talking about, “What can we reasonably as a community expect from someone who is the creator and ongoing steward of a language?” that is not, I think, something that we can talk about from first principles. Or at least it’s not most interesting to talk about from first principles. Because the only first principle that’s really available is, “It’s his language, he gets to do what he wants.” Which is undeniably true. But what’s not discussed in that conversation is there are norms that exist in open source in terms of what is expected. If people come together and start working on a language and form a community around that language, and if there’s a company that’s formed around the stewardship of that language, the general expectation is, ongoing maintenance of that language is a first class concern.
Zach: 00:59:58 And that having the language reflect a plurality of perspectives and uses is valuable. Because that will allow the community to grow to allow the language to be used in ways that the creator never necessarily expected. It will flourish and go off in directions that no one could have predicted, basically. That assumption, which I think is reflected in many other successful languages, is not valid in Clojure. Again, this is not meant to be a value judgment where I’m saying, “It ought to be.” I think it was certainly surprising to me to find that it wasn’t.
Zach: 01:00:31 And I’m not trying to go and say that things should change necessarily. But I think that to go and pretend that people are being entitled just because they expect Clojure to be maintained the way that other major open source languages are maintained, is I think very odd and a little bit victim blame-y. Because I think that if you’re trying to reason about this from first principles, then the broader societal context around how open source works, maybe that doesn’t matter. But that’s not how people think. And I don’t think that’s a good way to go and think about it.
Zach: 01:01:04 And so, I think that that’s why people feel frustrated, is that never is there a conversation about, “Here’s how open source normally works. We acknowledge that. And we want it to work differently because of these reasons that we give, or just because that’s how we want it to work.” Any of that would be a huge improvement to the current thing, which is, “Clojure exists in a universe unto itself. We do not acknowledge other methodologies or other expectations that people might bring with them from other communities. It’s up to you to figure out how things work here. You should go and treat this as a blank slate.”
Zach: 01:01:40 And people reason by analogy. People go and make inferences and they fill in what they don’t know with things they know from things that they believe to be similar, other projects that they’ve worked with. And so, it’s unsurprising when people are surprised. And it’s unsurprising when people are frustrated. And I think that it is reasonable for people to go and ask why this hasn’t been more clearly articulated. Rich is an extremely articulate, thoughtful person. And I have no doubt that he has thoughts on this. There have been conversations I’ve had with him, I had one at the most recent [inaudible 01:02:12]. I don’t really feel like it’s my place to go and characterize what he said to me, but these are not things that he’s said or tried to go and write down outside of the heat of the moment.
Zach: 01:02:21 And I think that that’s unfortunate. I think that it leads to far more acrimony than is at all necessary.
Daniel: 01:02:28 Yeah. I think I feel similar. And over the last year or so, as the [inaudible 01:02:36] became more and more frequent, culminating in Rich’s post, I think at least … I had people talk to me and say they felt hurt by that and that they didn’t feel respected or other things. But I definitely feel like it at least put … it brought clarity to a situation where there was little before and so people were free to superimpose their own views over how they thought the situation was working and then it was only months or years later when that didn’t align that they became frustrated.
Daniel: 01:03:09 So, at least I guess people at least now have a better idea of what to expect or what not to expect, which is something.
Zach: 01:03:16 Right. Though, again, I think that one of the things that I raised was that the community is not growing like it used to. And I have thoughts on why that might be. And I don’t think that they all relate to just Clojure has been mismanaged, I think that Clojure got an enormous boost from coming out at the height of the Paul Graham, Lisp mania. And I think that that was a lot of fertilizer from which it could grow, but I think that there was maybe an implicit assumption that all of that was down to Clojure and Clojure being intrinsically excellent or intrinsically well maintained or something like that. And that that was just a growth trajectory that we could expect to continue indefinitely. And it couldn’t. Once the hype dried up and I think that Clojure had to go and very much succeed or fail on its own merits there.
Zach: 01:04:07 And I think that it’s certainly continued to grow but much less rapidly than before. And so, maybe it matters less. Maybe the fact that the people who were around for the great Clojure debate of 2018 know now that this is how the language is maintained is good enough. But I still think it’s curious that there hasn’t been something which is just there at the top of Clojure.org, which is just, “Welcome to Clojure. Here’s how we think about language design questions. Here’s how we think about data and immutability. And here’s how we think about open source stewardship.” These things I think could all be together somewhere in a clearly articulated place. And that’s not the case right now. And I’m not wholly sure why.
Daniel: 01:04:56 Yeah. And going back to growth, I think that growth is another assumption that people have when they come to a language or a project that’s implicit in what they expect that’s like, “The creator wants us to grow,” and that, “Everybody involved wants us to continue growing and growing and growing.” And that’s an unstated, implicit good. And I’m not necessarily sure that’s a value that Clojure holds. I don’t think they don’t want growth but certainly growth is definitely not the number one priority. I think most people would probably agree with that.
Zach: 01:05:34 Yeah. I would agree with that as well. I think that Clojure’s growth is certainly not the primary concern. And it’s not clear to me that it ought to be. Because I think that you can go and you can easily make a case for a tool being niche, like useful to a certain problem or certain person who has a perspective on software. So, it’s not inherently bad to not be chasing growth at all costs. I think that the most worrisome thing, though, to me was in terms of the response Rich had to some of the criticisms that were coming from the community was that he talked about all the work that had been done on the error messages as work that had effectively come out of his pocket.
Zach: 01:06:19 He talked about how he had paid his retirement into the initial development of Clojure. He hasn’t made that back. And so, seeing community oriented improvements, things that don’t necessarily reflect his use of Clojure to, say, write Datomic, things that are aimed at beginners rather than the experts that Cognitect employs, as a gift. As a thing that is not actually financially viable in and of itself. That’s deeply worrisome to me. Growth, even if it’s not a foremost concern, should at the very least be a profit center for the people who are running Clojure. If work that is in the aid of growth is something that is costly and distracting and generally not aligned with the other motivations that they have, then I think the consequences of that are predictable. It’s pretty clear where that leads in the long term.
Zach: 01:07:24 With no visibility into any of this, like how well Datomic’s doing, what Rich’s finances are like … nor should I or anyone in the community expect to have that level of visibility. All I can say for sure is that it seems like the community is seen as a cost center. I don’t know how you would realign that. Because I don’t know how we got here. I don’t think it was clear to me until that post that that’s how it was seen. But that’s the part that worries me the most because that’s the part that makes me think that it’s going to be hard to do even slight course corrections. To the aid of having people who come here with these expectations based on how other open source projects are run and not have those be totally overturned.
Zach: 01:08:10 And it’s possible that growth can be a non-goal and Clojure will continue to grow despite that. Or at least it won’t be hampered by that. But it does mean that things that are in the purview of Cognitect alone, like error messages, which are hard to bolt on to a language outside of the core, are probably going to be fairly slow to arrive. And it’s going to be contentious because it’s going to be seen as this great gift that’s being bestowed upon the community as opposed to just a thing which naturally one would do, because the community is the source of your continued consulting income or what have you.
Daniel: 01:08:52 Yeah. And it doesn’t seem clear to me that any core [inaudible 01:08:57] are coming either, that there’s going to be any difference or changes. Although I should point out Alex Miller’s weekly roundup of what he’s been working on with Clojure, I really enjoy, I think it’s really useful getting a bit of an insight into what’s being worked on. So, I wouldn’t say nothing has changed, certainly.
Zach: 01:09:16 No, and I want to be perfectly clear. Because I don’t think I was and I think that I was needlessly hurtful by painting with this broad brush. I think that there is individual work that’s being done, which is absolutely community oriented. With Alex at the forefront of that. And I think that is not something that should be taken for granted. I don’t think that it’s something which we are inherently owed and can be freely ignored. I frankly wish I had said this earlier in this conversation. But I’m worried about what are the structural incentives here. Because if the entirety of the community engagement is born from Alex Miller just being willing to do that in spite of everything, to make the case for that, if it’s something where a bad quarter at Cognitect might go and change whether or not he’s allowed to do that, that’s cause for concern I think.
Zach: 01:10:07 And to talk about the structure being misaligned doesn’t impugn the good intentions or motivations of any of the individuals that exist within that structure. But I think it’s hard sometimes to go and speak separately about the two. And so, I just want to say, if I’ve ever given someone who’s worked very hard on behalf of the Clojure community cause to think that I take what they’ve done for granted or don’t appreciate what they’ve done, that’s not true. But I don’t think that that’s by itself reason to not criticize the overall direction that these things are taking or the motivations that we can infer from what is said whenever there’s a flare up within the community.
Daniel: 01:10:54 So, another large project you’ve been working on more recently and finished just in the last few months, I think, was the Elements of Clojure, your book about … it’s not quite a style guide for Clojure, it’s a bit more than just that. So, if you want to talk a little bit about, what is it, why did you create it … ?
Zach: 01:11:12 Well, it did actually start out as, effectively, a style guide for Clojure. Yeah. So, I was, back in the day, doing meetups. And currently still running the Bay Area Clojure meetup. And when I was at Factual, we were hosting regular office hours, where we encouraged people to come in and just pair up. We weren’t going to try and have a lecture or anything like that, we were just going to go and make sure that people who knew about some facet of Clojure and people who wanted to learn about some facet of Clojure would be able to find each other and chat about that. Because I thought that that’s what happens in the margins of a typical meetup, like at the beginning, at the end. And to my mind that’s often the most valuable part of it, because you will have good talks and bad talks, but it’s very rare that a talk will be of interest to everyone who’s attending.
Zach: 01:12:03 And so, I thought, “Let’s just go and try to take that out and see if there’s still something worthwhile leftover.” And something that I was noticing a lot was that people were coming in who liked Clojure, had learned Clojure, were trying to advocate for Clojure being used inside their company, but they were extremely tentative. Because they were coming in as the advocate for Clojure, and therefore the presumptive expert on it. And they were extremely worried about being right in terms of how they talked about how one should use Clojure.
Zach: 01:12:32 Because I think that Clojure is peculiar, or at least somewhat more extreme in terms of the impetus that it puts on being very thoughtful about your design and coming up with the right design. And so, if they were going and saying, “Well, this is how you structure namespaces,” and then it turns out that that’s not true because they hire someone who is an experienced Clojure developer and they come in and look at it and say, “Well, what on earth is this?” That was actually a meaningful impediment to them even advocating for Clojure being adopted at their company. Because they felt like they would have to go and take on a role of authority that they weren’t comfortable with.
Zach: 01:13:05 And I thought that that was a shame. And I thought that that was something that was clearly hampering growth and adoption. So, I thought, what if there was a book that would take the [inaudible 01:13:17] work out of it? Would say, “Here are some reasonable ways to go and approach the writing of Clojure. These are not the only ways. These are not the canonically right ways to do it. But they’re solid. They’re good enough. And if you go and you keep on writing code and eventually you hit the point where these guidelines are no longer serving you well, toss them out. You’ve outgrown them.”
Zach: 01:13:36 And so, I was just saying, “This is a good second book to read about Clojure.” Because it goes and says, “Here’s some norms that we can establish and some of them reflect what’s already happening in the community, some of them reflect what I personally, as an opinionated person, think ought to be happening in the community.” And then at the end of the day, there’s some people who think that I’m wrong and then a bunch of other people who ideally would hopefully just happily follow this stuff because it gets them out of their own head and stops them from just getting wrapped around the axle of, “But is it the right thing to do?”
Zach: 01:14:10 And so, that was the initial motivation for the book. And that’s why the book was originally called Elements of Clojure, because it was meant to be very much in the [inaudible 01:14:18] White style of, “Well, sure, it’s not right. But at least it’s a reasonable set of defaults to follow.” But as I started to write it I realized that my ambitions were somewhat deeper than I had originally realized. And that was just exacerbated by the fact that after effectively writing the first chapter I quit my job to focus on it full time because I wanted to get this right.
Zach: 01:14:45 And what I realized was that I wasn’t just trying to go and say, “Here are some decent guidelines.” Because, when it comes down to it, with a style guide you can talk about the appearance of the code, but to talk about the conceptual layout of the code isn’t something you can write a style guide about. There’s not a right way to go and build interfaces, “Let’s just check off the boxes.” Because I, the author, don’t know the domain that the software’s interacting with as well as the person who’s reading it. And so, I can’t go and tell them what to do. All I can do is give them a framework that they can combine with their domain knowledge to come up with what they think is a reasonable answer.
Zach: 01:15:21 And so, creating a conceptual framework is, it turns out, a lot harder than coming up with a style guide. And so, that was what I fell into. And so, what was originally going to be, I was going to quit my job, I was going to spend three months finishing the book, and then I was going to go off and work on a whole bunch of other things. And it ended up being, I quit my job, I spent 16 months writing the book and then some amount of time going and finalizing the manuscript and everything like that. And throughout that I was releasing chapters and getting feedback from readers and other things. And I’m of course hugely grateful to everyone who stuck with me through the roughly three year process of this book getting finished.
Zach: 01:16:00 That’s what it turned into. And that’s why it became a much broader thing. This just comes back to scope creep which I’ve always been bad about. And I think that at the end of it all, I’m happy with the result. Inasmuch as it more clearly articulates my sensibilities about software in the conceptual framework that I struggled to put into words at the beginning of writing this book. What I’m less thrilled about is that, having taken this book, which was a fairly general book about software design, and having used Clojure as the example language, I think cuts down on the potential audience for the book. The people who would go and take the time to read through it. It cuts it down quite drastically.
Zach: 01:16:44 And so, I recently mentioned, on Twitter and on the mailing list for the book and other places, that I’ve been considering a book which is not exactly a rewrite of the book, but is maybe a spiritual sequel to it which will just be called, or at least tentatively called, Principles of Software Design, which will attempt to cover the same territory without the Clojure specific aspects. And that book will have to be more general because Clojure has certain idioms the language just feeds you into, that it makes it very hard to fall outside of.
Zach: 01:17:18 And so, there are a lot of questions about classic object oriented design with mutable objects that you don’t need to talk about in a book about Clojure, that you don’t need to go and articulate how these different ideas play with that style of software design. And so, that’s something that I need to think about a lot more, frankly. And be able to have that be something that fits into this conceptual framework built around specific use of Clojure.
Zach: 01:17:44 But I think that’s going to be worthwhile and interesting. And my hope is that in the meantime, people who don’t use Clojure professionally or whatever else are able to look past the parens and take the more general lessons from my book. But I think that in order to really have the impact that I would like to have, it’s going to have to be a book that doesn’t mention Clojure and doesn’t use Lisp as its teaching language. Because I think that, from just a pedagogical perspective, Lisp is not a very friendly language. It’s something that people are not going to go and happily learn just for the purposes of reading the book.
Daniel: 01:18:25 You never know.
Zach: 01:18:26 It’s possible. I mean, SICP may be the counter example there, but I’m not Gerald Sussman, so, I think I have to be a little more humble in terms of what bridges people are willing to cross just to meet me on the other side.
Daniel: 01:18:40 Yeah. People switched to Emacs just for [inaudible 01:18:44]. So, yeah.
Zach: 01:18:45 It’s possible. I don’t know. I think that that will require people to be very, very effusive about the book. And so, if people want to start singing the praises of Elements of Clojure, please go ahead. Prove me wrong. But I don’t want to predicate my expectations on that happening. I will say that some people have said some very kind words to me in private and publicly and of course that is very nice to hear, especially given the amount of time and energy that I put into this. But yeah, I think that it’s hard. A book will never matter as much to anyone as it does to you, the author. That’s just necessarily the case.
Zach: 01:19:25 And so, at this point I’m just trying to not get too presumptuous in terms of what I can and can’t expect from the broader audience of software engineers.
Daniel: 01:19:36 Yeah. Were there any recommendations that you made in the book which people disagreed about? That said, “Actually, I think this is not a good recommendation,” or … ?
Zach: 01:19:47 I tried to, in the cases where I made what I thought was a somewhat overly broad statement, articulate a couple of cases where that advice didn’t apply. So, to use a very basic example: I say that variadic keyword params, so, like having a function where you have the ampersand map destructuring so that you can go and just add a bunch of keyword parameters to the function, shouldn’t be used. You should instead pass in an actual option map. And the reason for that is oftentimes the option map is being passed many layers down into the code. And having to go and put that back into a map and then destructure to go and call the next thing is actually fairly laborious. It makes the code more complex to read. In general we should just go and put things in a map and not try and go and have it be magically destructured just so we can remove one set of curly braces from our code.
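A minimal sketch of the trade-off described here, not code from the book: all function names (`fetch-kw`, `fetch-map`, and the two wrappers) and the `:timeout`/`:retries` options are hypothetical, chosen only to show what happens when options have to travel one layer down.

```clojure
;; Hypothetical names throughout; an illustration, not the book's code.

;; Style 1: variadic keyword params via `& {:keys [...]}` destructuring.
(defn fetch-kw
  [url & {:keys [timeout retries] :or {timeout 1000 retries 0}}]
  {:url url :timeout timeout :retries retries})

;; Before Clojure 1.11 (which added trailing-map support for such
;; functions), a wrapper that has collected the options into a map must
;; flatten it back into a key/value sequence to call the next layer.
(defn fetch-with-logging-kw
  [url & {:as opts}]
  (apply fetch-kw url (mapcat identity opts)))

;; Style 2: an explicit options map, destructured in the argument vector.
(defn fetch-map
  ([url] (fetch-map url {}))
  ([url {:keys [timeout retries] :or {timeout 1000 retries 0}}]
   {:url url :timeout timeout :retries retries}))

;; The wrapper simply hands the same map to the next layer.
(defn fetch-with-logging-map
  [url opts]
  (fetch-map url opts))
```

Both `(fetch-with-logging-kw "u" :timeout 5)` and `(fetch-with-logging-map "u" {:timeout 5})` produce the same result; the difference is the extra flatten-and-reapply step each intermediate layer needs in the first style.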
Zach: 01:20:43 The case that I give of where that’s not the case though, is if you’re going and doing macros. Because in macros typically, you’re not going and calling many layers deep because it’s happening at compile time, not at run time. And so, I think someone can read this and be like, “Well, I like keyword params. I like having that be how my code looks and everything. It seems cleaner.” And that’s fine.
Daniel: 01:21:06 They’re clearly wrong.
Zach: 01:21:09 But no one’s sent me an angry email saying, “How dare you?” So, I don’t know. I’m sure that there are many small pieces of advice that I give that people will happily ignore. And I think it’s important when you give that advice, which is overly broad, to go and articulate an example of something where this does not apply. And if you find a similar case, you should be free to not treat this as gospel. But I do think that for the most part where the advice is specific it’s not super controversial. And where the advice is broad I think it’s easy for someone to interpret it in whatever way they choose.
Zach: 01:21:47 So, I think that … hopefully there’s no deeply angry readers of this. I think that there’s some people that maybe were hoping for something that was much more concrete and specific. And to them I can just say, “I’m sorry. I don’t know how to write that book. I don’t think that the subject matter I was trying to cover allows for that much specificity. Because otherwise it just becomes a book about software in that specific domain.”
Daniel: 01:22:13 Yeah. Yeah. I really enjoyed Elements of Clojure and I know, I’ve seen lots of companies in particular buy a copy for everyone in the team and share discussions about it and it’s built a shared context which previously was implicit. One part I really liked in particular, one example of it, was the quote, “Functions can do three things: pull new data into scope, transform data already in scope, or push data into another scope.” Experienced Clojurists know that implicitly but it would be quite hard to put into words, potentially for many of them, they wouldn’t know exactly why they feel that some code is wrong.
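A small sketch of how that pull / transform / push split might look in practice. The atoms `inbox` and `outbox` are hypothetical stand-ins for an external source and sink, and none of the function names come from the book.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-ins for an external source and sink.
(def inbox  (atom ["  hello " "world  "]))
(def outbox (atom []))

;; Pull: bring new data into scope.
(defn pull-lines []
  @inbox)

;; Transform: a pure function over data already in scope.
(defn clean-lines [lines]
  (mapv str/trim lines))

;; Push: send data into another scope.
(defn push-lines! [lines]
  (swap! outbox into lines))

;; Composition: each step does exactly one of the three things.
(defn process-inbox! []
  (-> (pull-lines) clean-lines push-lines!))
```

Keeping `clean-lines` free of any pulling or pushing means it can be tested with plain vectors, which is one concrete way to name the feeling that a function "does too much".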
Zach: 01:22:54 Right. Some function does too much for … That’s great to hear. Because I think that was very much the goal is, there’s a thing, you know it, you feel it, there’s a visceral feeling when you look at some sort of code and you know it’s not quite right. But when you’re talking to someone who wrote it, who might be a more junior engineer who you’re mentoring, who you’re trying to go and share your experience with, all you can say is, “That feels wrong.” Which is a dissatisfying way to go and try to mentor somebody is to just go and tell them, “No,” periodically.
Zach: 01:23:24 Ideally you can help them find their way to, “What is the broader principle at play here?” Rather than treat them as some sort of supervised machine learning model or something like that. So, that is very much the goal. So, I’m happy to hear that. And I’ve thankfully heard that from other people as well and everything. But there is, I think, just an interesting problem with a lot of industry books about software design. And in preparation for this second book I’ve just ordered a bunch of them. Every book I could find that seems to talk about software design, I ordered a used copy of it and have been paging through it.
Zach: 01:24:00 And there’s a term that gets thrown around a lot, which is heuristic. Which is … that’s what we use to go and come up with what’s good design in software. We have a heuristic which is the rule of three. Like if you go and you write code three times, then you can generalize it, but not before. And there are lots of things like this. There are also contradictory heuristics. Like, you should write code to be deleted rather than modified. We’re not going to go and we’re not going to try to generalize it. We’re just going to go and write a thing and if that’s ever not useful we’ll just go and we’ll throw it away and we’ll write a new thing.
Zach: 01:24:37 I’m not a big believer in the dictionary definition as a motivating idea for a talk or for a design principle or whatever, but the etymology of heuristic is interesting in that it comes from the Greek Eureka, as in, “I’ve found it.” And I think that that’s what all heuristics are, they’re this intuitive leap into the void. It’s going to take time for us to go and walk our way back and figure out what is the actual principle at play, but we know that we found a thing. But what we also know about heuristics, and it’s something that Gerald Weinberg observed in his An Introduction to General Systems Thinking, is that heuristics don’t tell us when to stop. So, a heuristic is a bounded tool. It doesn’t apply universally. It’s meant to be applied within a particular context or scope. But no heuristic goes and describes what its scope is. It just makes itself out to be this universal truth.
Zach: 01:25:33 And that’s how you can get into a situation where you have contradictory heuristics, because they exist in bipartite scopes. But you don’t know what they are. And so, the problem with heuristics is that they require expert knowledge to apply properly. And that’s a bad place to be if you’re trying to go and write something which is an intermediate level book on software design, which is full of little nuggets of wisdom that you need to have already learned well enough to apply … in order to apply properly you already need to have outgrown the book effectively. It’s something that only really makes sense looking backwards once you’ve surpassed the text.
Zach: 01:26:14 And so, I think that that characterizes a lot of conversations I see in a day to day software shop where people are going and quoting different heuristics at each other, which may support their point, without necessarily any understanding of, “Is this applicable here?” Because that’s just this argument from authority of, “Well, you know what they say about the rule of three. You know what they say about never optimizing. You know what they say about never optimizing except in that important three percent.” There’s all these things that we use as a proxy for understanding this. And my hope is, from Elements of Clojure and certainly in the new book that I’m writing, that it’s not just a collection of heuristics, it’s actually a collection of concepts and from those concepts the heuristics fall out, but what also falls out of that is an intuition for where they are meant to be applied. Like, what are the interrelationships between these things? Where is one applicable and where is the other applicable?
Zach: 01:27:09 I mean, I genuinely don’t know if I’m going to achieve the goal that I just described there, but I think that it is an absence in the literature today that I’ve recognized. And if not me, I think someone ought to go and fill that because I think that right now we’re not talking about software design in a particularly articulate way. We talk about names, we talk about abstractions, we talk about all these things without really defining what we mean, and a lot of it just comes down to who’s the loudest voice in the room. And that’s a bad place to be as an industry, I think. We are a relatively new field, but it seems like there’s a lot of room to do better there.
Zach: 01:27:52 So, that’s the goal. And I think that as far as career goals go, I think that being able to explain that more effectively is what’s motivating me right now. And that’s why when I was looking for a new job after I came off of this book sabbatical, I chose a job which is not a Clojure job because I thought that the most interesting software design problem I could find happened to be outside of that domain. And that makes me very sad. And I think that I had very much hoped and had held on to hope that I could make Clojure a home for myself, not least because people know me in this community and I get more leeway to go and try things out the way that I want to try them out and all that other stuff, but … and also just because, again, Clojure fits the way that I think.
Zach: 01:28:42 And it makes me happy to be able to express my ideas in that language. But ultimately, at this point, what matters more to me is being able to refine my ideas about software design in the general case, outside of any particular language by going and working on what I consider to be the most interesting problem. And the meatiest software design problem. And so, yeah, I announced recently that I’m going to work at Microsoft on the Semantic Machines team which is doing natural language processing. And they are a Scala shop. And so, I am now the proud owner of the Martin Odersky Scala book, which I’m fully making my way through.
Zach: 01:29:22 And it’s fine. It doesn’t fill me with that same excitement as Clojure did at the outset and it’s possible that I’ll never be as excited about a language as I was about Clojure. Maybe that’s just something that happens at some phase in your career or at some age or whatever. But it’s a sad thing. And I think that part of why I was as vocal as I was last year was because I was realizing that. I was realizing that the jobs … I was looking at the job landscape and what I could work on and I realized that there were so many opportunities that took me away from Clojure. And the opportunities that would keep me within Clojure were not as exciting to me. The problems were not as exciting to me.
Zach: 01:30:04 And that’s not an absolute truth that’s just going to continue on to the future indefinitely, but I think that I struggle to articulate what are the driving forces that are going to reverse that trend. I don’t know what those are. And I think that certainly I am not the person to reverse them. So, that’s where I find myself.
Daniel: 01:30:27 Yeah. I felt, when I saw your announcement, there was … I was happy for you, I was also sad. Sad for myself and for the community that … I mean, we’re not losing you, it’s not like you’re slicing yourself off completely from the Clojure community but I imagine you’ll have less time and attention for Clojure stuff in the future.
Zach: 01:30:44 I think so. In fairness I’ve been a pretty absentee maintainer of my Clojure open source libraries for a number of years now because I’ve just … my attention has been elsewhere. And the goal [inaudible 01:30:56] those libraries always was to learn more about software design. And so, the problem with that is that once people start to use the library, if they’re not coming to me regularly with, “This is a bad design,” it’s actually less interesting. A successful library is less interesting than a library which is unsuccessful for some meaningful reason.
Zach: 01:31:15 And so, that’s a little bit tricky, speaking to my own motivations. But yeah, I’m definitely not trying to pretend Clojure doesn’t exist or pretend that I don’t have this deep affection and connection to the language and to the community around it. It’s just that it’s hard for me to imagine how professionally I’m going to be able to be a Clojure programmer. Because I’ve realized that what motivates me is dealing with problems that get to the essence of what is good software design. And I just don’t know that that’s best answered by continuing to be a full time Clojure developer.
Zach: 01:31:54 And again, I hope someone proves me wrong here. I hope that if the time comes for me to go look for my next job, I look around and the landscape is just chock full of really deep fundamentally interesting design problems that are all implemented in Clojure. But it’s something where I just have to go and take stock and say that I’ve spent roughly 10 years of my life optimizing for Clojure above all else. And I think that that’s not something that I can justify indefinitely. And so, it’s very bittersweet. I’m excited about this role, I’m excited about things I’m going to learn, I’m excited about the people that I’ll be working with, some of whom are former Clojure people. Jason Wolfe, formerly of Prismatic, author of the Schema library, is there. And is in fact the reason that I was aware of this job at all.
Zach: 01:32:44 But I think that it’s something where I’ve just reached a point where Clojure cannot be the overriding consideration here, for better or for worse.
Daniel: 01:32:55 Yeah. I’m excited to see what you come up with at Microsoft with … so Semantic Machines and conversational AI. “Chatbots” sounds a little bit reductive but …
Zach: 01:33:08 That’s how I describe it to annoy Jason, basically. Yeah, it is reductive. Not least because that calls up the extremely limited conversations that you’ll have with Alexa or whatever. And so, the hope here is that what it is is going to be significantly more. And so, maybe you’ll reclaim that term as a point of pride at some point. But for the time being yeah, it’s conversational AI or conversational UI depending on how you want to think about the domain of the application.
Daniel: 01:33:39 Well, people can buy a copy of Elements of Clojure on elementsofclojure.com, there’s print book and ebook available. That was pretty recent, the print book. And you’re on Twitter, ztellman, GitHub, ztellman, probably many other social media websites @ztellman. So, yeah, I’m going to say thanks for all of the work and effort and time you’ve put into the Clojure community and 10 years is a long time and you’ve made a really, really lasting impact on Clojure.
Zach: 01:34:16 Well, and I want to say the same to you. The work that you’ve done with Clojurists Together and other things like that, it’s been extremely time consuming to go and herd the community in a particular direction and everything like this is, I think, very thankless work. And you’ve been doing it for a while now and I am a proud contributor to Clojurists Together and will be for at least as long as people are buying the book, because why wouldn’t I go and fold that back into the community? And so, everyone who’s listening to this, I encourage you, I believe it’s clojuriststogether.org.
Daniel: 01:34:53 That’s right.
Zach: 01:34:54 Am I getting that right?
Daniel: 01:34:54 Yeah.
Zach: 01:34:55 If you want to prove me wrong here, if you want to go and say that there is a vibrant future for Clojure that I’m just too pessimistic to see or something like this, one of the ways you can go and contribute to that and to make that true is to contribute to Clojurists Together to be an active voice about what needs to be supported, what needs to be improved. There are still many, many things that can be done, I think, to improve the ecosystem to make it more friendly to beginners to spur growth.
Zach: 01:35:29 And so, I don’t want to have people walk away from this thinking that this is somehow hopeless. I’m only speaking to the trends that I see. And trends are absolutely reversible.
Daniel: 01:35:40 Great. Well, yeah. Thanks again and I’m sure I will be seeing a lot more of you and your work in the future.
Zach: 01:35:47 Well, thanks for having me on. Yeah.